# recurrent-segmentation-sbm
Recurrent segmentation meets block models in temporal networks
# Overview of the code
The file `sbm_core.py` is the core of the code. It contains utility functions for the main algorithms (the K-segmentation algorithm and two variants of the (K-H)-segmentation algorithm).
The file `optimize.py` contains the code of the three main algorithms. It uses the functions in `sbm_core.py`.
The estimated intensity functions, groups/clusters, and change-points can be obtained by calling the functions in `optimize.py`.
The file `utils.py` contains utility code for reading data, the SMAWK utilities, creating networkx graph objects, etc.
The file `experiments.py` contains the functions required to run the simulations.
# Experimental files and Running the code
```
num_roles=1
num_segments=1
num_levels=1
algo_ver= 3
```
The files `bikes_santander.py`, `collegeMsg.py`, `bitcoin.py`, `eu_email_dep1.py`, `eu_email_dep2.py`, `mathoverflow.py`, and `mooc.py`
provide examples on real-world dynamic networks.
This zip folder contains experiments for both synthetic and real-world datasets.
The data we have used is publicly available.
There are four main user parameters. An example is as follows.
```
num_roles=2
num_segments=3
num_levels=2
algo_ver= 3
dest_folder='./Results/bikes/'
```
`algo_ver` `3` selects the (K-H)-segmentation algorithm using SMAWK.
`algo_ver` `2` selects the (K-H)-segmentation algorithm using naive segmentation.
There are two tuning parameters. An example is as follows.
```
theta = 1e-5
eta = 1
```
All the experimental files can be run directly to perform the simulations.
The files `synthetic_experiment_1.py` through `synthetic_experiment_5.py` contain code to run our algorithms on synthetic datasets.
These files also contain the code to generate the synthetic data.
The files `likelihood_vs_H_bikes.py`, `likelihood_vs_H_bitcoin.py`, and `likelihood_vs_H_dep2.py` return normalized likelihood values for
a given set of `H` levels, e.g. `current_h` = 1, 2, 3, ..., 20.
To reproduce the results of the paper, run the `synthetic_experiment_likelihood_vs_H_?.py` files.
Note that the normalized log-likelihood is the ratio between a particular likelihood and
the likelihood value that corresponds to a single group and a single segment.
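The normalization described above can be sketched in a few lines. This is a minimal illustration, assuming both inputs are log-likelihood values; the function name `normalized_log_likelihood` and the numbers are hypothetical and do not come from the repository or the paper.

```python
def normalized_log_likelihood(log_lik, baseline_log_lik):
    """Ratio of a model's log-likelihood to the baseline log-likelihood.

    The baseline is assumed to be the value obtained with a single
    group and a single segment.
    """
    if baseline_log_lik == 0:
        raise ValueError("baseline log-likelihood must be non-zero")
    return log_lik / baseline_log_lik


# Illustrative numbers only, not results from the paper.
baseline = -1200.0   # hypothetical single-group, single-segment value
candidate = -900.0   # hypothetical value for a richer model
ratio = normalized_log_likelihood(candidate, baseline)
print(ratio)
```

Under this assumption, and since log-likelihoods are typically negative, a ratio below 1 means the richer model fits better than the baseline.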
The file `bike_times_edges_large.py` returns the running time and edges for a given fraction of edges.
To reproduce the results of the paper, set `_frac` to e.g. .4, .5, .8, or 1, one value at a time. Note that the running time depends on the machine you use.
To switch algorithms, change the `algo_ver` parameter to either 2 or 3.
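As a rough sketch of this kind of measurement, the snippet below times a placeholder routine on a subset of edges for each fraction. The helper names `sample_edges`, `time_run`, and `placeholder_fit`, the toy edge list, and the use of uniform random sampling are all assumptions for illustration; the repository's script may select edges differently and calls the actual segmentation algorithm instead of the placeholder.

```python
import random
import time


def sample_edges(edges, frac, seed=0):
    """Return a random subset with about `frac` of the edges, sorted by timestamp."""
    rng = random.Random(seed)
    k = int(round(frac * len(edges)))
    subset = rng.sample(edges, k)
    return sorted(subset, key=lambda e: e[2])


def time_run(fit, edges):
    """Wall-clock time in seconds taken by `fit(edges)`."""
    start = time.perf_counter()
    fit(edges)
    return time.perf_counter() - start


def placeholder_fit(edges):
    # Stand-in for the real segmentation call; it only touches every edge.
    return sum(1 for _ in edges)


# Toy temporal edges (source, target, timestamp); hypothetical data.
edges = [(i % 5, (i + 1) % 5, float(i)) for i in range(100)]

for _frac in (0.4, 0.5, 0.8, 1.0):
    subset = sample_edges(edges, _frac)
    elapsed = time_run(placeholder_fit, subset)
    print(_frac, len(subset), elapsed)
```

`time.perf_counter` is used because it is a monotonic, high-resolution clock, but the measured values will still vary between machines and runs.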
The file `synthetic_experiment_time_vs_edges.py` returns the running time and edges for a given `NO_SAMPLES`.
To reproduce the results of the paper, set `NO_SAMPLES` to one value at a time from `list_samples = [50,100,150,200]`, e.g. `NO_SAMPLES` = 50.
Level-dependent Membership extension:
The new `algo_ver` value `4` selects the level-dependent variant of the (K-H)-segmentation algorithm.
The files `ld-1-syn.py` through `ld-5-syn.py` contain code to run our level-dependent algorithms on synthetic datasets.
These files also contain the code to generate the synthetic data.
The files `ld-bikes_santander.py`, `ld-bitcoin.py`, `ld-eu_email_dep1.py`, `ld-eu_email_dep2.py`, `mathoverflow.py`, and `mooc.py`
provide examples on real-world dynamic networks for the level-dependent variant.
## References
SMAWK code: https://www.ics.uci.edu/~eppstein/PADS/SMAWK.py