easygraph.functions.community package
Submodules
easygraph.functions.community.LPA module
- easygraph.functions.community.LPA.BMLPA(G, p)[source]
Detect community by Balanced Multi-Label Propagation algorithm
Return the detected communities.
Firstly, initialize ‘old’ using cores generated by RC function, the propagate label till the number and size of communities stay no change, check if there are subcommunity and delete it. Finally, split discontinuous communities.
For some directed graphs lead to oscillations of labels, modify the stop condition.
- Parameters
G (graph) – A easygraph graph
p (float) – Between 0 and 1, judge Whether a community identifier should be retained
- Returns
communities – key: serial number of community , value: nodes in the community.
- Return type
dictionary
Examples
>>> BMLPA(G, ... p = 0.1, ... )
References
- 1
Wu Zhihao, Lin You-Fang, Gregory Steve, Wan Huai-Yu, Tian Sheng-Feng Balanced Multi-Label Propagation for Overlapping Community Detection in Social Networks
- easygraph.functions.community.LPA.HANP(G, m, delta, threshod=1, hier_open=0, combine_open=0)[source]
Detect community by Hop attenuation & node preference algorithm
Return the detected communities. But the result is random.
Implement the basic HANP algorithm and give more freedom through the parameters, e.g., you can use threshod to set the condition for node updating. If network are known to be Hierarchical and overlapping communities, it’s recommended to choose geodesic distance as the measure(instead of receiving the current hop scores from the neighborhood and carry out a subtraction) and When an equilibrium is reached, treat newly combined communities as a single node.
For using Floyd to get the shortest distance, the time complexity is a little high.
- Parameters
G (graph) – A easygraph graph
m (float) – Used to calculate score, when m > 0, more preference is given to node with more neighbors; m < 0, less
delta (float) – Hop attenuation
threshod (float) – Between 0 and 1, only update node whose number of neighbors sharing the maximal label is less than the threshod. e.g., threshod == 1 means updating all nodes.
hier_open – 1 means using geodesic distance as the score measure. 0 means not.
combine_open – this option is valid only when hier_open = 1 1 means When an equilibrium is reached, treat newly combined communities as a single node. 0 means not.
- Returns
communities – key: serial number of community , value: nodes in the community.
- Return type
dictionary
Examples
>>> HANP(G, ... m = 0.1, ... delta = 0.05, ... threshod = 1, ... hier_open = 0, ... combine_open = 0 ... )
References
- 1
Ian X. Y. Leung, Pan Hui, Pietro Liò, and Jon Crowcrof: Towards real-time community detection in large networks
- easygraph.functions.community.LPA.LPA(G)[source]
Detect community by label propagation algorithm Return the detected communities. But the result is random. Each node in the network is initially assigned to its own community. At every iteration,nodes have a label that the maximum number of their neighbors have. If there are more than one nodes fit and available, choose a label randomly. Finally, nodes having the same labels are grouped together as communities. In case two or more disconnected groups of nodes have the same label, we run a simple breadth-first search to separate the disconnected communities
- Parameters
G (graph) – A easygraph graph
- Returns
communities – key: serial number of community , value: nodes in the community.
- Return type
dictionary
Examples
>>> LPA(G)
References
- 1
Usha Nandini Raghavan, Réka Albert, and Soundar Kumara: Near linear time algorithm to detect community structures in large-scale networks
- easygraph.functions.community.LPA.SLPA(G, T, r)[source]
Detect Overlapping Communities by Speaker-listener Label Propagation Algorithm Return the detected Overlapping communities. But the result is random.
- Parameters
G (graph) – A easygraph graph.
T (int) – The number of iterations, In general, T is set greater than 20, which produces relatively stable outputs.
r (int) – a threshold between 0 and 1.
- Returns
communities – key: serial number of community , value: nodes in the community.
- Return type
dictionary
Examples
>>> SLPA(G, ... T = 20, ... r = 0.05 ... )
References
- 1
Jierui Xie, Boleslaw K. Szymanski, Xiaoming Liu: SLPA: Uncovering Overlapping Communities in Social Networks via A Speaker-listener Interaction Dynamic Process
easygraph.functions.community.ego_graph module
- easygraph.functions.community.ego_graph.ego_graph(G, n, radius=1, center=True, undirected=False, distance=None)[source]
Returns induced subgraph of neighbors centered at node n within a given radius.
- Parameters
G (graph) – A EasyGraph Graph or DiGraph
n (node) – A single node
radius (number, optional) – Include all neighbors of distance<=radius from n.
center (bool, optional) – If False, do not include center node in graph
undirected (bool, optional) – If True use both in- and out-neighbors of directed graphs.
distance (key, optional) – Use specified edge data key as distance. For example, setting distance=’weight’ will use the edge weight to measure the distance from the node n.
Notes
For directed graphs D this produces the “out” neighborhood or successors. If you want the neighborhood of predecessors first reverse the graph with D.reverse(). If you want both directions use the keyword argument undirected=True.
Node, edge, and graph attributes are copied to the returned subgraph.
easygraph.functions.community.louvain module
- easygraph.functions.community.louvain.louvain_communities(G, weight='weight', threshold=2e-05)[source]
Find the best partition of a graph using the Louvain Community Detection Algorithm.
Louvain Community Detection Algorithm is a simple method to extract the community structure of a network. This is a heuristic method based on modularity optimization. [1]_
The algorithm works in 2 steps. On the first step it assigns every node to be in its own community and then for each node it tries to find the maximum positive modularity gain by moving each node to all of its neighbor communities. If no positive gain is achieved the node remains in its original community.
The modularity gain obtained by moving an isolated node $i$ into a community $C$ can easily be calculated by the following formula (combining [1]_ [2]_ and some algebra):
\[\Delta Q = \frac{k_{i,in}}{2m} - \gamma\frac{ \Sigma_{tot} \cdot k_i}{2m^2}\]where $m$ is the size of the graph, $k_{i,in}$ is the sum of the weights of the links from $i$ to nodes in $C$, $k_i$ is the sum of the weights of the links incident to node $i$, $Sigma_{tot}$ is the sum of the weights of the links incident to nodes in $C$ and $gamma$ is the resolution parameter.
For the directed case the modularity gain can be computed using this formula according to 3
\[\Delta Q = \frac{k_{i,in}}{m} - \gamma\frac{k_i^{out} \cdot\Sigma_{tot}^{in} + k_i^{in} \cdot \Sigma_{tot}^{out}}{m^2}\]where $k_i^{out}$, $k_i^{in}$ are the outer and inner weighted degrees of node $i$ and $Sigma_{tot}^{in}$, $Sigma_{tot}^{out}$ are the sum of in-going and out-going links incident to nodes in $C$.
The first phase continues until no individual move can improve the modularity.
The second phase consists in building a new network whose nodes are now the communities found in the first phase. To do so, the weights of the links between the new nodes are given by the sum of the weight of the links between nodes in the corresponding two communities. Once this phase is complete it is possible to reapply the first phase creating bigger communities with increased modularity.
The above two phases are executed until no modularity gain is achieved (or is less than the threshold).
- Parameters
threshold –
G (easygraph) –
weight (string or None, optional (default="weight")) – The name of an edge attribute that holds the numerical value used as a weight. If None then each edge has weight 1.
- Returns
A list of sets (partition of G). Each set represents one community and contains all the nodes that constitute it.
- Return type
list
Notes
The order in which the nodes are considered can affect the final output. In the algorithm the ordering happens using a random shuffle.
References
- 1
Blondel, V.D. et al. Fast unfolding of communities in large networks. J. Stat. Mech 10008, 1-12(2008). https://doi.org/10.1088/1742-5468/2008/10/P10008
- 2
Traag, V.A., Waltman, L. & van Eck, N.J. From Louvain to Leiden: guaranteeing well-connected communities. Sci Rep 9, 5233 (2019). https://doi.org/10.1038/s41598-019-41695-z
- 3
Nicolas Dugu��, Anthony Perez. Directed Louvain : maximizing modularity in directed networks. [Research Report] Universit�� d��Orl��ans. 2015. hal-01231784. https://hal.archives-ouvertes.fr/hal-01231784
See also
- easygraph.functions.community.louvain.louvain_partitions(G, weight='weight', threshold=1e-07)[source]
Yields partitions for each level of the Louvain Community Detection Algorithm
Louvain Community Detection Algorithm is a simple method to extract the community structure of a network. This is a heuristic method based on modularity optimization. [1]_
The partitions at each level (step of the algorithm) form a dendogram of communities. A dendrogram is a diagram representing a tree and each level represents a partition of the G graph. The top level contains the smallest communities and as you traverse to the bottom of the tree the communities get bigger and the overall modularity increases making the partition better.
Each level is generated by executing the two phases of the Louvain Community Detection Algorithm.
- Parameters
threshold –
G (easygraph) –
weight (string or None, optional (default="weight")) – The name of an edge attribute that holds the numerical value used as a weight. If None then each edge has weight 1.
- Yields
list – A list of sets (partition of G). Each set represents one community and contains all the nodes that constitute it.
References
- 1
Blondel, V.D. et al. Fast unfolding of communities in large networks. J. Stat. Mech 10008, 1-12(2008)
See also
easygraph.functions.community.modularity module
- easygraph.functions.community.modularity.modularity(G, communities, weight='weight')[source]
Returns the modularity of the given partition of the graph. Modularity is defined in [1]_ as
\[Q = \frac{1}{2m} \sum_{ij} \left( A_{ij} - \frac{k_ik_j}{2m}\right) \delta(c_i,c_j)\]where m is the number of edges, A is the adjacency matrix of G,
\[k_i\ is\ the\ degree\ of\ i\ and\ \delta(c_i, c_j)\ is\ 1\ if\ i\ and\ j\ are\ in\ the\ same\ community\ and\ 0\ otherwise.\]- Parameters
G (easygraph.Graph or easygraph.DiGraph) –
communities (list or iterable of set of nodes) – These node sets must represent a partition of G’s nodes.
weight (string, optional (default : 'weight')) – The key for edge weight.
- Returns
Q – The modularity of the partition.
- Return type
float
References
- 1
M. E. J. Newman Networks: An Introduction, page 224. Oxford University Press, 2011.
easygraph.functions.community.modularity_max_detection module
- easygraph.functions.community.modularity_max_detection.greedy_modularity_communities(G, weight='weight')[source]
Communities detection via greedy modularity method.
Find communities in graph using Clauset-Newman-Moore greedy modularity maximization. This method currently supports the Graph class.
Greedy modularity maximization begins with each node in its own community and joins the pair of communities that most increases modularity until no such pair exists.
- Parameters
G (easygraph.Graph or easygraph.DiGraph) –
weight (string (default : 'weight')) – The key for edge weight. For undirected graph, it will regard each edge weight as 1.
- Return type
Yields sets of nodes, one for each community.
References
- 1
Newman, M. E. J. “Networks: An Introduction Oxford Univ.” (2010).
- 2
Clauset, Aaron, Mark EJ Newman, and Cristopher Moore.
“Finding community structure in very large networks.” Physical review E 70.6 (2004): 066111.
easygraph.functions.community.motif module
- easygraph.functions.community.motif.enumerate_subgraph(G, k: int)[source]
Returns the motifs. Motifs are small weakly connected induced subgraphs of a given structure in a graph.
- Parameters
G (easygraph.Graph or easygraph.DiGraph.) –
k (int) – The size of the motifs to search for.
- Returns
k_subgraphs – The motifs.
- Return type
list
References
- 1
Wernicke, Sebastian. “Efficient detection of network motifs.” IEEE/ACM transactions on computational biology and bioinformatics 3.4 (2006): 347-359.
- easygraph.functions.community.motif.random_enumerate_subgraph(G, k: int, cut_prob: list)[source]
Returns the motifs. Motifs are small weakly connected induced subgraphs of a given structure in a graph.
- Parameters
G (easygraph.Graph or easygraph.DiGraph.) –
k (int) – The size of the motifs to search for.
cut_prob (list) – list of probabilities for cutting the search tree at a given level.
- Returns
k_subgraphs – The motifs.
- Return type
list
References
- 1
Wernicke, Sebastian. “A faster algorithm for detecting network motifs.”
International Workshop on Algorithms in Bioinformatics. Springer, Berlin, Heidelberg, 2005.