easygraph.functions.community package

Submodules

easygraph.functions.community.LPA module

easygraph.functions.community.LPA.BMLPA(G, p)[source]

Detect community by Balanced Multi-Label Propagation algorithm

Return the detected communities.

Firstly, initialize ‘old’ using cores generated by RC function, the propagate label till the number and size of communities stay no change, check if there are subcommunity and delete it. Finally, split discontinuous communities.

For some directed graphs lead to oscillations of labels, modify the stop condition.

Parameters
  • G (graph) – A easygraph graph

  • p (float) – Between 0 and 1, judge Whether a community identifier should be retained

Returns

communities – key: serial number of community , value: nodes in the community.

Return type

dictionary

Examples

>>> BMLPA(G,
...     p = 0.1,
...     )

References

1

Wu Zhihao, Lin You-Fang, Gregory Steve, Wan Huai-Yu, Tian Sheng-Feng Balanced Multi-Label Propagation for Overlapping Community Detection in Social Networks

easygraph.functions.community.LPA.HANP(G, m, delta, threshod=1, hier_open=0, combine_open=0)[source]

Detect community by Hop attenuation & node preference algorithm

Return the detected communities. But the result is random.

Implement the basic HANP algorithm and give more freedom through the parameters, e.g., you can use threshod to set the condition for node updating. If network are known to be Hierarchical and overlapping communities, it’s recommended to choose geodesic distance as the measure(instead of receiving the current hop scores from the neighborhood and carry out a subtraction) and When an equilibrium is reached, treat newly combined communities as a single node.

For using Floyd to get the shortest distance, the time complexity is a little high.

Parameters
  • G (graph) – A easygraph graph

  • m (float) – Used to calculate score, when m > 0, more preference is given to node with more neighbors; m < 0, less

  • delta (float) – Hop attenuation

  • threshod (float) – Between 0 and 1, only update node whose number of neighbors sharing the maximal label is less than the threshod. e.g., threshod == 1 means updating all nodes.

  • hier_open – 1 means using geodesic distance as the score measure. 0 means not.

  • combine_open – this option is valid only when hier_open = 1 1 means When an equilibrium is reached, treat newly combined communities as a single node. 0 means not.

Returns

communities – key: serial number of community , value: nodes in the community.

Return type

dictionary

Examples

>>> HANP(G,
...     m = 0.1,
...     delta = 0.05,
...     threshod = 1,
...     hier_open = 0,
...     combine_open = 0
...     )

References

1

Ian X. Y. Leung, Pan Hui, Pietro Liò, and Jon Crowcrof: Towards real-time community detection in large networks

easygraph.functions.community.LPA.LPA(G)[source]

Detect community by label propagation algorithm Return the detected communities. But the result is random. Each node in the network is initially assigned to its own community. At every iteration,nodes have a label that the maximum number of their neighbors have. If there are more than one nodes fit and available, choose a label randomly. Finally, nodes having the same labels are grouped together as communities. In case two or more disconnected groups of nodes have the same label, we run a simple breadth-first search to separate the disconnected communities

Parameters

G (graph) – A easygraph graph

Returns

communities – key: serial number of community , value: nodes in the community.

Return type

dictionary

Examples

>>> LPA(G)

References

1

Usha Nandini Raghavan, Réka Albert, and Soundar Kumara: Near linear time algorithm to detect community structures in large-scale networks

easygraph.functions.community.LPA.SLPA(G, T, r)[source]

Detect Overlapping Communities by Speaker-listener Label Propagation Algorithm Return the detected Overlapping communities. But the result is random.

Parameters
  • G (graph) – A easygraph graph.

  • T (int) – The number of iterations, In general, T is set greater than 20, which produces relatively stable outputs.

  • r (int) – a threshold between 0 and 1.

Returns

communities – key: serial number of community , value: nodes in the community.

Return type

dictionary

Examples

>>> SLPA(G,
...     T = 20,
...     r = 0.05
...     )

References

1

Jierui Xie, Boleslaw K. Szymanski, Xiaoming Liu: SLPA: Uncovering Overlapping Communities in Social Networks via A Speaker-listener Interaction Dynamic Process

easygraph.functions.community.ego_graph module

easygraph.functions.community.ego_graph.ego_graph(G, n, radius=1, center=True, undirected=False, distance=None)[source]

Returns induced subgraph of neighbors centered at node n within a given radius.

Parameters
  • G (graph) – A EasyGraph Graph or DiGraph

  • n (node) – A single node

  • radius (number, optional) – Include all neighbors of distance<=radius from n.

  • center (bool, optional) – If False, do not include center node in graph

  • undirected (bool, optional) – If True use both in- and out-neighbors of directed graphs.

  • distance (key, optional) – Use specified edge data key as distance. For example, setting distance=’weight’ will use the edge weight to measure the distance from the node n.

Notes

For directed graphs D this produces the “out” neighborhood or successors. If you want the neighborhood of predecessors first reverse the graph with D.reverse(). If you want both directions use the keyword argument undirected=True.

Node, edge, and graph attributes are copied to the returned subgraph.

easygraph.functions.community.louvain module

easygraph.functions.community.louvain.louvain_communities(G, weight='weight', threshold=2e-05)[source]

Find the best partition of a graph using the Louvain Community Detection Algorithm.

Louvain Community Detection Algorithm is a simple method to extract the community structure of a network. This is a heuristic method based on modularity optimization. [1]_

The algorithm works in 2 steps. On the first step it assigns every node to be in its own community and then for each node it tries to find the maximum positive modularity gain by moving each node to all of its neighbor communities. If no positive gain is achieved the node remains in its original community.

The modularity gain obtained by moving an isolated node $i$ into a community $C$ can easily be calculated by the following formula (combining [1]_ [2]_ and some algebra):

\[\Delta Q = \frac{k_{i,in}}{2m} - \gamma\frac{ \Sigma_{tot} \cdot k_i}{2m^2}\]

where $m$ is the size of the graph, $k_{i,in}$ is the sum of the weights of the links from $i$ to nodes in $C$, $k_i$ is the sum of the weights of the links incident to node $i$, $Sigma_{tot}$ is the sum of the weights of the links incident to nodes in $C$ and $gamma$ is the resolution parameter.

For the directed case the modularity gain can be computed using this formula according to 3

\[\Delta Q = \frac{k_{i,in}}{m} - \gamma\frac{k_i^{out} \cdot\Sigma_{tot}^{in} + k_i^{in} \cdot \Sigma_{tot}^{out}}{m^2}\]

where $k_i^{out}$, $k_i^{in}$ are the outer and inner weighted degrees of node $i$ and $Sigma_{tot}^{in}$, $Sigma_{tot}^{out}$ are the sum of in-going and out-going links incident to nodes in $C$.

The first phase continues until no individual move can improve the modularity.

The second phase consists in building a new network whose nodes are now the communities found in the first phase. To do so, the weights of the links between the new nodes are given by the sum of the weight of the links between nodes in the corresponding two communities. Once this phase is complete it is possible to reapply the first phase creating bigger communities with increased modularity.

The above two phases are executed until no modularity gain is achieved (or is less than the threshold).

Parameters
  • threshold

  • G (easygraph) –

  • weight (string or None, optional (default="weight")) – The name of an edge attribute that holds the numerical value used as a weight. If None then each edge has weight 1.

Returns

A list of sets (partition of G). Each set represents one community and contains all the nodes that constitute it.

Return type

list

Notes

The order in which the nodes are considered can affect the final output. In the algorithm the ordering happens using a random shuffle.

References

1

Blondel, V.D. et al. Fast unfolding of communities in large networks. J. Stat. Mech 10008, 1-12(2008). https://doi.org/10.1088/1742-5468/2008/10/P10008

2

Traag, V.A., Waltman, L. & van Eck, N.J. From Louvain to Leiden: guaranteeing well-connected communities. Sci Rep 9, 5233 (2019). https://doi.org/10.1038/s41598-019-41695-z

3

Nicolas Dugu��, Anthony Perez. Directed Louvain : maximizing modularity in directed networks. [Research Report] Universit�� d��Orl��ans. 2015. hal-01231784. https://hal.archives-ouvertes.fr/hal-01231784

easygraph.functions.community.louvain.louvain_partitions(G, weight='weight', threshold=1e-07)[source]

Yields partitions for each level of the Louvain Community Detection Algorithm

Louvain Community Detection Algorithm is a simple method to extract the community structure of a network. This is a heuristic method based on modularity optimization. [1]_

The partitions at each level (step of the algorithm) form a dendogram of communities. A dendrogram is a diagram representing a tree and each level represents a partition of the G graph. The top level contains the smallest communities and as you traverse to the bottom of the tree the communities get bigger and the overall modularity increases making the partition better.

Each level is generated by executing the two phases of the Louvain Community Detection Algorithm.

Parameters
  • threshold

  • G (easygraph) –

  • weight (string or None, optional (default="weight")) – The name of an edge attribute that holds the numerical value used as a weight. If None then each edge has weight 1.

Yields

list – A list of sets (partition of G). Each set represents one community and contains all the nodes that constitute it.

References

1

Blondel, V.D. et al. Fast unfolding of communities in large networks. J. Stat. Mech 10008, 1-12(2008)

easygraph.functions.community.modularity module

easygraph.functions.community.modularity.modularity(G, communities, weight='weight')[source]

Returns the modularity of the given partition of the graph. Modularity is defined in [1]_ as

\[Q = \frac{1}{2m} \sum_{ij} \left( A_{ij} - \frac{k_ik_j}{2m}\right) \delta(c_i,c_j)\]

where m is the number of edges, A is the adjacency matrix of G,

\[k_i\ is\ the\ degree\ of\ i\ and\ \delta(c_i, c_j)\ is\ 1\ if\ i\ and\ j\ are\ in\ the\ same\ community\ and\ 0\ otherwise.\]
Parameters
  • G (easygraph.Graph or easygraph.DiGraph) –

  • communities (list or iterable of set of nodes) – These node sets must represent a partition of G’s nodes.

  • weight (string, optional (default : 'weight')) – The key for edge weight.

Returns

Q – The modularity of the partition.

Return type

float

References

1

M. E. J. Newman Networks: An Introduction, page 224. Oxford University Press, 2011.

easygraph.functions.community.modularity_max_detection module

easygraph.functions.community.modularity_max_detection.greedy_modularity_communities(G, weight='weight')[source]

Communities detection via greedy modularity method.

Find communities in graph using Clauset-Newman-Moore greedy modularity maximization. This method currently supports the Graph class.

Greedy modularity maximization begins with each node in its own community and joins the pair of communities that most increases modularity until no such pair exists.

Parameters
  • G (easygraph.Graph or easygraph.DiGraph) –

  • weight (string (default : 'weight')) – The key for edge weight. For undirected graph, it will regard each edge weight as 1.

Return type

Yields sets of nodes, one for each community.

References

1

Newman, M. E. J. “Networks: An Introduction Oxford Univ.” (2010).

2

Clauset, Aaron, Mark EJ Newman, and Cristopher Moore.

“Finding community structure in very large networks.” Physical review E 70.6 (2004): 066111.

easygraph.functions.community.motif module

easygraph.functions.community.motif.enumerate_subgraph(G, k: int)[source]

Returns the motifs. Motifs are small weakly connected induced subgraphs of a given structure in a graph.

Parameters
  • G (easygraph.Graph or easygraph.DiGraph.) –

  • k (int) – The size of the motifs to search for.

Returns

k_subgraphs – The motifs.

Return type

list

References

1

Wernicke, Sebastian. “Efficient detection of network motifs.” IEEE/ACM transactions on computational biology and bioinformatics 3.4 (2006): 347-359.

easygraph.functions.community.motif.random_enumerate_subgraph(G, k: int, cut_prob: list)[source]

Returns the motifs. Motifs are small weakly connected induced subgraphs of a given structure in a graph.

Parameters
  • G (easygraph.Graph or easygraph.DiGraph.) –

  • k (int) – The size of the motifs to search for.

  • cut_prob (list) – list of probabilities for cutting the search tree at a given level.

Returns

k_subgraphs – The motifs.

Return type

list

References

1

Wernicke, Sebastian. “A faster algorithm for detecting network motifs.”

International Workshop on Algorithms in Bioinformatics. Springer, Berlin, Heidelberg, 2005.

Module contents