python-igraph API reference

module documentation

Classes related to graph clustering.

Class	`Clustering`	Class representing a clustering of an arbitrary ordered set.
Class	`CohesiveBlocks`	The cohesive block structure of a graph.
Class	`Cover`	Class representing a cover of an arbitrary ordered set.
Class	`Dendrogram`	The hierarchical clustering (dendrogram) of some dataset.
Class	`VertexClustering`	The clustering of the vertex set of a graph.
Class	`VertexCover`	The cover of the vertex set of a graph.
Class	`VertexDendrogram`	The dendrogram resulting from the hierarchical clustering of the vertex set of a graph.
Function	`compare_communities`	Compares two community structures using various distance measures.
Function	`split_join_distance`	Calculates the split-join distance between two community structures.
Function	`_biconnected_components`	Calculates the biconnected components of the graph.
Function	`_clusters`	Deprecated alias to `Graph.connected_components()`.
Function	`_cohesive_blocks`	Calculates the cohesive block structure of the graph.
Function	`_connected_components`	Calculates the (strong or weak) connected components for a given graph.
Function	`_handle_mark_groups_arg_for_clustering`	Handles the mark_groups=... keyword argument in plotting methods of clusterings.
Function	`_prepare_community_comparison`	Auxiliary method that takes two community structures either as membership lists or instances of `Clustering`, and returns a tuple whose two elements are membership lists.

def compare_communities(comm1, comm2, method='vi', remove_none=False): ¶

Compares two community structures using various distance measures.

Parameters
comm1	the first community structure as a membership list or as a `Clustering` object.
comm2	the second community structure as a membership list or as a `Clustering` object.
method	the measure to use. `"vi"` or `"meila"` means the variation of information metric of Meila (2003), `"nmi"` or `"danon"` means the normalized mutual information as defined by Danon et al (2005), `"split-join"` means the split-join distance of van Dongen (2000), `"rand"` means the Rand index of Rand (1971), `"adjusted_rand"` means the adjusted Rand index of Hubert and Arabie (1985).
remove_none	whether to remove `None` entries from the membership lists. This is handy if your `Clustering` object was constructed using `VertexClustering.FromAttribute` using an attribute which was not defined for all the vertices. If `remove_none` is `False`, a `None` entry in either `comm1` or `comm2` will result in an exception. If `remove_none` is `True`, `None` values are filtered away and only the remaining lists are compared.
Returns
the calculated measure.
Unknown Field: newfield
ref	Reference
Unknown Field: ref
Meila M: Comparing clusterings by the variation of information. In: Scholkopf B, Warmuth MK (eds). Learning Theory and Kernel Machines: 16th Annual Conference on Computational Learning Theory and 7th Kernel Workship, COLT/Kernel 2003, Washington, DC, USA. Lecture Notes in Computer Science, vol. 2777, Springer, 2003. ISBN: 978-3-540-40720-1.
Danon L, Diaz-Guilera A, Duch J, Arenas A: Comparing community structure identification. J Stat Mech P09008, 2005.
van Dongen D: Performance criteria for graph clustering and Markov cluster experiments. Technical Report INS-R0012, National Research Institute for Mathematics and Computer Science in the Netherlands, Amsterdam, May 2000.
Rand WM: Objective criteria for the evaluation of clustering methods. J Am Stat Assoc 66(336):846-850, 1971.
Hubert L and Arabie P: Comparing partitions. Journal of Classification 2:193-218, 1985.

def split_join_distance(comm1, comm2, remove_none=False): ¶

Calculates the split-join distance between two community structures.

The split-join distance is a distance measure defined on the space of partitions of a given set. It is the sum of the projection distance of one partition from the other and vice versa, where the projection number of A from B is if calculated as follows:

For each set in A, find the set in B with which it has the maximal overlap, and take note of the size of the overlap.
Take the sum of the maximal overlap sizes for each set in A.
Subtract the sum from n, the number of elements in the partition.

Note that the projection distance is asymmetric, that's why it has to be calculated in both directions and then added together. This function returns the projection distance of comm1 from comm2 and the projection distance of comm2 from comm1, and returns them in a pair. The actual split-join distance is the sum of the two distances. The reason why it is presented this way is that one of the elements being zero then implies that one of the partitions is a subpartition of the other (and if it is close to zero, then one of the partitions is close to being a subpartition of the other).

Parameters
comm1	the first community structure as a membership list or as a `Clustering` object.
comm2	the second community structure as a membership list or as a `Clustering` object.
remove_none	whether to remove `None` entries from the membership lists. This is handy if your `Clustering` object was constructed using `VertexClustering.FromAttribute` using an attribute which was not defined for all the vertices. If `remove_none` is `False`, a `None` entry in either `comm1` or `comm2` will result in an exception. If `remove_none` is `True`, `None` values are filtered away and only the remaining lists are compared.
Returns
the projection distance of `comm1` from `comm2` and vice versa in a tuple. The split-join distance is the sum of the two.
See Also
`compare_communities()` with `method = "split-join"` if you are not interested in the individual projection distances but only the sum of them.
Unknown Field: newfield
ref	Reference
Unknown Field: ref
van Dongen D: Performance criteria for graph clustering and Markov cluster experiments. Technical Report INS-R0012, National Research Institute for Mathematics and Computer Science in the Netherlands, Amsterdam, May 2000.

def _biconnected_components(graph, return_articulation_points=False): ¶

Calculates the biconnected components of the graph.

Parameters
graph	Undocumented
return_articulation_points	whether to return the articulation points as well
Returns
a `VertexCover` object describing the biconnected components, and optionally the list of articulation points as well

def _clusters(graph, mode='strong'): ¶

Deprecated alias to Graph.connected_components().

def _cohesive_blocks(graph): ¶

Calculates the cohesive block structure of the graph.

Cohesive blocking is a method of determining hierarchical subsets of graph vertices based on their structural cohesion (i.e. vertex connectivity). For a given graph G, a subset of its vertices S is said to be maximally k-cohesive if there is no superset of S with vertex connectivity greater than or equal to k. Cohesive blocking is a process through which, given a k-cohesive set of vertices, maximally l-cohesive subsets are recursively identified with l > k. Thus a hierarchy of vertex subsets is obtained in the end, with the entire graph G at its root.

Returns
an instance of `CohesiveBlocks`. See the documentation of `CohesiveBlocks` for more information.
See Also
`CohesiveBlocks`

def _connected_components(graph, mode='strong'): ¶

Calculates the (strong or weak) connected components for a given graph.

Parameters
graph	Undocumented
mode	must be either `"strong"` or `"weak"`, depending on the connected components being sought. Optional, defaults to `"strong"`.
Returns
a `VertexClustering` object

def _handle_mark_groups_arg_for_clustering(mark_groups, clustering): ¶

Handles the mark_groups=... keyword argument in plotting methods of clusterings.

This is an internal method, you shouldn't need to mess around with it. Its purpose is to handle the extended semantics of the mark_groups=... keyword argument in the __plot__ method of VertexClustering and VertexCover instances, namely the feature that numeric IDs are resolved to clusters automatically.

def _prepare_community_comparison(comm1, comm2, remove_none=False): ¶

Auxiliary method that takes two community structures either as membership lists or instances of Clustering, and returns a tuple whose two elements are membership lists.

This is used by compare_communities and split_join_distance.

Parameters
comm1	the first community structure as a membership list or as a `Clustering` object.
comm2	the second community structure as a membership list or as a `Clustering` object.
remove_none	whether to remove `None` entries from the membership lists. If `remove_none` is `False`, a `None` entry in either `comm1` or `comm2` will result in an exception. If `remove_none` is `True`, `None` values are filtered away and only the remaining lists are compared.

python-igraph API reference

igraph.clustering

`igraph.clustering`