igraph Reference Manual

For using the igraph C library

Search the manual:

Chapter 19. Graph motifs, dyad census and triad census

This section deals with functions which find small induced subgraphs in a graph. These were first defined for subgraphs of two and three vertices by Holland and Leinhardt, and named dyad census and triad census.

1. igraph_dyad_census — Dyad census, as defined by Holland and Leinhardt.

igraph_error_t igraph_dyad_census(const igraph_t *graph, igraph_real_t *mut,
                       igraph_real_t *asym, igraph_real_t *null);

Dyad census means classifying each pair of vertices of a directed graph into three categories: mutual (there is at least one edge from a to b and also from b to a); asymmetric (there is at least one edge either from a to b or from b to a, but not the other way) and null (no edges between a and b in either direction).

Holland, P.W. and Leinhardt, S. (1970). A Method for Detecting Structure in Sociometric Data. American Journal of Sociology, 70, 492-513.

Arguments: 

graph:

The input graph. For an undirected graph, there are no asymmetric connections.

mut:

Pointer to a real, the number of mutual dyads is stored here.

asym:

Pointer to a real, the number of asymmetric dyads is stored here.

null:

Pointer to a real, the number of null dyads is stored here.

Returns: 

Error code.

See also: 

Time complexity: O(|V|+|E|), the number of vertices plus the number of edges.

2. igraph_triad_census — Triad census, as defined by Davis and Leinhardt.

igraph_error_t igraph_triad_census(const igraph_t *graph, igraph_vector_t *res);

Calculating the triad census means classifying every triple of vertices in a directed graph based on the type of pairwise connections it contains, i.e. mutual, asymmetric or no connection. A triple can be in one of 16 states, commonly described using Davis and Leinhardt's "MAN labels". The res vector will contain the counts of these in the following order:

 0: 003

A, B, C, the empty graph.

 1: 012

A->B, C, a graph with a single directed edge.

 2: 102

A<->B, C, a graph with a mutual connection between two vertices.

 3: 021D

A<-B->C, the binary out-tree.

 4: 021U

A->B<-C, the binary in-tree.

 5: 021C

A->B->C, the directed line.

 6: 111D

A<->B<-C.

 7: 111U

A<->B->C.

 8: 030T

A->B<-C, A->C.

 9: 030C

A<-B<-C, A->C.

10: 201

A<->B<->C.

11: 120D

A<-B->C, A<->C.

12: 120U

A->B<-C, A<->C.

13: 120C

A->B->C, A<->C.

14: 210

A->B<->C, A<->C.

15: 300

A<->B<->C, A<->C, the complete graph.

This function is intended for directed graphs. If the input is undirected, a warning is shown, and undirected edges will be interpreted as mutual.

This function calls igraph_motifs_randesu() which is an implementation of the FANMOD motif finder tool, see igraph_motifs_randesu() for details. Note that the order of the triads is not the same for igraph_triad_census() and igraph_motifs_randesu().

References:

Davis, J.A. and Leinhardt, S. (1972). The Structure of Positive Interpersonal Relations in Small Groups. In J. Berger (Ed.), Sociological Theories in Progress, Volume 2, 218-251. Boston: Houghton Mifflin.

Arguments: 

graph:

The input graph.

res:

Pointer to an initialized vector, the result is stored here in the same order as given in the list above. Note that this order is different than the one used by igraph_motifs_randesu().

Returns: 

Error code.

See also: 

Time complexity: TODO.

3. Finding triangles

3.1. igraph_adjacent_triangles — Count the number of triangles a vertex is part of.

igraph_error_t igraph_adjacent_triangles(const igraph_t *graph,
                              igraph_vector_t *res,
                              const igraph_vs_t vids);

Arguments: 

graph:

The input graph. Edge directions and multiplicities are ignored.

res:

Initiliazed vector, the results are stored here.

vids:

The vertices to perform the calculation for.

Returns: 

Error mode.

See also: 

igraph_list_triangles() to list them.

Time complexity: O(d^2 n), d is the average vertex degree of the queried vertices, n is their number.

3.2. igraph_list_triangles — Find all triangles in a graph.

igraph_error_t igraph_list_triangles(const igraph_t *graph,
                          igraph_vector_int_t *res);

The triangles are reported as a long list of vertex ID triplets. Use the int variant of igraph_matrix_view_from_vector() to create a matrix view into the vector where each triangle is stored in a column of the matrix (see the example).

Arguments: 

graph:

The input graph, edge directions are ignored. Multiple edges are ignored.

res:

Pointer to an initialized integer vector, the result is stored here, in a long list of triples of vertex IDs. Each triple is a triangle in the graph. Each triangle is listed exactly once.

Returns: 

Error code.

See also: 

igraph_transitivity_undirected() to count the triangles, igraph_adjacent_triangles() to count the triangles a vertex participates in.

Time complexity: O(d^2 n), d is the average degree, n is the number of vertices.

Example 19.1.  File examples/simple/igraph_list_triangles.c

#include <igraph.h>

int main(void) {
    igraph_t g;
    igraph_vector_int_t v;
    igraph_matrix_int_t result;

    igraph_full(&g, 5, 0, IGRAPH_NO_LOOPS);

    printf("Triangles in a full graph of 5 vertices:\n");
    igraph_vector_int_init(&v, 0);
    igraph_list_triangles(&g, &v);
    igraph_matrix_int_view_from_vector(&result, &v, /* nrow = */ 3);
    igraph_matrix_int_print(&result);
    igraph_vector_int_destroy(&v);
    igraph_destroy(&g);

    return 0;
}


4. Graph motifs

4.1. igraph_motifs_randesu — Count the number of motifs in a graph.

igraph_error_t igraph_motifs_randesu(const igraph_t *graph, igraph_vector_t *hist,
                          igraph_integer_t size, const igraph_vector_t *cut_prob);

Motifs are small weakly connected induced subgraphs of a given structure in a graph. It is argued that the motif profile (i.e. the number of different motifs in the graph) is characteristic for different types of networks and network function is related to the motifs in the graph.

This function is able to find directed motifs of sizes three and four and undirected motifs of sizes three to six (i.e. the number of different subgraphs with three to six vertices in the network).

In a big network the total number of motifs can be very large, so it takes a lot of time to find all of them. In this case, a sampling method can be used. This function is capable of doing sampling via the cut_prob argument. This argument gives the probability that a branch of the motif search tree will not be explored. See S. Wernicke and F. Rasche: FANMOD: a tool for fast network motif detection, Bioinformatics 22(9), 1152--1153, 2006 for details. https://doi.org/10.1093/bioinformatics/btl038

Set the cut_prob argument to a zero vector for finding all motifs.

Directed motifs will be counted in directed graphs and undirected motifs in undirected graphs.

Arguments: 

graph:

The graph to find the motifs in.

hist:

The result of the computation, it gives the number of motifs found for each isomorphism class. See igraph_isoclass() for help about isomorphism classes. Note that this function does not count isomorphism classes that are not connected and will report NaN (more precisely IGRAPH_NAN) for them.

size:

The size of the motifs to search for. For directed graphs, only 3 and 4 are implemented, for undirected, 3 to 6. The limitation is not in the motif finding code, but the graph isomorphism code.

cut_prob:

Vector of probabilities for cutting the search tree at a given level. The first element is the first level, etc. Supply all zeros here (of length size) to find all motifs in a graph.

Returns: 

Error code.

See also: 

igraph_motifs_randesu_estimate() for estimating the number of motifs in a graph, this can help to set the cut_prob parameter; igraph_motifs_randesu_no() to calculate the total number of motifs of a given size in a graph; igraph_motifs_randesu_callback() for calling a callback function for every motif found; igraph_subisomorphic_lad() for finding subgraphs on more than 4 (directed) or 6 (undirected) vertices; igraph_graph_count() to find the number of graph on a given number of vertices, i.e. the length of the hist vector.

Time complexity: TODO.

Example 19.2.  File examples/simple/igraph_motifs_randesu.c

#include <igraph.h>

/* This is a callback function suitable for use with igraph_motifs_randesu_callback().
 * It prints each motif it is calld with. */
igraph_error_t print_motif(const igraph_t *graph, igraph_vector_int_t *vids,
                          igraph_integer_t isoclass, void* extra) {
    printf("Found isoclass %2" IGRAPH_PRId ":  ", isoclass);
    igraph_vector_int_print(vids);
    return IGRAPH_SUCCESS; /* Return 'IGRAPH_SUCCESS': do not interrupt the search. */
}

int main(void) {

    igraph_t graph;
    igraph_vector_t hist;
    igraph_real_t zeros[] = { 0.0, 0.0, 0.0, 0.0, 0.0 };
    igraph_vector_t cut_prob;

    /* Compute the 4-motif distritbuion in Zachary's karate club network. */

    igraph_famous(&graph, "Zachary");
    igraph_vector_init(&hist, 0);

    igraph_motifs_randesu(&graph, &hist, 4, igraph_vector_view(&cut_prob, zeros, 4));

    /* Compute the total number of motifs (connected 4-vertex subgraphs)
     * so that we can print the normalized distribution. */
    igraph_real_t sum = 0.0;
    igraph_integer_t n = igraph_vector_size(&hist);
    for (igraph_integer_t i=0; i < n; i++) {
        if (!isnan(VECTOR(hist)[i])) {
            sum += VECTOR(hist)[i];
        }
    }
    printf("4-motif distribution:\n");
    for (igraph_integer_t i=0; i < n; i++) {
        /* Print NaN values in a platform-independent manner: */
        igraph_real_printf(VECTOR(hist)[i] / sum);
        printf(" ");
    }
    printf("\n\n");

    igraph_vector_destroy(&hist);
    igraph_destroy(&graph);

    /* Identify the vertices of each three-motif in a small Kautz graph. */

    igraph_kautz(&graph, 2, 1);
    igraph_motifs_randesu_callback(&graph, 3, igraph_vector_view(&cut_prob, zeros, 3), &print_motif, NULL);
    igraph_destroy(&graph);

    return 0;
}


4.2. igraph_motifs_randesu_no — Count the total number of motifs in a graph.

igraph_error_t igraph_motifs_randesu_no(const igraph_t *graph, igraph_integer_t *no,
                             igraph_integer_t size, const igraph_vector_t *cut_prob);

This function counts the total number of (weakly) connected induced subgraphs on size vertices, without assigning isomorphism classes to them. Arbitrarily large motif sizes are supported.

Arguments: 

graph:

The graph object to study.

no:

Pointer to an integer type, the result will be stored here.

size:

The size of the motifs to count.

cut_prob:

Vector giving the probabilities that a branch of the search tree will be cut at a given level.

Returns: 

Error code.

See also: 

Time complexity: TODO.

4.3. igraph_motifs_randesu_estimate — Estimate the total number of motifs in a graph.

igraph_error_t igraph_motifs_randesu_estimate(const igraph_t *graph, igraph_integer_t *est,
                                   igraph_integer_t size, const igraph_vector_t *cut_prob,
                                   igraph_integer_t sample_size,
                                   const igraph_vector_int_t *parsample);

This function estimates the total number of (weakly) connected induced subgraphs on size vertices. For example, an undirected complete graph on n vertices will have one motif of size n, and n motifs of size n - 1. As another example, one triangle and a separate vertex will have zero motifs of size four.

This function is useful for large graphs for which it is not feasible to count all connected subgraphs, as there are too many of them.

The estimate is made by taking a sample of vertices and counting all connected subgraphs in which these vertices are included. There is also a cut_prob parameter which gives the probabilities to cut a branch of the search tree.

Arguments: 

graph:

The graph object to study.

est:

Pointer to an integer, the result will be stored here.

size:

The size of the subgraphs to look for.

cut_prob:

Vector giving the probabilities to cut a branch of the search tree and omit counting the motifs in that branch. It contains a probability for each level. Supply size zeros here to count all the motifs in the sample.

sample_size:

The number of vertices to use as the sample. This parameter is only used if the parsample argument is a null pointer.

parsample:

Either pointer to an initialized vector or a null pointer. If a vector then the vertex IDs in the vector are used as a sample. If a null pointer then the sample_size argument is used to create a sample of vertices drawn with uniform probability.

Returns: 

Error code.

See also: 

Time complexity: TODO.

4.4. igraph_motifs_randesu_callback — Finds motifs in a graph and calls a function for each of them.

igraph_error_t igraph_motifs_randesu_callback(const igraph_t *graph, igraph_integer_t size,
                                   const igraph_vector_t *cut_prob, igraph_motifs_handler_t *callback,
                                   void* extra);

Similarly to igraph_motifs_randesu(), this function is able to find directed motifs of sizes three and four and undirected motifs of sizes three to six (i.e. the number of different subgraphs with three to six vertices in the network). However, instead of counting them, the function will call a callback function for each motif found to allow further tests or post-processing.

The cut_prob argument also allows sampling the motifs, just like for igraph_motifs_randesu(). Set the cut_prob argument to a zero vector for finding all motifs.

Arguments: 

graph:

The graph to find the motifs in.

size:

The size of the motifs to search for. Only three and four are implemented currently. The limitation is not in the motif finding code, but the graph isomorphism code.

cut_prob:

Vector of probabilities for cutting the search tree at a given level. The first element is the first level, etc. Supply all zeros here (of length size) to find all motifs in a graph.

callback:

A pointer to a function of type igraph_motifs_handler_t. This function will be called whenever a new motif is found.

extra:

Extra argument to pass to the callback function.

Returns: 

Error code.

Time complexity: TODO.

Example 19.3.  File examples/simple/igraph_motifs_randesu.c

#include <igraph.h>

/* This is a callback function suitable for use with igraph_motifs_randesu_callback().
 * It prints each motif it is calld with. */
igraph_error_t print_motif(const igraph_t *graph, igraph_vector_int_t *vids,
                          igraph_integer_t isoclass, void* extra) {
    printf("Found isoclass %2" IGRAPH_PRId ":  ", isoclass);
    igraph_vector_int_print(vids);
    return IGRAPH_SUCCESS; /* Return 'IGRAPH_SUCCESS': do not interrupt the search. */
}

int main(void) {

    igraph_t graph;
    igraph_vector_t hist;
    igraph_real_t zeros[] = { 0.0, 0.0, 0.0, 0.0, 0.0 };
    igraph_vector_t cut_prob;

    /* Compute the 4-motif distritbuion in Zachary's karate club network. */

    igraph_famous(&graph, "Zachary");
    igraph_vector_init(&hist, 0);

    igraph_motifs_randesu(&graph, &hist, 4, igraph_vector_view(&cut_prob, zeros, 4));

    /* Compute the total number of motifs (connected 4-vertex subgraphs)
     * so that we can print the normalized distribution. */
    igraph_real_t sum = 0.0;
    igraph_integer_t n = igraph_vector_size(&hist);
    for (igraph_integer_t i=0; i < n; i++) {
        if (!isnan(VECTOR(hist)[i])) {
            sum += VECTOR(hist)[i];
        }
    }
    printf("4-motif distribution:\n");
    for (igraph_integer_t i=0; i < n; i++) {
        /* Print NaN values in a platform-independent manner: */
        igraph_real_printf(VECTOR(hist)[i] / sum);
        printf(" ");
    }
    printf("\n\n");

    igraph_vector_destroy(&hist);
    igraph_destroy(&graph);

    /* Identify the vertices of each three-motif in a small Kautz graph. */

    igraph_kautz(&graph, 2, 1);
    igraph_motifs_randesu_callback(&graph, 3, igraph_vector_view(&cut_prob, zeros, 3), &print_motif, NULL);
    igraph_destroy(&graph);

    return 0;
}


4.5. igraph_motifs_handler_t — Callback type for igraph_motifs_randesu_callback.

typedef igraph_error_t igraph_motifs_handler_t(const igraph_t *graph,
        igraph_vector_int_t *vids,
        igraph_integer_t isoclass,
        void* extra);

igraph_motifs_randesu_callback() calls a specified callback function whenever a new motif is found during a motif search. This callback function must be of type igraph_motifs_handler_t. It has the following arguments:

Arguments: 

graph:

The graph that that algorithm is working on. Of course this must not be modified.

vids:

The IDs of the vertices in the motif that has just been found. This vector is owned by the motif search algorithm, so do not modify or destroy it; make a copy of it if you need it later.

isoclass:

The isomorphism class of the motif that has just been found. Use igraph_graph_count() to find the maximum possible isoclass for graphs of a given size. See igraph_isoclass and igraph_isoclass_subgraph for more information.

extra:

The extra argument that was passed to igraph_motifs_randesu_callback().

Returns: 

IGRAPH_SUCCESS to continue the motif search, IGRAPH_STOP to stop the motif search and return to the caller normally. Any other return value is interpreted as an igraph error code, which will terminate the search and return the same error code to the caller.

See also: