igraph Reference Manual

For using the igraph C library

Chapter 32. Non-graph related functions

1. igraph version number
2. Running mean of a time series
3. Random sampling from very long sequences
4. Random sampling of spatial points
5. Convex hull of a set of points on a plane
6. Fitting power-law distributions to empirical data
7. Comparing floats with a tolerance

1. igraph version number

1.1. igraph_version — The version of the igraph C library.

1.1. `igraph_version` — The version of the igraph C library.

void igraph_version(const char **version_string,
                    int *major,
                    int *minor,
                    int *subminor);

Arguments:

`version_string`:	Pointer to a string pointer. If not null, it is set to the igraph version string, e.g. "0.9.11" or "0.10.0". This string must not be modified or deallocated.
`major`:	If not a null pointer, then it is set to the major igraph version. E.g. for version "0.9.11" this is 0.
`minor`:	If not a null pointer, then it is set to the minor igraph version. E.g. for version "0.9.11" this is 11.
`subminor`:	If not a null pointer, then it is set to the subminor igraph version. E.g. for version "0.9.11" this is 11.

Example 32.1. File examples/simple/igraph_version.c

#include <igraph.h>
#include <string.h>

int main(void) {

    char tmp[100];
    const char *string;
    int major, minor, subminor;

    igraph_version(&string, &major, &minor, &subminor);
    sprintf(tmp, "%i.%i.%i", major, minor, subminor);

    if (strncmp(string, tmp, strlen(tmp))) {
        return 1;
    }

    return 0;
}

2. Running mean of a time series

2.1. igraph_running_mean — Calculates the running mean of a vector.

2.1. `igraph_running_mean` — Calculates the running mean of a vector.

igraph_error_t igraph_running_mean(const igraph_vector_t *data, igraph_vector_t *res,
                        igraph_integer_t binwidth);

The running mean is defined by the mean of the previous binwidth values.

Arguments:

`data`:	The vector containing the data.
`res`:	The vector containing the result. This should be initialized before calling this function and will be resized.
`binwidth`:	Integer giving the width of the bin for the running mean calculation.

Returns:

Error code.

Time complexity: O(n), n is the length of the data vector.

3. Random sampling from very long sequences

3.1. igraph_random_sample — Generates an increasing random sequence of integers.

3.1. `igraph_random_sample` — Generates an increasing random sequence of integers.

igraph_error_t igraph_random_sample(igraph_vector_int_t *res, igraph_integer_t l, igraph_integer_t h,
                         igraph_integer_t length);

This function generates an increasing sequence of random integer numbers from a given interval. The algorithm is taken literally from (Vitter 1987). This method can be used for generating numbers from a very large interval. It is primarily created for randomly selecting some edges from the sometimes huge set of possible edges in a large graph.

Reference:

J. S. Vitter. An efficient algorithm for sequential random sampling. ACM Transactions on Mathematical Software, 13(1):58--67, 1987. https://doi.org/10.1145/23002.23003

Arguments:

`res`:	Pointer to an initialized vector. This will hold the result. It will be resized to the proper size.
`l`:	The lower limit of the generation interval (inclusive). This must be less than or equal to the upper limit, and it must be integral.
`h`:	The upper limit of the generation interval (inclusive). This must be greater than or equal to the lower limit, and it must be integral.
`length`:	The number of random integers to generate.

Returns:

The error code IGRAPH_EINVAL is returned in each of the following cases: (1) The given lower limit is greater than the given upper limit, i.e. l > h. (2) Assuming that l < h and N is the sample size, the above error code is returned if N > |h - l|, i.e. the sample size exceeds the size of the candidate pool.

Time complexity: according to (Vitter 1987), the expected running time is O(length).

Example 32.2. File examples/simple/igraph_random_sample.c

#include <igraph.h>
#include <math.h>
#include <stdio.h>

#define R_INTEGER(a,b) (igraph_rng_get_integer(igraph_rng_default(), (a), (b)))

/* test parameters */
typedef struct {
    igraph_integer_t low;
    igraph_integer_t high;
    igraph_integer_t length;
    igraph_error_t retval;
} sampling_test_t;

/* Get a few random samples and test their properties. */
int random_sample_test(void) {
    const igraph_integer_t min = -1000;
    const igraph_integer_t max = 1000;
    igraph_integer_t low;       /* lower limit */
    igraph_integer_t high;      /* upper limit */
    igraph_integer_t length;    /* sample size */
    igraph_integer_t poolsize;  /* size of candidate pool */
    igraph_real_t sP;           /* population total sum */
    igraph_real_t ss;           /* sample total sum */
    igraph_vector_int_t V;
    igraph_integer_t i;

    igraph_rng_seed(igraph_rng_default(), 57); /* make tests deterministic */

    /* The generated sequence of numbers must be in increasing order. */
    igraph_vector_int_init(&V, /*size*/ 0);
    do {
        high = (igraph_integer_t)R_INTEGER(min, max);
    } while (high == min);
    do {
        low = (igraph_integer_t)R_INTEGER(min, max);
    } while (low >= high);
    poolsize = (igraph_integer_t)fabs((double)high - (double)low);
    do {
        length = (igraph_integer_t)R_INTEGER(1, max);
    } while (length > poolsize);
    igraph_random_sample(&V, low, high, length);
    if (length != igraph_vector_int_size(&V)) {
        printf("Requested vector length and resulting length mismatch.\n");
        return 1;
    }
    for (i = 0; i < length - 1; i++) {
        if (VECTOR(V)[i] >= VECTOR(V)[i + 1]) {
            printf("Sample not in increasing order.\n");
            return 1;
        }
    }
    igraph_vector_int_destroy(&V);

    /* Let P be a candidate pool of positive integers with total sum s_P. */
    /* Let S be a random sample from P and having total sum s_S. Then we */
    /* have the bound s_s <= s_P. */
    igraph_vector_int_init(&V, /*size*/ 0);
    low = 1;
    do {
        high = (igraph_integer_t)R_INTEGER(low, max);
    } while (high == low);
    poolsize = (igraph_integer_t)fabs((double)high - (double)low);
    do {
        length = (igraph_integer_t)R_INTEGER(low, max);
    } while (length > poolsize);
    igraph_random_sample(&V, low, high, length);
    /* Use Gauss' formula to sum all consecutive positive integers from 1 */
    /* up to and including an upper limit. In LaTeX, Gauss' formula is */
    /* \sum_{i=1}^n i = \frac{n(n+1)}{2} where n is the upper limit. */
    sP = (high * (high + 1)) / 2;
    ss = igraph_vector_int_sum(&V);
    if (ss > sP) {
        printf("Sum of sampled sequence exceeds sum of whole population.\n");
        return 1;
    }
    igraph_vector_int_destroy(&V);

    return 0;
}

int equal_test(void) {
    igraph_vector_int_t V;

    igraph_vector_int_init(&V, 0);

    igraph_random_sample(&V, 0, 0, 1);
    if (igraph_vector_int_size(&V) != 1) {
        return 1;
    }
    if (VECTOR(V)[0] != 0) {
        return 2;
    }

    igraph_random_sample(&V, 10, 10, 1);
    if (igraph_vector_int_size(&V) != 1) {
        return 3;
    }
    if (VECTOR(V)[0] != 10) {
        return 4;
    }

    igraph_random_sample(&V, 2, 12, 11);
    if (igraph_vector_int_size(&V) != 11) {
        return 5;
    }
    for (igraph_integer_t i = 0; i < 11; i++)
        if (VECTOR(V)[i] != i + 2) {
            return 6;
        }

    igraph_vector_int_destroy(&V);
    return 0;
}

int rare_test(void) {
    igraph_vector_int_t V;

    igraph_vector_int_init(&V, 0);

    igraph_random_sample(&V, 0, 0, 1);
    if (igraph_vector_int_size(&V) != 1) {
        return 1;
    }
    if (VECTOR(V)[0] != 0) {
        return 2;
    }

    igraph_random_sample(&V, 10, 10, 1);
    if (igraph_vector_int_size(&V) != 1) {
        return 3;
    }
    if (VECTOR(V)[0] != 10) {
        return 4;
    }

    igraph_vector_int_destroy(&V);
    return 0;
}

int main(void) {
    int ret;

    ret = random_sample_test();
    if (ret) {
        return 2;
    }
    ret = equal_test();
    if (ret) {
        return 3;
    }
    ret = rare_test();
    if (ret) {
        return 4;
    }

    return 0;
}

4. Random sampling of spatial points

4.1. igraph_sample_sphere_surface — Sample points uniformly from the surface of a sphere.
4.2. igraph_sample_sphere_volume — Sample points uniformly from the volume of a sphere.
4.3. igraph_sample_dirichlet — Sample points from a Dirichlet distribution.

4.1. `igraph_sample_sphere_surface` — Sample points uniformly from the surface of a sphere.

igraph_error_t igraph_sample_sphere_surface(igraph_integer_t dim, igraph_integer_t n,
                                 igraph_real_t radius,
                                 igraph_bool_t positive,
                                 igraph_matrix_t *res);

The center of the sphere is at the origin.

Arguments:

`dim`:	The dimension of the random vectors.
`n`:	The number of vectors to sample.
`radius`:	Radius of the sphere, it must be positive.
`positive`:	Whether to restrict sampling to the positive orthant.
`res`:	Pointer to an initialized matrix, the result is stored here, each column will be a sampled vector. The matrix is resized, as needed.

Returns:

Error code.

Time complexity: O(n*dim*g), where g is the time complexity of generating a standard normal random number.

See also:

igraph_sample_sphere_volume(), igraph_sample_dirichlet() for other similar samplers.

4.2. `igraph_sample_sphere_volume` — Sample points uniformly from the volume of a sphere.

igraph_error_t igraph_sample_sphere_volume(igraph_integer_t dim, igraph_integer_t n,
                                igraph_real_t radius,
                                igraph_bool_t positive,
                                igraph_matrix_t *res);

The center of the sphere is at the origin.

Arguments:

`dim`:	The dimension of the random vectors.
`n`:	The number of vectors to sample.
`radius`:	Radius of the sphere, it must be positive.
`positive`:	Whether to restrict sampling to the positive orthant.
`res`:	Pointer to an initialized matrix, the result is stored here, each column will be a sampled vector. The matrix is resized, as needed.

Returns:

Error code.

Time complexity: O(n*dim*g), where g is the time complexity of generating a standard normal random number.

See also:

igraph_sample_sphere_surface(), igraph_sample_dirichlet() for other similar samplers.

4.3. `igraph_sample_dirichlet` — Sample points from a Dirichlet distribution.

igraph_error_t igraph_sample_dirichlet(igraph_integer_t n, const igraph_vector_t *alpha,
                            igraph_matrix_t *res);

Arguments:

`n`:	The number of vectors to sample.
`alpha`:	The parameters of the Dirichlet distribution. They must be positive. The length of this vector gives the dimension of the generated samples.
`res`:	Pointer to an initialized matrix, the result is stored here, one sample in each column. It will be resized, as needed.

Returns:

Error code.

Time complexity: O(n * dim * g), where dim is the dimension of the sample vectors, set by the length of alpha, and g is the time complexity of sampling from a Gamma distribution.

See also:

igraph_sample_sphere_surface() and igraph_sample_sphere_volume() for other methods to sample latent vectors.

5. Convex hull of a set of points on a plane

5.1. igraph_convex_hull — Determines the convex hull of a given set of points in the 2D plane.

5.1. `igraph_convex_hull` — Determines the convex hull of a given set of points in the 2D plane.

igraph_error_t igraph_convex_hull(
    const igraph_matrix_t *data, igraph_vector_int_t *resverts,
    igraph_matrix_t *rescoords
);

The convex hull is determined by the Graham scan algorithm. See the following reference for details:

Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms, Second Edition. MIT Press and McGraw-Hill, 2001. ISBN 0262032937. Pages 949-955 of section 33.3: Finding the convex hull.

Arguments:

`data`:	vector containing the coordinates. The length of the vector must be even, since it contains X-Y coordinate pairs.
`resverts`:	the vector containing the result, e.g. the vector of vertex indices used as the corners of the convex hull. Supply `NULL` here if you are only interested in the coordinates of the convex hull corners.
`rescoords`:	the matrix containing the coordinates of the selected corner vertices. Supply `NULL` here if you are only interested in the vertex indices.

Returns:

Error code: IGRAPH_ENOMEM: not enough memory

Time complexity: O(n log(n)) where n is the number of vertices.

6. Fitting power-law distributions to empirical data

6.1. igraph_plfit_result_t — Result of fitting a power-law distribution to a vector.
6.2. igraph_power_law_fit — Fits a power-law distribution to a vector of numbers.
6.3. igraph_plfit_result_calculate_p_value — Calculates the p-value of a fitted power-law model.

6.1. `igraph_plfit_result_t` — Result of fitting a power-law distribution to a vector.

typedef struct igraph_plfit_result_t {
    igraph_bool_t continuous;
    igraph_real_t alpha;
    igraph_real_t xmin;
    igraph_real_t L;
    igraph_real_t D;
    const igraph_vector_t* data;
} igraph_plfit_result_t;

This data structure contains the result of igraph_power_law_fit(), which tries to fit a power-law distribution to a vector of numbers. The structure contains the following members:

Values:

`continuous`:	Whether the fitted power-law distribution was continuous or discrete.
`alpha`:	The exponent of the fitted power-law distribution.
`xmin`:	The minimum value from which the power-law distribution was fitted. In other words, only the values larger than `xmin` were used from the input vector.
`L`:	The log-likelihood of the fitted parameters; in other words, the probability of observing the input vector given the parameters.
`D`:	The test statistic of a Kolmogorov-Smirnov test that compares the fitted distribution with the input vector. Smaller scores denote better fit.
`p`:	The p-value of the Kolmogorov-Smirnov test; `NaN` if it has not been calculated yet. Small p-values (less than 0.05) indicate that the test rejected the hypothesis that the original data could have been drawn from the fitted power-law distribution.
`data`:	The vector containing the original input data. May not be valid any more if the caller already destroyed the vector.

6.2. `igraph_power_law_fit` — Fits a power-law distribution to a vector of numbers.

igraph_error_t igraph_power_law_fit(
    const igraph_vector_t* data, igraph_plfit_result_t* result,
    igraph_real_t xmin, igraph_bool_t force_continuous
);

This function fits a power-law distribution to a vector containing samples from a distribution (that is assumed to follow a power-law of course). In a power-law distribution, it is generally assumed that P(X=x) is proportional to x^-alpha, where x is a positive number and alpha is greater than 1. In many real-world cases, the power-law behaviour kicks in only above a threshold value xmin. The goal of this functions is to determine alpha if xmin is given, or to determine xmin and the corresponding value of alpha.

The function uses the maximum likelihood principle to determine alpha for a given xmin; in other words, the function will return the alpha value for which the probability of drawing the given sample is the highest. When xmin is not given in advance, the algorithm will attempt to find the optimal xmin value for which the p-value of a Kolmogorov-Smirnov test between the fitted distribution and the original sample is the largest. The function uses the method of Clauset, Shalizi and Newman to calculate the parameters of the fitted distribution. See the following reference for details:

Aaron Clauset, Cosma R. Shalizi and Mark E.J. Newman: Power-law distributions in empirical data. SIAM Review 51(4):661-703, 2009. https://doi.org/10.1137/070710111

Arguments:

`data`:	vector containing the samples for which a power-law distribution is to be fitted. Note that you have to provide the samples, not the probability density function or the cumulative distribution function. For example, if you wish to fit a power-law to the degrees of a graph, you can use the output of `igraph_degree` directly as an input argument to `igraph_power_law_fit`
`result`:	the result of the fitting algorithm. See `igraph_plfit_result_t` for more details. Note that the p-value of the fit is not calculated by default as it is time-consuming; you need to call `igraph_plfit_result_calculate_p_value()` to calculate the p-value itself
`xmin`:	the minimum value in the sample vector where the power-law behaviour is expected to kick in. Samples smaller than `xmin` will be ignored by the algorithm. Pass zero here if you want to include all the samples. If `xmin` is negative, the algorithm will attempt to determine its best value automatically.
`force_continuous`:	assume that the samples in the `data` argument come from a continuous distribution even if the sample vector contains integer values only (by chance). If this argument is false, igraph will assume a continuous distribution if at least one sample is non-integer and assume a discrete distribution otherwise.

Returns:

Error code: IGRAPH_ENOMEM: not enough memory IGRAPH_EINVAL: one of the arguments is invalid IGRAPH_EOVERFLOW: overflow during the fitting process IGRAPH_EUNDERFLOW: underflow during the fitting process IGRAPH_FAILURE: the underlying algorithm signaled a failure without returning a more specific error code

Time complexity: in the continuous case, O(n log(n)) if xmin is given. In the discrete case, the time complexity is dominated by the complexity of the underlying L-BFGS algorithm that is used to optimize alpha. If xmin is not given, the time complexity is multiplied by the number of unique samples in the input vector (although it should be faster in practice).

Example 32.3. File examples/simple/igraph_power_law_fit.c

#include <igraph.h>

int main(void) {
    igraph_t g;
    igraph_vector_t degree;
    igraph_plfit_result_t model;

    /* Seed random number generator to ensure reproducibility. */
    igraph_rng_seed(igraph_rng_default(), 42);

    /* Generate a BA network; degree distribution is supposed to be a power-law
     * if the graph is large enough */
    igraph_barabasi_game(
        &g, 10000, /*power=*/ 1, /*m=*/ 2,
        /* outseq= */ 0, /* outpref= */ 0, /*A=*/ 1,
        IGRAPH_UNDIRECTED, IGRAPH_BARABASI_BAG,
        /*start_from=*/ 0
    );

    /* Get the vertex degrees. We use igraph_strength() because it stores its
     * result in an igraph_vector_t */
    igraph_vector_init(&degree, 0);
    igraph_strength(&g, &degree, igraph_vss_all(), IGRAPH_ALL, IGRAPH_NO_LOOPS, 0);

    /* Fit a power-law to the degrees */
    igraph_power_law_fit(
        &degree, &model, /* xmin = */ -1,
        /* force_continuous = */ 0
    );

    /* If you also need a p-value: */
    /* igraph_plfit_result_calculate_p_value(&model, &p, 0.001); */

    printf("alpha = %.5f\n", model.alpha);
    printf("xmin = %.5f\n", model.xmin);
    printf("log-likelihood = %.5f\n", model.L);

    igraph_vector_destroy(&degree);
    igraph_destroy(&g);

    return 0;
}

6.3. `igraph_plfit_result_calculate_p_value` — Calculates the p-value of a fitted power-law model.

igraph_error_t igraph_plfit_result_calculate_p_value(
    const igraph_plfit_result_t* model, igraph_real_t* result, igraph_real_t precision
);

The p-value is calculated by resampling the input data many times in a way that the part below the fitted x_min threshold is resampled from the input data itself, while the part above the fitted x_min threshold is drawn from the fitted power-law function. A Kolmogorov-Smirnov test is then performed for each resampled dataset and its test statistic is compared with the observed test statistic from the original dataset. The fraction of resampled datasets that have a higher test statistic is the returned p-value.

Note that the precision of the returned p-value depends on the number of resampling attempts. The number of resampling trials is determined by 0.25 divided by the square of the required precision. For instance, a required precision of 0.01 means that 2500 samples will be drawn.

If igraph is compiled with OpenMP support, this function will use parallel OpenMP threads for the resampling. Each OpenMP thread gets its own instance of a random number generator. However, since the scheduling of OpenMP threads is outside our control, we cannot guarantee how many resampling instances the threads are asked to execute, thus it may happen that the random number generators are used differently between runs. If you want to obtain reproducible results, seed igraph's master RNG appropriately, and force the number of OpenMP threads to 1 early in your program, either by calling omp_set_num_threads(1) or by setting the value of the OMP_NUM_THREADS environment variable to 1.

Arguments:

`model`:	The fitted power-law model from the `igraph_power_law_fit()` function
`result`:	The calculated p-value is returned here
`precision`:	The desired precision of the p-value. Higher values correspond to longer calculation time. @return igraph_error_t

7. Comparing floats with a tolerance

7.1. igraph_cmp_epsilon — Compare two double-precision floats with a tolerance.
7.2. igraph_almost_equals — Compare two double-precision floats with a tolerance.
7.3. igraph_complex_almost_equals — Compare two complex numbers with a tolerance.

7.1. `igraph_cmp_epsilon` — Compare two double-precision floats with a tolerance.

int igraph_cmp_epsilon(double a, double b, double eps);

Determines whether two double-precision floats are "almost equal" to each other with a given level of tolerance on the relative error.

Arguments:

`a`:	The first float.
`b`:	The second float.
`eps`:	The level of tolerance on the relative error. The relative error is defined as `abs(a-b) / (abs(a) + abs(b))`. The two numbers are considered equal if this is less than `eps`.

Returns:

Zero if the two floats are nearly equal to each other within the given level of tolerance, positive number if the first float is larger, negative number if the second float is larger.

7.2. `igraph_almost_equals` — Compare two double-precision floats with a tolerance.

igraph_bool_t igraph_almost_equals(double a, double b, double eps);

Determines whether two double-precision floats are "almost equal" to each other with a given level of tolerance on the relative error.

Arguments:

`a`:	The first float.
`b`:	The second float.
`eps`:	The level of tolerance on the relative error. The relative error is defined as `abs(a-b) / (abs(a) + abs(b))`. The two numbers are considered equal if this is less than `eps`.

Returns:

True if the two floats are nearly equal to each other within the given level of tolerance, false otherwise.

7.3. `igraph_complex_almost_equals` — Compare two complex numbers with a tolerance.

igraph_bool_t igraph_complex_almost_equals(igraph_complex_t a,
                                           igraph_complex_t b,
                                           igraph_real_t eps);

Determines whether two complex numbers are "almost equal" to each other with a given level of tolerance on the relative error.

Arguments:

`a`:	The first complex number.
`b`:	The second complex number.
`eps`:	The level of tolerance on the relative error. The relative error is defined as `abs(a-b) / (abs(a) + abs(b))`. The two numbers are considered equal if this is less than `eps`.

Returns:

True if the two complex numbers are nearly equal to each other within the given level of tolerance, false otherwise.

igraph Reference Manual

Chapter 32. Non-graph related functions

1. igraph version number

1.1. igraph_version — The version of the igraph C library.

2. Running mean of a time series

2.1. igraph_running_mean — Calculates the running mean of a vector.

3. Random sampling from very long sequences

3.1. igraph_random_sample — Generates an increasing random sequence of integers.

4. Random sampling of spatial points

4.1. igraph_sample_sphere_surface — Sample points uniformly from the surface of a sphere.

4.2. igraph_sample_sphere_volume — Sample points uniformly from the volume of a sphere.

4.3. igraph_sample_dirichlet — Sample points from a Dirichlet distribution.

5. Convex hull of a set of points on a plane

5.1. igraph_convex_hull — Determines the convex hull of a given set of points in the 2D plane.

6. Fitting power-law distributions to empirical data

6.1. igraph_plfit_result_t — Result of fitting a power-law distribution to a vector.

6.2. igraph_power_law_fit — Fits a power-law distribution to a vector of numbers.

6.3. igraph_plfit_result_calculate_p_value — Calculates the p-value of a fitted power-law model.

7. Comparing floats with a tolerance

7.1. igraph_cmp_epsilon — Compare two double-precision floats with a tolerance.

7.2. igraph_almost_equals — Compare two double-precision floats with a tolerance.

7.3. igraph_complex_almost_equals — Compare two complex numbers with a tolerance.

1.1. `igraph_version` — The version of the igraph C library.

2.1. `igraph_running_mean` — Calculates the running mean of a vector.

3.1. `igraph_random_sample` — Generates an increasing random sequence of integers.

4.1. `igraph_sample_sphere_surface` — Sample points uniformly from the surface of a sphere.

4.2. `igraph_sample_sphere_volume` — Sample points uniformly from the volume of a sphere.

4.3. `igraph_sample_dirichlet` — Sample points from a Dirichlet distribution.

5.1. `igraph_convex_hull` — Determines the convex hull of a given set of points in the 2D plane.

6.1. `igraph_plfit_result_t` — Result of fitting a power-law distribution to a vector.

6.2. `igraph_power_law_fit` — Fits a power-law distribution to a vector of numbers.

6.3. `igraph_plfit_result_calculate_p_value` — Calculates the p-value of a fitted power-law model.

7.1. `igraph_cmp_epsilon` — Compare two double-precision floats with a tolerance.

7.2. `igraph_almost_equals` — Compare two double-precision floats with a tolerance.

7.3. `igraph_complex_almost_equals` — Compare two complex numbers with a tolerance.