Fast Calculation of Pairwise Mutual Information Based on Kernel Estimation

We present a new software implementation for more efficiently computing the mutual information for all pairs of genes from gene expression microarrays. Computation of the mutual information is a necessary first step in various information theoretic approaches for reconstructing gene regulatory networks from microarray data. When the mutual information is estimated by kernel methods, computing the pairwise mutual information is quite time-consuming. Our implementation significantly reduces the computation time. The essential idea is simple. In practice, when we encounter computational structures such as nested for-loops, double summations and double integrals, switching the order of the procedures may sometimes result in significant gain. Our fast implementation utilizes this idea to reduce the repeated operations. For an example data set of 9563 genes and 336 samples, the current available software for ARACNE requires 142 hours to compute the mutual information between all possible pairs of genes, whereas our implementation requires 1.6 hours.

Matlab code for our fast implementation: FastPairMI.m

The Matlab code uses matrix operations, which may not be available in other programming languages/environments. If you want to translate it into other languages, it will be easier to follow this file: FastPairMI_pseudo_code_version.m, which exactly matches the pseudo code provided in the manuscript.

For the purpose of comparison, the implementation of ARACNE can be found at http://amdec-bioinfo.cu-genome.org/html/ARACNE.htm

This work is published in Peng Qiu, Andrew J. Gentles, and Sylvia K. Plevritis, "Fast Calculation of Pairwise Mutual Information for Gene Regulatory Network Reconstruction", Computer Methods and Programs in Biomedicine, 94(2):177-180, 2009.

If you have any questions or find any problems in it, please email me at peng.qiu@bme.gateche.edu.

Last update: 08/05/2013