Hierarchical clustering using mutual information

A. Kraskov(1,2), H. Stögbauer(1), R. G. Andrzejak(1) and P. Grassberger(1)
(1) John-von-Neumann Institute for Computing, Forschungszentrum Jülich, D-52425 Jülich, Germany
(2) Division of Biology, MC 139-74, California Institute of Technology, Pasadena, CA 91125, USA
received 8 June 2004; accepted in final form 1 March 2005
published online 25 March 2005
We present a conceptually simple method for hierarchical clustering of data, called the mutual information clustering (MIC) algorithm. It uses mutual information (MI) as a similarity measure and exploits its grouping property: the MI between three objects X, Y, and Z is equal to the sum of the MI between X and Y and the MI between Z and the combined object (XY). We use this property both in the Shannon (probabilistic) and in the Kolmogorov (algorithmic) version of information theory. We apply our method to the construction of phylogenetic trees from mitochondrial DNA sequences and to the output of independent component analysis (ICA), as illustrated with the ECG of a pregnant woman.
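The grouping property stated above can be checked numerically for the Shannon case. The following sketch (not the authors' code; all names are illustrative) draws a random joint distribution p(x, y, z) over three binary variables and verifies that the multi-information I(X, Y, Z) = H(X) + H(Y) + H(Z) - H(X, Y, Z) decomposes exactly as I(X, Y) + I((X, Y), Z).

```python
# Hypothetical sketch: verify the grouping property of Shannon MI,
#   I(X, Y, Z) = I(X, Y) + I((X, Y), Z),
# on a random discrete joint distribution over binary X, Y, Z.
import itertools
import math
import random

random.seed(0)

# Random (normalized) joint distribution p(x, y, z).
p = {s: random.random() for s in itertools.product(range(2), repeat=3)}
total = sum(p.values())
p = {s: v / total for s, v in p.items()}

def entropy(marginal):
    """Shannon entropy (in nats) of a distribution given as {outcome: prob}."""
    return -sum(q * math.log(q) for q in marginal.values() if q > 0)

def marginalize(keep):
    """Marginal distribution over the variable indices in `keep`."""
    m = {}
    for s, v in p.items():
        key = tuple(s[i] for i in keep)
        m[key] = m.get(key, 0.0) + v
    return m

H = lambda *keep: entropy(marginalize(keep))

i_xy = H(0) + H(1) - H(0, 1)                 # I(X, Y)
i_xy_z = H(0, 1) + H(2) - H(0, 1, 2)         # I((X, Y), Z)
i_xyz = H(0) + H(1) + H(2) - H(0, 1, 2)      # multi-information I(X, Y, Z)

# The H(X, Y) terms cancel algebraically, so the identity holds exactly.
assert abs(i_xyz - (i_xy + i_xy_z)) < 1e-9
```

This identity is what lets MIC cluster agglomeratively: after merging the closest pair (X, Y) into a single object, the MI already "spent" inside the cluster and the MI between the cluster and the rest add up to the total, so no information is double-counted.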
89.70.+c - Information theory and communication theory.
89.75.Hc - Networks and genealogical trees.
87.19.Hh - Cardiac dynamics.
© EDP Sciences 2005