1-D random landscapes and non-random data seriesT. M. A. Fink1, 2, K. Willbrand3 and F. C. S. Brown4
1 Systems Biology and CNRS UMR 144, Institut Curie Paris 75005, France
2 Theory of Condensed Matter, Cavendish Laboratory - Cambridge CB3 0HE, UK
3 Laboratoire de Physique Statistique, Ecole Normale Supérieure - 75005 Paris, France
4 Departement de Mathematique, Ecole Normale Supérieure - 75005 Paris, France
received 4 May 2007; accepted in final form 11 June 2007; published August 2007
published online 17 July 2007
We study the simplest random landscape, the curve formed by joining consecutive data points with line segments, where the fi are i.i.d. random numbers and . We label each segment increasing (+) or decreasing (-) and call this string of +'s and -'s the up-down signature . We calculate the probability for a random curve and use it to bound the algorithmic information content of f. We show that f can be compressed by bits, where k is a universal currency for comparing the amount of pattern in different curves. By applying our results to microarray time series data, we blindly identify regulatory genes.
89.70.+c - Information theory and communication theory.
87.14.Gg - DNA, RNA.
89.75.-k - Complex systems.
© Europhysics Letters Association 2007