Sampling bias in systems with structural heterogeneity and limited internal diffusion
J.-P. Onnela1, 2, 3, N. F. Johnson4, S. Gourley1, 2, G. Reinert5 and M. Spagat6
1 Department of Physics, University of Oxford - Oxford OX1 3PU, UK, EU
2 CABDyN Research Cluster, Saïd Business School, University of Oxford - Oxford, OX1 1HP, UK, EU
3 DBEC, Helsinki University of Technology - P.O. Box 9203, Helsinki, FIN-02015 HUT, Finland, EU
4 Physics Department, University of Miami - Coral Gables, FL 33124, USA
5 Department of Statistics, University of Oxford - Oxford OX1 3TG, UK, EU
6 Department of Economics, Royal Holloway, University of London - London, TW20 0EX, UK, EU
received 13 October 2008; accepted in final form 22 December 2008; published January 2009
published online 29 January 2009
Complex-systems research is becoming increasingly data-driven, particularly in the social and biological domains. Many of the systems from which sample data are collected feature structural heterogeneity at the mesoscopic scale (i.e. communities) and limited inter-community diffusion. Here we show that the interplay between these two features can yield a significant bias in the global characteristics inferred from the data. We present a general framework to quantify this bias, and derive an explicit corrective factor for a wide class of systems. Applying our analysis to a recent high-profile survey of conflict mortality in Iraq suggests a significant overestimate of deaths.
89.65.-s - Social and economic systems.
89.75.Fb - Structures and organization in complex systems.
89.75.-k - Complex systems.
© EPLA 2009