A theory of web traffic

M. V. Simkin and V. P. Roychowdhury

received 16 November 2007; accepted in final form 25 February 2008; published April 2008
We analyze access statistics of several popular webpages over a period of several years. The graphs of daily downloads are highly non-homogeneous, with long periods of low activity interrupted by bursts of heavy traffic. These bursts are due to avalanches of blog entries that refer to the page. We explain this behavior quantitatively using the theory of branching processes. We extrapolate these findings to construct a model of the entire web. According to the model, the competition between webpages for viewers pushes the web into a self-organized critical state. In this regime, the most interesting webpages are in a near-critical state, with a power-law distribution of traffic intensity.

89.20.Hh - World Wide Web, Internet.
05.65.+b - Self-organized systems.
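
The avalanche picture in the abstract can be illustrated with a minimal branching-process sketch. This is an illustration under stated assumptions, not the authors' model: each blog entry is assumed to trigger a Poisson-distributed number of follow-up entries, and the names `poisson`, `avalanche_size`, `simulate`, and `mean_offspring` are hypothetical. When the mean number of follow-ups is well below 1, avalanches stay small; as it approaches 1 (the near-critical regime), rare but very large avalanches appear.

```python
import math
import random


def poisson(mean, rng):
    """Sample a Poisson variate with Knuth's multiplication method.

    Adequate for the small means used here (mean <= 1).
    """
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def avalanche_size(mean_offspring, rng, cap=100_000):
    """Total number of entries in one avalanche started by a single entry.

    ASSUMPTION: offspring counts are Poisson with the given mean; the
    source does not specify the offspring distribution here.
    The cap only guards against runaway loops near criticality.
    """
    active, total = 1, 1
    while active > 0 and total < cap:
        # Each entry in the current generation spawns its own follow-ups.
        children = sum(poisson(mean_offspring, rng) for _ in range(active))
        total += children
        active = children
    return total


def simulate(mean_offspring, runs=2000, seed=0):
    """Return a list of avalanche sizes from independent runs."""
    rng = random.Random(seed)
    return [avalanche_size(mean_offspring, rng) for _ in range(runs)]
```

For a subcritical process with mean offspring m < 1, the expected avalanche size is 1/(1 - m), so comparing `simulate(0.5)` with `simulate(0.95)` shows how the size distribution broadens as m approaches 1.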

