Finite-size effects in on-line learning of multilayer neural networks
1 Department of Computer Science and Applied Mathematics, Aston University, Birmingham B4 7ET, UK
2 Department of Physics, University of Edinburgh, Kings Buildings, Edinburgh EH9 3JZ, UK
Accepted: 26 February 1996
We complement recent advances in thermodynamic limit analyses of the mean on-line gradient descent learning dynamics in multilayer networks by calculating the fluctuations exhibited by finite-dimensional systems. Fluctuations from the mean dynamics are largest at the onset of specialisation, as student hidden unit weight vectors begin to imitate specific teacher vectors, and they increase with the degree of symmetry of the initial conditions. Motivated by this, we include a term that stimulates asymmetry in the learning process, which typically also leads to a significant decrease in training time.
PACS: 87.10.+e – General, theoretical, and mathematical biophysics (including logic of biosystems, quantum biology, and relevant aspects of thermodynamics, information theory, cybernetics, and bionics) / 02.50.-r – Probability theory, stochastic processes, and statistics
© EDP Sciences, 1996