On-line learning from finite training sets
1 Department of Physics, University of Edinburgh - Edinburgh EH9 3JZ, UK
2 Neural Computing Research Group, Aston University - Birmingham B4 7ET, UK
Accepted: 3 April 1997
We analyse on-line (gradient descent) learning of a rule from a finite set of training examples at non-infinitesimal learning rates η, calculating exactly the time-dependent generalization error for a simple model scenario. In the thermodynamic limit, we close the dynamical equation for the generating function of an infinite hierarchy of order parameters using “within-sample self-averaging”. The resulting dynamics is non-perturbative in η, with a slow mode appearing only above a finite threshold. Optimal settings of η for a given final learning time are determined, and the results are compared with off-line gradient descent.
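As an illustration of the learning scenario described in the abstract, the following minimal sketch (with hypothetical parameter values, and a linear student–teacher setup chosen for simplicity; the paper's own model may differ) runs on-line gradient descent on a fixed, finite training set at a non-infinitesimal learning rate η and tracks the generalization error:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100        # input dimension (the exact analysis takes N -> infinity)
alpha = 2.0    # training-set size per weight: p = alpha * N (illustrative value)
p = int(alpha * N)
eta = 0.5      # non-infinitesimal learning rate (illustrative value)

# Teacher rule and a FIXED finite training set; inputs scaled so |x|^2 ~ 1.
w_teacher = rng.standard_normal(N)
X = rng.standard_normal((p, N)) / np.sqrt(N)
y = X @ w_teacher

def gen_error(w):
    # Generalization error for Gaussian inputs with component variance 1/N:
    # eps_g = <(x.w - x.w_teacher)^2>/2 = |w - w_teacher|^2 / (2N)
    return np.sum((w - w_teacher) ** 2) / (2 * N)

w = np.zeros(N)                 # student weights
eps_start = gen_error(w)
for t in range(20 * N):         # 20 on-line updates per weight
    i = rng.integers(p)         # redraw one example from the stored set
    w -= eta * (X[i] @ w - y[i]) * X[i]   # gradient step on squared error
eps_end = gen_error(w)
print(f"eps_g: {eps_start:.3f} -> {eps_end:.3f}")
```

Because each update reuses an example from the finite stored set rather than a fresh one, the weight dynamics correlate with the training data, which is what makes the analysis of this setting harder than standard on-line learning with unlimited examples.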
PACS: 87.10.+e – General, theoretical, and mathematical biophysics (including logic of biosystems, quantum biology, and relevant aspects of thermodynamics, information theory, cybernetics, and bionics) / 02.50.-r – Probability theory, stochastic processes, and statistics / 05.90.+m – Other topics in statistical physics and thermodynamics
© EDP Sciences, 1997