Maximal Relevance and Optimal Learning Machines
Sep 27, 2019 · 30 pages
Published in:
- J.Stat.Mech. 2103 (2021) 033409
- Published: Mar 19, 2021
e-Print:
- 1909.12792 [physics.data-an]
Abstract: (IOP)
We explore the hypothesis that learning machines extract representations of maximal relevance, where the relevance is defined as the entropy of the energy distribution of the internal representation. We show that the mutual information between the internal representation of a learning machine and the features that it extracts from the data is bounded from below by the relevance. This motivates our study of models with maximal relevance, which we call optimal learning machines, as candidates for maximally informative representations. We analyse how, in practical cases, the maximisation of the relevance is constrained both by the architecture of the model and by the available data. We find that sub-extensive features that do not affect the thermodynamics of the model may significantly affect learning performance, and that criticality enhances learning performance, although the existence of a critical point is not a necessary condition. On specific learning tasks, we find that (i) the maximal values of the likelihood are achieved by models with maximal relevance, (ii) internal representations approach the maximal relevance that can be achieved in a finite dataset, and (iii) learning is associated with a broadening of the spectrum of energy levels of the internal representation, in agreement with the maximum relevance hypothesis.
Note:
- 28 pages, 9 figures
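The central quantity of the abstract, the relevance, is the entropy of the energy distribution of a representation. As a rough illustrative sketch (not the authors' code), the empirical relevance of a sample can be estimated by assigning every observed state the energy $E_s = -\log(k_s/n)$ (minus log of its empirical frequency), so that states with the same frequency share one energy level, and then computing the entropy over energy levels:

```python
import math
from collections import Counter

def relevance(samples):
    """Estimate the empirical relevance H[E]: the entropy of the
    distribution of energy levels, where states occurring with the
    same frequency k share the energy level E = -log(k/n).
    Hypothetical helper for illustration only."""
    n = len(samples)
    counts = Counter(samples)            # k_s for each observed state
    level_mass = Counter()
    for k in counts.values():
        level_mass[k] += k / n           # total probability mass at this level
    return -sum(p * math.log(p) for p in level_mass.values())

# A sample where every state occurs equally often has a single
# energy level, hence zero relevance:
print(relevance(['a', 'a', 'b', 'b']))   # 0.0
# Two distinct frequencies give two energy levels with masses 3/4 and 1/4:
print(relevance(['a', 'a', 'a', 'b']))   # ~0.5623
```

Under this sketch, maximising the relevance means spreading the sampled states over as broad a spectrum of frequencies (energy levels) as possible, which is the "broadening of the spectrum of energy levels" the abstract associates with learning.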