Maximal Relevance and Optimal Learning Machines
Sep 27, 2019 · 30 pages
Published in:
- J.Stat.Mech. 2103 (2021) 033409
- Published: Mar 19, 2021
e-Print:
- 1909.12792 [physics.data-an]
Abstract: (IOP)
We explore the hypothesis that learning machines extract representations of maximal relevance, where the relevance is defined as the entropy of the energy distribution of the internal representation. We show that the mutual information between the internal representation of a learning machine and the features that it extracts from the data is bounded from below by the relevance. This motivates our study of models with maximal relevance, which we call optimal learning machines, as candidates for maximally informative representations. We analyse how, in practical cases, the maximisation of the relevance is constrained both by the architecture of the model and by the available data. We find that sub-extensive features that do not affect the thermodynamics of the model may significantly affect learning performance, and that criticality enhances learning performance, although the existence of a critical point is not a necessary condition. On specific learning tasks, we find that (i) the maximal values of the likelihood are achieved by models with maximal relevance, (ii) internal representations approach the maximal relevance that can be achieved in a finite dataset, and (iii) learning is associated with a broadening of the spectrum of energy levels of the internal representation, in agreement with the maximum relevance hypothesis.
Note:
- 28 pages, 9 figures
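The central quantity of the abstract, the relevance, is the entropy of the energy distribution of a representation. As a rough illustrative sketch (not the authors' code), the empirical relevance of a sample can be estimated by assigning every observed state the energy $E_s = -\log(k_s/n)$ (minus log of its empirical frequency), so that states with the same frequency share one energy level, and then computing the entropy over energy levels:

```python
import math
from collections import Counter

def relevance(samples):
    """Estimate the empirical relevance H[E]: the entropy of the
    distribution of energy levels, where states occurring with the
    same frequency k share the energy level E = -log(k/n).
    Hypothetical helper for illustration only."""
    n = len(samples)
    counts = Counter(samples)            # k_s for each observed state
    level_mass = Counter()
    for k in counts.values():
        level_mass[k] += k / n           # total probability mass at this level
    return -sum(p * math.log(p) for p in level_mass.values())

# A sample where every state occurs equally often has a single
# energy level, hence zero relevance:
print(relevance(['a', 'a', 'b', 'b']))   # 0.0
# Two distinct frequencies give two energy levels with masses 3/4 and 1/4:
print(relevance(['a', 'a', 'a', 'b']))   # ~0.5623
```

Under this sketch, maximising the relevance means spreading the sampled states over as broad a spectrum of frequencies (energy levels) as possible, which is the "broadening of the spectrum of energy levels" the abstract associates with learning.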