Learning Curve Theory

Determine to what extent power laws are universal or depend on the data distribution or loss function.

Description

Recently, a number of empirical “universal” scaling-law papers have been published, most notably by OpenAI. However, theoretical understanding of this phenomenon is largely lacking. This paper develops and theoretically analyses the simplest possible (toy) model that can exhibit n^-β learning curves for arbitrary power β > 0, and determines to what extent power laws are universal or depend on the data distribution or loss function.
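To make the idea of an n^-β learning curve concrete, here is a minimal sketch of one possible toy setting. It is an assumption chosen for illustration, not necessarily the exact model analysed in the paper: a learner that simply memorises every discrete feature it has seen, with features drawn from a truncated Zipf distribution θ_i ∝ i^-(α+1). The learner errs exactly when the test feature is new, so the expected error after n samples is the sum over i of θ_i (1 - θ_i)^n. The names `zipf_probs`, `expected_error` and `loglog_slope`, and the values of `alpha` and the truncation size, are hypothetical.

```python
import math


def zipf_probs(alpha, num_features):
    """Truncated Zipf distribution: theta_i proportional to i^-(alpha+1)."""
    weights = [i ** -(alpha + 1.0) for i in range(1, num_features + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def expected_error(n, probs):
    """Expected error of a memorising learner after n i.i.d. samples.

    The learner is wrong exactly when the test feature never appeared
    in the training set, which has probability (1 - theta_i)^n.
    """
    return sum(p * (1.0 - p) ** n for p in probs)


def loglog_slope(xs, ys):
    """Least-squares slope of log(y) against log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    cov = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    var = sum((a - mx) ** 2 for a in lx)
    return cov / var


alpha = 1.0                       # assumed Zipf tail parameter
probs = zipf_probs(alpha, 100_000)  # assumed truncation of the feature space
sample_sizes = [100, 316, 1000, 3162, 10000]
errors = [expected_error(n, probs) for n in sample_sizes]

# Fitted exponent of the learning curve on a log-log scale.
beta_hat = -loglog_slope(sample_sizes, errors)
print("fitted beta:", round(beta_hat, 3))
print("alpha / (1 + alpha):", alpha / (1.0 + alpha))
```

In this assumed setting, a standard integral approximation of the sum gives an exponent β = α/(1+α), so the fitted slope depends directly on the tail of the data distribution. Changing `alpha` changes the fitted exponent, which is one simple way to see a learning-curve exponent depend on the data rather than being universal.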

References

Learning Curve Theory