An Empirical Verification of Wide Networks Theory

Balboni, Dario;Bacciu, Davide
2022

Abstract

In recent years, many theories explaining the behavior of wide neural networks have been proposed, focusing on the relation between wide networks and Neural Tangent Kernels and on devising a novel optimization theory for overparameterized models. Despite these efforts, however, real-world models are still not well understood. To help close this gap, we empirically measure crucial quantities for neural networks in the more realistic setting of mildly overparameterized models, in three main areas: conditioning of the optimization process, training speed, and generalization of the obtained models. We analyze the results and highlight discrepancies between existing theories and realistic models, to guide future work on theoretical refinements. Our contribution is exploratory in nature and aims to encourage the development of mixed theoretical-practical approaches, in which experiments are quantitative and aimed at measuring the fundamental quantities of existing theories.
Sector INFO-01/A - Computer Science
The 33rd British Machine Vision Conference (BMVC 2022)
London, UK
21st - 24th November 2022
British Machine Vision Archives
British Machine Vision Association, BMVA
Files in this record:
0517.pdf (open access)
Type: Accepted version (post-print)
License: Read only
Size: 702.26 kB
Format: Adobe PDF


Use this identifier to cite or link to this document: https://hdl.handle.net/11384/148809