ADLER: An Efficient Hessian-based Strategy for Adaptive Learning Rate
Dario Balboni, Davide Bacciu
2024
Abstract
We derive a sound positive semi-definite approximation of the Hessian of deep models for which Hessian-vector products are easily computable. This enables us to provide an adaptive SGD learning rate strategy based on minimizing the local quadratic approximation, which requires only twice the computation of a single SGD run, yet performs comparably to a grid search over SGD learning rates on different model architectures (CNNs with and without residual connections) on classification tasks. We also compare the novel approximation with the Gauss-Newton approximation.
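The core idea of picking a step size by minimizing a local quadratic model can be illustrated with a small sketch. Given a gradient g and a Hessian-vector product with a positive semi-definite curvature matrix H, the quadratic model along the negative gradient, f(w) - eta * g@g + 0.5 * eta^2 * g@(H g), is minimized at eta = (g@g) / (g@(H g)). The snippet below is a minimal NumPy illustration of this step-size rule on a toy quadratic; the function name `quadratic_step_size` and the exact-Hessian toy problem are hypothetical, and this is not the paper's ADLER algorithm, which uses its own PSD Hessian approximation for deep models.

```python
import numpy as np

def quadratic_step_size(grad, hvp):
    """Step size minimizing the local quadratic model along -grad,
    eta = (g@g) / (g@(H g)), assuming g@(H g) > 0 (PSD curvature)."""
    gg = grad @ grad
    gHg = grad @ hvp(grad)
    return gg / gHg

# Toy example: quadratic loss f(w) = 0.5 * w @ A @ w, whose exact
# Hessian is A, so the Hessian-vector product is just v -> A @ v.
A = np.diag([1.0, 4.0])
w = np.array([2.0, 1.0])
grad = A @ w                                   # gradient of the toy loss
eta = quadratic_step_size(grad, lambda v: A @ v)
w_new = w - eta * grad                         # one adaptively-sized step
```

On this toy problem the single adaptively-sized step strictly decreases the loss; in the paper's setting the same rule is applied per SGD step using the proposed PSD Hessian approximation, so each update costs roughly one extra Hessian-vector product on top of the gradient.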