Bregman Finito/MISO for nonconvex regularized finite sum minimization without Lipschitz gradient continuity
Latafat P.
2022-01-01
Abstract
We introduce two algorithms for nonconvex regularized finite sum minimization, where the typical Lipschitz differentiability assumptions are relaxed to the notion of relative smoothness. The first is a Bregman extension of Finito/MISO [A. Defazio and J. Domke, Proc. Mach. Learn. Res. (PMLR), 32 (2014), pp. 1125-1133; J. Mairal, SIAM J. Optim., 25 (2015), pp. 829-855], studied for fully nonconvex problems when the sampling is randomized, or under convexity of the nonsmooth term when the sampling is essentially cyclic. The second algorithm is a low-memory variant, in the spirit of SVRG [R. Johnson and T. Zhang, Advances in Neural Information Processing Systems 26, Curran Associates, Red Hook, NY, 2013, pp. 315-323] and SARAH [L. M. Nguyen et al., Proc. Mach. Learn. Res. (PMLR), 70 (2017), pp. 2613-2621], which also allows for fully nonconvex formulations. Our analysis is made remarkably simple by employing a Bregman-Moreau envelope as the Lyapunov function. In the randomized case, linear convergence is established when the cost function is strongly convex, yet with no convexity requirements on the individual functions in the sum. For the essentially cyclic and low-memory variants, global and linear convergence results are established when the cost function satisfies the Kurdyka-Łojasiewicz property.
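For context, a function f_i is L_i-smooth relative to a Legendre kernel h when L_i h - f_i is convex; choosing h(x) = 0.5||x||^2 recovers the usual Lipschitz gradient condition. The Python sketch below is a minimal, hypothetical illustration of a randomized Finito/MISO-style incremental update with a generic mirror map, under strong simplifying assumptions: all terms are smooth, there is no nonsmooth regularizer, and a single common stepsize gamma is used. It conveys the flavor of the table-based update, not the paper's full Bregman Finito/MISO algorithm (which also handles a nonsmooth term).

```python
import numpy as np

def bregman_finito_sketch(grads, x0, gamma, grad_h, grad_h_inv,
                          n_iters=1000, seed=0):
    """Hypothetical sketch of a randomized Finito/MISO-style update
    with a generic mirror map (assumption: smooth terms only, no
    nonsmooth regularizer, one common stepsize gamma).

    grads      -- list of callables, grads[i](x) returns grad f_i(x)
    grad_h     -- gradient of the kernel h (identity for h = 0.5*||x||^2)
    grad_h_inv -- inverse map of grad_h
    """
    rng = np.random.default_rng(seed)
    N = len(grads)
    # Table of "mirror points": s_i = grad_h(z_i) - gamma * grad f_i(z_i),
    # initialized with z_i = x0 for every i.
    s = [grad_h(x0) - gamma * gi(x0) for gi in grads]
    for _ in range(n_iters):
        # Average the mirror points and map back through (grad h)^{-1};
        # with the Euclidean kernel this reduces to the classic Finito step.
        x = grad_h_inv(np.mean(s, axis=0))
        j = rng.integers(N)                     # randomized sampling
        s[j] = grad_h(x) - gamma * grads[j](x)  # refresh one table entry
    return grad_h_inv(np.mean(s, axis=0))

# Toy usage with the Euclidean kernel h(x) = 0.5*||x||^2:
# f_i(x) = 0.5*||x - c_i||^2, whose average is minimized at the mean
# of the centers c_i.
centers = [np.array([1.0, 0.0]), np.array([0.0, 2.0]), np.array([3.0, 1.0])]
grads = [lambda x, c=c: x - c for c in centers]
identity = lambda x: x
x_star = bregman_finito_sketch(grads, np.zeros(2), gamma=0.5,
                               grad_h=identity, grad_h_inv=identity)
print(x_star)  # approaches mean(centers) = [4/3, 1.0]
```

The mirror-point table makes the Bregman structure explicit: only one table entry is refreshed per iteration, matching the incremental character of Finito/MISO, and swapping the kernel h changes the geometry without altering the update's form.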
File | Type | License | Size | Format
---|---|---|---|---
_2.pdf (not available; copy on request) | Publisher's version (PDF) | Publisher's copyright | 814.09 kB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.