Bregman Finito/MISO for nonconvex regularized finite sum minimization without Lipschitz gradient continuity

Latafat P.
2022-01-01

Abstract

We introduce two algorithms for nonconvex regularized finite sum minimization, where typical Lipschitz differentiability assumptions are relaxed to the notion of relative smoothness. The first one is a Bregman extension of Finito/MISO [A. Defazio and J. Domke, Proc. Mach. Learn. Res. (PMLR), 32 (2014), pp. 1125-1133; J. Mairal, SIAM J. Optim., 25 (2015), pp. 829-855], studied for fully nonconvex problems when the sampling is randomized, or under convexity of the nonsmooth term when it is essentially cyclic. The second algorithm is a low-memory variant, in the spirit of SVRG [R. Johnson and T. Zhang, Advances in Neural Information Processing Systems 26, Curran Associates, Red Hook, NY, 2013, pp. 315-323] and SARAH [L. M. Nguyen et al., Proc. Mach. Learn. Res. (PMLR), 70 (2017), pp. 2613-2621], that also allows for fully nonconvex formulations. Our analysis is made remarkably simple by employing a Bregman-Moreau envelope as the Lyapunov function. In the randomized case, linear convergence is established when the cost function is strongly convex, yet with no convexity requirements on the individual functions in the sum. For the essentially cyclic and low-memory variants, global and linear convergence results are established when the cost function satisfies the Kurdyka-Łojasiewicz property.
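For context, the following is a minimal sketch of the standard notions the abstract builds on, stated for a Legendre kernel h and the regularized finite sum template min_x φ(x) = (1/N) Σ_i f_i(x) + g(x); the notation here is assumed for illustration and may differ from the paper.

% Standard definitions (notation assumed, not taken from the paper itself)
\[
  D_h(x,y) \;=\; h(x) - h(y) - \langle \nabla h(y),\, x - y \rangle
  \qquad \text{(Bregman distance induced by the Legendre kernel } h\text{)}
\]
\[
  f_i \ \text{is } L_i\text{-smooth relative to } h
  \;\iff\; L_i h - f_i \ \text{is convex}
  \;\iff\; f_i(x) \le f_i(y) + \langle \nabla f_i(y),\, x - y \rangle + L_i D_h(x,y),
\]
\[
  g^{h,\gamma}(x) \;=\; \inf_z \Big\{ g(z) + \tfrac{1}{\gamma}\, D_h(z,x) \Big\}
  \qquad \text{(a Bregman-Moreau envelope; the analysis uses an envelope of this type as the Lyapunov function).}
\]

In words, relative smoothness replaces the usual quadratic upper bound of Lipschitz-gradient smoothness with one measured in the Bregman distance D_h, and the Bregman-Moreau envelope plays the role the classical Moreau envelope plays in Euclidean proximal analysis.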
Bregman-Moreau envelope
incremental aggregated algorithms
KL inequality
nonsmooth nonconvex optimization
Files in this record:
File: _2.pdf (not available; copy available on request)
Type: Publisher's version (PDF)
License: Publisher's copyright
Size: 814.09 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11771/32178
Citations
  • Scopus: 5