Economic MPC of Markov Decision Processes: Dissipativity in undiscounted infinite-horizon optimal control


Economic Model Predictive Control (MPC) dissipativity theory is central to discussing the stability of policies resulting from minimizing economic stage costs. In its current form, the dissipativity theory for economic MPC applies to problems based on deterministic dynamics or to very specific classes of stochastic problems, and does not readily extend to generic Markov decision processes. In this paper, we clarify the core reason for this difficulty, and propose a generalization of the economic MPC dissipativity theory that circumvents it. This generalization focuses on undiscounted infinite-horizon problems and is based on nonlinear stage cost functionals, allowing one to discuss the Lyapunov asymptotic stability of policies for Markov decision processes in terms of the probability measures underlying their stochastic dynamics. This theory is illustrated for the stochastic linear quadratic regulator with Gaussian process noise, for which a storage functional can be provided explicitly. For the sake of brevity, we limit our discussion to undiscounted Markov decision processes.

Economic MPC of Markov Decision Processes: Dissipativity in undiscounted infinite-horizon optimal control / Gros, S., Zanon, M. - In: AUTOMATICA. - ISSN 0005-1098. - 146 (2022), p. 110602. [10.1016/j.automatica.2022.110602]

Gros S.; Zanon M.

2022

Year: 2022
Journal: AUTOMATICA
OpenAlex ID: W4296849093
Keywords: Markov Decision Processes, Dissipativity for economic MPC, Storage functions, Economic costs
Appears in types: 1.1 Journal article

Files in this item:

File	Access	Type	License	Size	Format
mdp_stability.pdf	Open access	Pre-print	Creative Commons	415.65 kB	Adobe PDF
1-s2.0-S0005109822004642-main.pdf	Not available	Publisher's version (PDF)	Publisher's copyright	755.63 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11771/23660
