Data-driven Economic NMPC using Reinforcement Learning
Mario Zanon
2020-01-01
Abstract
Reinforcement Learning (RL) is a powerful tool to perform data-driven optimal control without relying on a model of the system. However, RL struggles to provide hard guarantees on the behavior of the resulting control scheme. In contrast, Nonlinear Model Predictive Control (NMPC) and Economic NMPC (ENMPC) are standard tools for the closed-loop optimal control of complex systems with constraints and limitations, and benefit from a rich theory to assess their closed-loop behavior. Unfortunately, the performance of (E)NMPC hinges on the quality of the model underlying the control scheme. In this paper, we show that an (E)NMPC scheme can be tuned to deliver the optimal policy of the real system even when using a wrong model. This result also holds for real systems having stochastic dynamics. This entails that ENMPC can be used as a new type of function approximator within RL. Furthermore, we investigate our results in the context of ENMPC and formally connect them to the concept of dissipativity, which is central for ENMPC stability. Finally, we detail how these results can be used to deploy classic RL tools for tuning (E)NMPC schemes. We apply these tools on both a classical linear MPC setting and a standard nonlinear example from the ENMPC literature.
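The idea sketched in the abstract — tuning a model-based controller with RL so that it performs well on the real system despite a wrong model — can be illustrated with a deliberately minimal toy. The sketch below applies semi-gradient Q-learning to a one-step-lookahead quadratic controller whose internal model is wrong; the dynamics, costs, single tunable parameter, and all numerical values are invented for illustration and are not the parameterization or example used in the paper.

```python
import numpy as np

# Illustrative sketch only: a 1-D toy system, not the example from the paper.
# The controller uses a deliberately wrong model, and a single parameter of
# its cost-to-go is tuned by semi-gradient Q-learning on real-system data.
a_real, b_real = 0.9, 0.5        # true (unknown) dynamics: x+ = a x + b u + noise
a_model, b_model = 1.0, 0.4      # wrong model used inside the controller
gamma = 0.95                     # discount factor

def ell(x, u):
    """Stage cost (the performance criterion we actually care about)."""
    return x**2 + 0.1 * u**2

def q(x, u, th):
    """One-step-lookahead Q-function with tunable terminal weight th,
    a minimal stand-in for a parameterized (E)NMPC scheme."""
    return ell(x, u) + gamma * th * (a_model * x + b_model * u) ** 2

def greedy_u(x, th):
    """Closed-form minimizer of q(x, ., th), which is quadratic in u."""
    return -(gamma * th * a_model * b_model * x) / (0.1 + gamma * th * b_model**2)

rng = np.random.default_rng(0)
theta, alpha, x = 1.0, 1e-3, 1.0
for _ in range(20000):
    u = greedy_u(x, theta) + 0.1 * rng.standard_normal()          # exploration noise
    x_next = a_real * x + b_real * u + 0.01 * rng.standard_normal()
    # Temporal-difference error with a greedy next action (Q-learning)
    td = ell(x, u) + gamma * q(x_next, greedy_u(x_next, theta), theta) - q(x, u, theta)
    grad = gamma * (a_model * x + b_model * u) ** 2               # dq/dtheta
    theta = np.clip(theta + alpha * td * grad, 1e-3, 50.0)        # projected update
    x = x_next if abs(x_next) < 5.0 else float(rng.standard_normal())

print(f"tuned terminal weight: theta = {theta:.3f}")
```

The point of the sketch is that the data-driven update acts on the controller's parameter, not on the model: the model stays wrong, but the tunable weight lets RL shift the greedy policy toward closed-loop performance on the real system. In the paper this role is played by the full parameterized (E)NMPC scheme rather than a single scalar weight.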
File | Type | License | Size | Format | Access |
---|---|---|---|---|---|
RLForNMPC_V3.pdf | Pre-print | Creative Commons | 977.18 kB | Adobe PDF | Open access |
08701462.pdf | Publisher's version (PDF) | No license | 2.05 MB | Adobe PDF | Not available (request a copy) |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.