Linear Observer Learning by Temporal Difference

Menchetti, Stefano; Zanon, Mario; Bemporad, Alberto
2022-01-01

Abstract

This paper proposes a method for learning optimal state estimators from input/output data for linear discrete-time stochastic systems. We show that this problem can be expressed in the reinforcement learning framework, suitably adapted to the peculiar problem structure. In particular, we introduce the specific Bellman equation for the state estimation problem and use temporal differences to solve it. We show in simulations that the resulting data-driven method for state estimation converges to the optimal observer.
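For a concrete picture of the setting, the sketch below is a minimal, simplified stand-in and not the paper's algorithm: instead of the estimation-specific Bellman equation solved by temporal differences, it tunes the gain of a Luenberger observer by stochastic gradient descent on the squared one-step output prediction error (the innovation), treating the current estimate as fixed at each step and using only input/output data. The system matrices, noise levels, and step size are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Illustrative 2-state plant (assumed, not from the paper):
#   x_{k+1} = A x_k + B u_k + w_k,   y_k = C x_k + v_k
A = np.array([[0.9, 0.2],
              [0.0, 0.8]])
B = np.array([[0.0],
              [1.0]])
C = np.array([[1.0, 0.0]])
w_std, v_std = 0.05, 0.1      # process / measurement noise levels (assumed)

x = np.zeros((2, 1))          # true state, hidden from the learner
xh = np.zeros((2, 1))         # observer state estimate
L = np.zeros((2, 1))          # observer gain to be learned
alpha = 0.02                  # step size (assumed)

for k in range(50_000):
    u = rng.standard_normal((1, 1))                    # exciting input
    y = C @ x + v_std * rng.standard_normal((1, 1))    # measured output
    e = y - C @ xh                                     # innovation

    # Luenberger observer: xh_{k+1} = A xh_k + B u_k + L e_k
    xh_next = A @ xh + B @ u + L @ e

    # Advance the true (hidden) plant and measure the next output.
    x = A @ x + B @ u + w_std * rng.standard_normal((2, 1))
    y_next = C @ x + v_std * rng.standard_normal((1, 1))
    e_next = y_next - C @ xh_next

    # SGD step on J = ||e_{k+1}||^2, with dJ/dL = -2 C^T e_{k+1} e_k^T
    # when xh_k is treated as fixed (a biased, heuristic gradient).
    L += alpha * (C.T @ e_next) @ e.T

    xh = xh_next

print("learned observer gain L:\n", L)
```

Because this partial gradient ignores how L shaped past estimates, the sketch is only a heuristic illustration; the paper's contribution is precisely a TD-based method with convergence to the optimal observer.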
Year: 2022
ISBN: 978-1-6654-6761-2
Keywords: Linear systems, Adaptation models, Stochastic systems, Reinforcement learning, Observers, Mathematical models, Nonlinear systems
Files in this product:

FinalVersion.pdf
  Access: open access
  Type: Post-print document
  License: Creative Commons
  Size: 327.04 kB
  Format: Adobe PDF

Linear_Observer_Learning_by_Temporal_Difference.pdf
  Access: not available (available on request)
  Type: Published version (PDF)
  License: Publisher's copyright
  Size: 1.06 MB
  Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11771/23658
Citations
  • Scopus 0