An Inexact Sequential Quadratic Programming Method for Learning and Control of Recurrent Neural Networks

A. D. Adeoye; A. Bemporad
2024-01-01

Abstract

This article considers the two-stage approach to solving a partially observable Markov decision process (POMDP): the identification stage and the (optimal) control stage. We present an inexact sequential quadratic programming framework for recurrent neural network learning (iSQPRL) for solving the identification stage of the POMDP, in which the true system is approximated by a recurrent neural network (RNN) with dynamically consistent overshooting (DCRNN). We formulate the learning problem as a constrained optimization problem and study the quadratic programming (QP) subproblem with a convergence analysis under a restarted Krylov-subspace iterative scheme that implicitly exploits the structure of the associated Karush–Kuhn–Tucker (KKT) subsystem. In the control stage, where a feedforward neural network (FNN) controller is designed on top of the RNN model, we adapt a generalized Gauss–Newton (GGN) algorithm that exploits useful approximations to the curvature terms of the training data and selects its mini-batch step size using a known property of some regularization function. Simulation results are provided to demonstrate the effectiveness of our approach.
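As a rough, self-contained illustration of the computational kernel the abstract describes, the Python sketch below performs one inexact SQP step: it forms the KKT system of an equality-constrained QP subproblem and solves it approximately with restarted GMRES, a restarted Krylov-subspace method. All problem data here (H, A, g, c) are hypothetical placeholders, and SciPy's general-purpose GMRES stands in for the structure-exploiting solver analyzed in the paper; this is a generic sketch, not the authors' iSQPRL implementation.

    # Illustrative only: one inexact SQP step via restarted GMRES applied to
    # the KKT system of an equality-constrained QP subproblem,
    #     [ H  A^T ] [ dx   ]   [ -g ]
    #     [ A   0  ] [ dlam ] = [ -c ],
    # where H approximates the Hessian of the Lagrangian, g is the objective
    # gradient, A is the constraint Jacobian, and c is the constraint residual.
    import numpy as np
    from scipy.sparse.linalg import LinearOperator, gmres

    def inexact_sqp_step(H, A, g, c, restart=30):
        """Approximately solve the KKT system with restarted GMRES (a Krylov method)."""
        n, m = H.shape[0], A.shape[0]

        def kkt_matvec(v):
            # Matrix-free product with the KKT (saddle-point) matrix.
            x, lam = v[:n], v[n:]
            return np.concatenate([H @ x + A.T @ lam, A @ x])

        K = LinearOperator((n + m, n + m), matvec=kkt_matvec)
        rhs = np.concatenate([-g, -c])
        sol, info = gmres(K, rhs, restart=restart)  # info == 0 on convergence
        return sol[:n], sol[n:], info               # primal step, multiplier step

    # Hypothetical data standing in for one QP subproblem of RNN training.
    rng = np.random.default_rng(0)
    n, m = 50, 10
    M = rng.standard_normal((n, n))
    H = M @ M.T + n * np.eye(n)        # SPD curvature approximation
    A = rng.standard_normal((m, n))    # full-rank constraint Jacobian
    g = rng.standard_normal(n)
    c = rng.standard_normal(m)
    dx, dlam, info = inexact_sqp_step(H, A, g, c)
    print("GMRES status:", info, "step norm:", float(np.linalg.norm(dx)))

In the paper, the inner Krylov solve is truncated, which is where the "inexact" in iSQPRL comes from, and the control-stage GGN updates play an analogous role of cheap curvature approximation; see the full text for the convergence analysis.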
Keywords: Gauss–Newton methods, Markov decision processes, numerical optimization, recurrent neural networks (RNNs), reinforcement learning (RL), sequential quadratic programming (SQP)
Files in this record:

File: An_Inexact_Sequential_Quadratic_Programming_Method_for_Learning_and_Control_of_Recurrent_Neural_Networks.pdf
Access: open access
Type: Published version (PDF)
License: Creative Commons
Size: 8.48 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11771/28078
Citations
  • PubMed Central (PMC): not available
  • Scopus: 0