TY - GEN
T1 - Integrated and Adaptive Guidance and Control for Endoatmospheric Missiles via Reinforcement Meta-Learning
AU - Gaudet, Brian
AU - Furfaro, Roberto
N1 - Publisher Copyright: © 2023, American Institute of Aeronautics and Astronautics Inc, AIAA. All rights reserved.
PY - 2023
Y1 - 2023
N2 - We apply a reinforcement meta-learning framework to optimize an integrated and adaptive guidance and flight control system for an air-to-air missile. The system is implemented as a policy that maps navigation system outputs directly to commanded rates of change for the missile’s control surface deflections. The system induces intercept trajectories against a maneuvering target that satisfy control constraints on fin deflection angles and path constraints on look angle and load. We test the optimized system in a six-degrees-of-freedom simulator that includes a non-linear radome model and a strapdown seeker model, and demonstrate that the system adapts to both a large flight envelope and off-nominal flight conditions, including perturbation of aerodynamic coefficient parameters and flexible body dynamics. Moreover, we find that the system is robust to the parasitic attitude loop induced by radome refraction and imperfect seeker stabilization. We compare our system’s performance to a longitudinal model of proportional navigation coupled with a three-loop autopilot, and find that our system outperforms this benchmark by a large margin. Additional experiments investigate the impact of removing the recurrent layer from the policy and value function networks, and performance with an infrared seeker.
UR - http://www.scopus.com/inward/record.url?scp=85200382641&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85200382641&partnerID=8YFLogxK
U2 - 10.2514/6.2023-2638
DO - 10.2514/6.2023-2638
M3 - Conference contribution
AN - SCOPUS:85200382641
SN - 9781624106996
T3 - AIAA SciTech Forum and Exposition, 2023
BT - AIAA SciTech Forum and Exposition, 2023
PB - American Institute of Aeronautics and Astronautics Inc, AIAA
T2 - AIAA SciTech Forum and Exposition, 2023
Y2 - 23 January 2023 through 27 January 2023
ER -