TY - JOUR
T1 - Terminal Adaptive Guidance for Autonomous Hypersonic Strike Weapons via Reinforcement Metalearning
AU - Gaudet, Brian
AU - Furfaro, Roberto
N1 - Publisher Copyright:
© 2022 by the American Institute of Aeronautics and Astronautics, Inc. All rights reserved.
PY - 2023/1
Y1 - 2023/1
AB - An adaptive guidance system suitable for the terminal phase trajectory of a hypersonic strike weapon is optimized using reinforcement metalearning. The guidance system maps observations directly to commanded bank angle, angle of attack, and sideslip angle rates. Importantly, the observations are directly measurable from radar seeker outputs with minimal processing. The optimization framework implements a shaping reward that minimizes the line-of-sight rotation rate, with a terminal reward given if the agent satisfies path constraints and meets terminal accuracy and speed criteria. This paper shows that the guidance system can adapt to off-nominal flight conditions, including perturbation of aerodynamic coefficient parameters, actuator failure scenarios, sensor scale factor errors, and actuator lag, while satisfying heating rate, dynamic pressure, and load path constraints, as well as a minimum impact speed constraint. This paper demonstrates precision strike capability against a maneuvering ground target and the ability to divert to a new target, the latter being important to maximize strike effectiveness for a group of hypersonic strike weapons. Moreover, this paper demonstrates a threat evasion strategy against interceptors with limited midcourse correction capability, where the hypersonic strike weapon implements multiple diverts to alternate targets, with the last divert to the actual target.
UR - http://www.scopus.com/inward/record.url?scp=85147044294&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85147044294&partnerID=8YFLogxK
U2 - 10.2514/1.A35396
DO - 10.2514/1.A35396
M3 - Article
AN - SCOPUS:85147044294
SN - 0022-4650
VL - 60
SP - 286
EP - 298
JO - Journal of Spacecraft and Rockets
JF - Journal of Spacecraft and Rockets
IS - 1
ER -