TY - JOUR
T1 - Reinforcement metalearning for interception of maneuvering exoatmospheric targets with parasitic attitude loop
AU - Gaudet, Brian
AU - Furfaro, Roberto
AU - Linares, Richard
AU - Scorsoglio, Andrea
N1 - Publisher Copyright: © 2020 by the American Institute of Aeronautics and Astronautics, Inc. All rights reserved.
PY - 2021
Y1 - 2021
N2 - This Paper uses Reinforcement Meta-Learning to optimize an adaptive integrated guidance, navigation, and control system suitable for exoatmospheric interception of a maneuvering target. The system maps observations consisting of strapdown seeker angles and rate gyroscope measurements directly to thruster on/off commands. Using a high-fidelity six-degree-of-freedom simulator, this Paper demonstrates that the optimized policy can adapt to parasitic effects including seeker angle measurement lag, thruster control lag, the parasitic attitude loop resulting from scale factor errors and Gaussian noise on angle and rotational velocity measurements, and a time-varying center of mass caused by fuel consumption and slosh. Importantly, the optimized policy gives good performance over a wide range of challenging target maneuvers. Unlike previous work that enhances range observability by inducing line-of-sight oscillations, this Paper’s system is optimized to use only measurements available from the seeker and rate gyros. Through extensive Monte Carlo simulation of randomized exoatmospheric interception scenarios, this Paper demonstrates that the optimized policy gives performance close to that of augmented proportional navigation with perfect knowledge of the full engagement state. The optimized system is computationally efficient, requires minimal memory, and should be compatible with today’s flight processors.
AB - This Paper uses Reinforcement Meta-Learning to optimize an adaptive integrated guidance, navigation, and control system suitable for exoatmospheric interception of a maneuvering target. The system maps observations consisting of strapdown seeker angles and rate gyroscope measurements directly to thruster on/off commands. Using a high-fidelity six-degree-of-freedom simulator, this Paper demonstrates that the optimized policy can adapt to parasitic effects including seeker angle measurement lag, thruster control lag, the parasitic attitude loop resulting from scale factor errors and Gaussian noise on angle and rotational velocity measurements, and a time-varying center of mass caused by fuel consumption and slosh. Importantly, the optimized policy gives good performance over a wide range of challenging target maneuvers. Unlike previous work that enhances range observability by inducing line-of-sight oscillations, this Paper’s system is optimized to use only measurements available from the seeker and rate gyros. Through extensive Monte Carlo simulation of randomized exoatmospheric interception scenarios, this Paper demonstrates that the optimized policy gives performance close to that of augmented proportional navigation with perfect knowledge of the full engagement state. The optimized system is computationally efficient, requires minimal memory, and should be compatible with today’s flight processors.
UR - http://www.scopus.com/inward/record.url?scp=85102799938&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85102799938&partnerID=8YFLogxK
U2 - 10.2514/1.A34841
DO - 10.2514/1.A34841
M3 - Article
AN - SCOPUS:85102799938
SN - 0022-4650
VL - 58
SP - 386
EP - 399
JO - Journal of Spacecraft and Rockets
JF - Journal of Spacecraft and Rockets
IS - 2
ER -