TY - JOUR
T1 - Six degree-of-freedom body-fixed hovering over unmapped asteroids via LIDAR altimetry and reinforcement meta-learning
AU - Gaudet, Brian
AU - Linares, Richard
AU - Furfaro, Roberto
N1 - Publisher Copyright: © 2020 IAA
PY - 2020/7
Y1 - 2020/7
AB - We optimize a six-degree-of-freedom hovering policy using reinforcement meta-learning. The policy maps flash LIDAR measurements directly to on/off spacecraft body-frame thrust commands, allowing hovering at a fixed position and attitude in the asteroid body-fixed reference frame. Importantly, the policy does not require position and velocity estimates, and can operate in environments with unknown dynamics, without an asteroid shape model or navigation aids. Indeed, during optimization the agent is confronted with a new randomly generated asteroid for each episode, ensuring that it does not learn an asteroid's shape, texture, or environmental dynamics. This allows the deployed policy to generalize well to novel asteroid characteristics, which we demonstrate in our experiments. Moreover, our experiments show that the optimized policy adapts to actuator failure and sensor noise. Although the policy is optimized using randomly generated synthetic asteroids, it is tested on two shape models from actual asteroids: Bennu and Itokawa. We find that the policy generalizes well to these shape models. The hovering controller has the potential to simplify mission planning by allowing asteroid body-fixed hovering immediately upon the spacecraft's arrival at an asteroid. This in turn simplifies shape model generation and allows resource mapping via remote sensing immediately upon arrival at the target asteroid.
KW - Asteroid missions
KW - Autonomous maneuvers
KW - Hovering artificial intelligence
KW - Reinforcement learning
UR - http://www.scopus.com/inward/record.url?scp=85082675947&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85082675947&partnerID=8YFLogxK
U2 - 10.1016/j.actaastro.2020.03.026
DO - 10.1016/j.actaastro.2020.03.026
M3 - Article
AN - SCOPUS:85082675947
SN - 0094-5765
VL - 172
SP - 90
EP - 99
JO - Acta Astronautica
JF - Acta Astronautica
ER -