TY - GEN
T1 - Robust spacecraft hovering near small bodies in environments with unknown dynamics using reinforcement learning
AU - Gaudet, Brian
AU - Furfaro, Roberto
PY - 2012
Y1 - 2012
N2 - Autonomous close-proximity operations (including hovering and landing) in the low-gravity environment exhibited by asteroids are particularly challenging. Current approaches to this problem require knowledge of the environmental dynamics in the asteroid's vicinity. This knowledge is costly, both in terms of time and money, to acquire. This paper uses reinforcement learning (RL) to develop a novel non-linear hovering controller with sufficient robustness to allow precision hovering in unknown environments, limited only by the maximum thrust requirements imposed by the environment. We demonstrate the robustness of the controller by simulating precision hovering in multiple environments that were unknown during the policy optimization. The environments are modeled using non-uniform rotation and a non-uniform gravity field. Simulations were also run using a shape model of the asteroid Itokawa. Performance is compared to that of an RL-derived optimal linear PD controller and an LQR controller. Since the hovering controller requires an estimate of the spacecraft's state relative to a landmark on the asteroid's surface, we also introduce an optical-seeker-based navigation approach that accurately estimates the spacecraft's current state using only a single camera and laser range finder.
AB - Autonomous close-proximity operations (including hovering and landing) in the low-gravity environment exhibited by asteroids are particularly challenging. Current approaches to this problem require knowledge of the environmental dynamics in the asteroid's vicinity. This knowledge is costly, both in terms of time and money, to acquire. This paper uses reinforcement learning (RL) to develop a novel non-linear hovering controller with sufficient robustness to allow precision hovering in unknown environments, limited only by the maximum thrust requirements imposed by the environment. We demonstrate the robustness of the controller by simulating precision hovering in multiple environments that were unknown during the policy optimization. The environments are modeled using non-uniform rotation and a non-uniform gravity field. Simulations were also run using a shape model of the asteroid Itokawa. Performance is compared to that of an RL-derived optimal linear PD controller and an LQR controller. Since the hovering controller requires an estimate of the spacecraft's state relative to a landmark on the asteroid's surface, we also introduce an optical-seeker-based navigation approach that accurately estimates the spacecraft's current state using only a single camera and laser range finder.
UR - http://www.scopus.com/inward/record.url?scp=84880842068&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84880842068&partnerID=8YFLogxK
U2 - 10.2514/6.2012-5072
DO - 10.2514/6.2012-5072
M3 - Conference contribution
AN - SCOPUS:84880842068
SN - 9781624101823
T3 - AIAA/AAS Astrodynamics Specialist Conference 2012
BT - AIAA/AAS Astrodynamics Specialist Conference 2012
T2 - AIAA/AAS Astrodynamics Specialist Conference 2012
Y2 - 13 August 2012 through 16 August 2012
ER -