Optimal Q-laws via reinforcement learning with guaranteed stability

Harry Holt, Roberto Armellin, Nicola Baresi, Yoshi Hashida, Andrea Turconi, Andrea Scorsoglio, Roberto Furfaro

Research output: Contribution to journal › Article › peer-review

4 Scopus citations


Closed-loop feedback-driven control laws can be used to solve low-thrust many-revolution trajectory design and guidance problems with minimal computational cost. Lyapunov-based control laws offer the benefits of increased stability whilst their optimality can be increased by tuning their parameters. In this paper, a reinforcement learning framework is used to make the parameters of the Lyapunov-based Q-law state-dependent, increasing its optimality. The Jacobian of these state-dependent parameters is available analytically and, unlike in other optimisation approaches, can be used to enforce stability throughout the transfer. The results focus on GTO–GEO and LEO–GEO transfers in Keplerian dynamics, including the effects of eclipses. The impact of the network architecture on the behaviour is investigated for both time- and mass-optimal transfers. Robustness to navigation errors and thruster misalignment is demonstrated using Monte Carlo analyses. The resulting approach offers potential for on-board autonomous transfers and orbit reconfiguration.
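The core idea can be illustrated with a minimal sketch: a quadratic Lyapunov-style function Q over the orbital elements, positive away from the target orbit and zero at it, with a feedback step chosen so that Q decreases at every update. The names, weights, and gradient-descent update below are illustrative assumptions for clarity — the paper's actual Q-law normalises element errors by their maximum achievable rates of change and picks the thrust direction minimising dQ/dt, with the weights made state-dependent by a network.

```python
import math

def q_value(oe, oe_target, weights):
    """Quadratic Lyapunov-style proximity measure over orbital elements.

    Q = sum_i w_i * (oe_i - oe_i*)^2 is zero exactly at the target orbit
    and positive elsewhere, so enforcing dQ/dt < 0 guarantees convergence.
    """
    return sum(w * (x - xt) ** 2 for x, xt, w in zip(oe, oe_target, weights))

def descent_step(oe, oe_target, weights, gain=0.05):
    """One feedback step along -grad Q: a stand-in for choosing the thrust
    direction that maximises the instantaneous decrease of Q."""
    grad = [2.0 * w * (x - xt) for x, xt, w in zip(oe, oe_target, weights)]
    return [x - gain * g for x, g in zip(oe, grad)]

# GTO-like (semi-major axis [km], eccentricity) driven toward GEO-like
# targets; the weights are the tunable parameters the paper makes
# state-dependent. Values here are illustrative, not from the paper.
oe = [24505.9, 0.725]
target = [42164.0, 0.0]
w = [1.0e-9, 1.0]   # scale weights so both terms are comparable
q0 = q_value(oe, target, w)
oe1 = descent_step(oe, target, w)
q1 = q_value(oe1, target, w)
assert q1 < q0      # Lyapunov decrease along the trajectory
```

Because Q is guaranteed to decrease under such a step, stability holds regardless of how the weights are tuned — which is precisely why the RL tuning in the paper can improve optimality without sacrificing convergence.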

Original language: English (US)
Pages (from-to): 511-528
Number of pages: 18
Journal: Acta Astronautica
State: Published - Oct 2021


Keywords

  • Low-thrust
  • Lyapunov control
  • Reinforcement learning
  • Stability
  • State-dependent

ASJC Scopus subject areas

  • Aerospace Engineering

