Precision landing on large planetary bodies is a critical technology for future human and robotic exploration of the solar system. Indeed, over the past decade, landing systems for robotic Mars missions have been developed with the specific goal of deploying robotic agents (e.g. rovers, landers) on the Martian surface. In this paper, we propose a novel algorithm that generates powered, closed-loop trajectories that enforce flight constraints (e.g. no crashing on sloped surfaces) while ensuring precision landing. More specifically, we propose a waypoint-based ZEM/ZEV algorithm that employs a dynamic programming approach via Value Iteration to determine the best location of the waypoints for a set of constrained landings on large planetary bodies (e.g. Moon and Mars). The Reinforcement Learning (RL) framework is employed to integrate ZEM/ZEV with a waypoint selection policy that is a function of the current state of the spacecraft during the powered descent phase (i.e. position and velocity). First, a set of open-loop, constrained, fuel-efficient trajectories is numerically computed using pseudo-spectral methods. A set of states from these open-loop optimal trajectories is stored as candidate waypoints. The waypoints are employed by the ZEM/ZEV algorithm as intermediate targets to steer the spacecraft toward the final target point on the planetary surface. The problem is cast as a Markov Decision Process (MDP), and the resulting dynamic programming problem is solved via generalized policy evaluation to select the next best intermediate target point as a function of the previous one. The behavior of the integrated guidance algorithm is evaluated in Mars powered landing scenarios that involve demanding requirements on both landing location and flight path. Both constraint satisfaction and fuel efficiency are analyzed to show the effectiveness of the proposed approach.
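To illustrate how ZEM/ZEV guidance can steer a vehicle through intermediate targets, the following minimal sketch uses the classical ZEM/ZEV feedback law under a constant-gravity assumption, a = 6 ZEM / t_go^2 - 2 ZEV / t_go. It is not the implementation evaluated in this paper: the function names, the hand-picked waypoint, the segment durations, and the initial state are all hypothetical, and the waypoint selection by Value Iteration is not shown.

```python
import numpy as np


def zem_zev_accel(r, v, r_target, v_target, t_go, g):
    """Classical ZEM/ZEV commanded acceleration (constant gravity assumed)."""
    # Zero-Effort-Miss: position error at t_go if no thrust were applied.
    zem = r_target - (r + v * t_go + 0.5 * g * t_go**2)
    # Zero-Effort-Velocity: velocity error at t_go if no thrust were applied.
    zev = v_target - (v + g * t_go)
    return 6.0 * zem / t_go**2 - 2.0 * zev / t_go


def fly_waypoints(r0, v0, waypoints, g, dt=0.01):
    """Steer through (position, velocity, duration) waypoints in sequence.

    The command is held constant over each step of length dt, and the
    point-mass dynamics are integrated exactly over that step.
    """
    r = np.array(r0, dtype=float)
    v = np.array(v0, dtype=float)
    for r_wp, v_wp, duration in waypoints:
        steps = int(round(duration / dt))
        for k in range(steps):
            t_go = duration - k * dt  # time left to reach this waypoint
            a = zem_zev_accel(r, v, np.asarray(r_wp, dtype=float),
                              np.asarray(v_wp, dtype=float), t_go, g)
            total = a + g  # commanded acceleration plus gravity
            r = r + v * dt + 0.5 * total * dt**2
            v = v + total * dt
    return r, v


# Hypothetical Mars-like scenario (all values are illustrative assumptions).
g_mars = np.array([0.0, 0.0, -3.71])          # m/s^2
waypoints = [
    ([500.0, 0.0, 800.0], [-20.0, 0.0, -30.0], 25.0),  # intermediate target
    ([0.0, 0.0, 0.0], [0.0, 0.0, 0.0], 25.0),          # landing site
]
r_final, v_final = fly_waypoints([1500.0, 0.0, 2000.0],
                                 [-50.0, 0.0, -40.0],
                                 waypoints, g_mars)
```

In this sketch each waypoint simply replaces the final target until its segment time elapses. A waypoint selection policy such as the one proposed here would instead choose the next intermediate target from the stored candidates.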