Convex Q-Learning: Theory and Applications
Published in University of Florida, ProQuest Dissertations Publishing, 30424301, 2023
Recommended citation: Lu, F., 2023. Convex Q-Learning: Theory and Applications (Doctoral dissertation, University of Florida).
Reinforcement learning has proven highly effective for decision-making in complex and dynamic environments. One of the most widely used algorithms in the field is Q-learning, which enables an agent to learn a policy by iteratively updating estimates of the Q-function. However, Q-learning has limitations, particularly in handling high-dimensional state spaces. To address these challenges, recent research has focused on new formulations of reinforcement learning algorithms that are more efficient and more amenable to analysis. One promising approach is to design algorithms based on the linear programming (LP) formulation of optimal control, which leverages the reliability and speed of convex optimization. Combining general linear function approximation with the LP approach to dynamic programming, this dissertation proposes a new class of Q-learning algorithms, called convex Q-learning, together with a sequence of theoretical results. It also presents a theoretical analysis of the convergence properties of convex Q-learning, showing that the algorithm converges to the optimal policy with high probability. Finally, it explores practical applications of convex Q-learning in several domains, including robotics, inventory control, and resource allocation for distributed energy resources (DERs).