
The Effect of Q-function Reuse on the Total Regret of Tabular, Model-Free, Reinforcement Learning


Some reinforcement learning methods suffer from high sample complexity, which makes them impractical in real-world settings. Q-function reuse, a transfer learning method, is one way to reduce the sample complexity of learning and thereby potentially improve the usefulness of existing algorithms. Prior work has shown the empirical effectiveness of Q-function reuse in various environments when applied to model-free algorithms. To the best of our knowledge, no theoretical work has analyzed the regret of Q-function reuse in the tabular, model-free setting. We aim to bridge the gap between theoretical and empirical work on Q-function reuse by providing theoretical insights into its effectiveness when applied to the Q-learning with UCB-Hoeffding algorithm. Our main contribution is to show that, in a specific case, Q-learning with UCB-Hoeffding combined with Q-function reuse has a regret that is independent of the state or action space. We also provide empirical results supporting our theoretical findings.
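To make the setting concrete, the following is a minimal sketch of tabular, episodic Q-learning with a Hoeffding-style exploration bonus, in which Q-function reuse is modeled simply as a different initialization of the Q-table. This is an illustration under stated assumptions, not the paper's construction: the function name, the `q_init` parameter, the `step`/`reset` interface, the bonus constant `c`, and the toy environment are all hypothetical, and the specific case analyzed in the paper may differ.

```python
import math
import numpy as np


def q_learning_ucb_hoeffding(step, reset, S, A, H, K, c=1.0, p=0.05,
                             q_init=None, seed=0):
    """Episodic tabular Q-learning with a Hoeffding-style bonus (sketch).

    step(h, s, a, rng) -> (reward in [0, 1], next_state)
    reset(rng)         -> initial state
    q_init: optional (H, S, A) array reused from an earlier task.
            Treating reuse as "initialize from q_init" is an assumption.
    """
    rng = np.random.default_rng(seed)
    T = K * H
    iota = math.log(S * A * T / p)  # log factor used in the bonus
    # Without reuse, start optimistically at H; with reuse, start from q_init.
    if q_init is None:
        Q = np.full((H, S, A), float(H))
    else:
        Q = np.array(q_init, dtype=float)
    V = np.zeros((H + 1, S))        # V[H] stays 0 (value after the last step)
    V[:H] = np.minimum(H, Q.max(axis=2))
    N = np.zeros((H, S, A), dtype=int)
    returns = []
    for _ in range(K):
        s = reset(rng)
        total = 0.0
        for h in range(H):
            a = int(np.argmax(Q[h, s]))         # act greedily w.r.t. Q
            r, s_next = step(h, s, a, rng)
            N[h, s, a] += 1
            t = N[h, s, a]
            alpha = (H + 1) / (H + t)           # learning rate
            bonus = c * math.sqrt(H**3 * iota / t)
            target = r + V[h + 1, s_next] + bonus
            Q[h, s, a] = (1 - alpha) * Q[h, s, a] + alpha * target
            V[h, s] = min(H, Q[h, s].max())
            total += r
            s = s_next
        returns.append(total)
    return Q, returns


# Hypothetical toy environment: action 1 always pays reward 1, action 0 pays 0.
def toy_step(h, s, a, rng):
    return float(a == 1), (s + a) % 2


def toy_reset(rng):
    return 0


def toy_optimal_q(H, S, A):
    """Optimal Q-table of the toy environment, used here as the reused Q-function."""
    Q = np.zeros((H, S, A))
    for h in range(H):
        Q[h, :, :] = H - 1 - h   # value of acting optimally from step h + 1 on
        Q[h, :, 1] += 1.0        # immediate reward of action 1
    return Q
```

Comparing `q_learning_ucb_hoeffding(toy_step, toy_reset, 2, 2, 3, 50)` with the same call plus `q_init=toy_optimal_q(3, 2, 2)` gives a small-scale picture of the idea: the run started from a good Q-function has nothing left to explore in this toy problem, whereas the optimistic run must first try the zero-reward action.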
