Alberta Machine Intelligence Institute

AI Seminar: Generalization in Monitored Markov Decision Processes (Mon-MDPs)

Montaser Mohammedalamen

Published: Feb 11, 2025

The AI Seminar is a weekly meeting at the University of Alberta where researchers interested in artificial intelligence (AI) can share their research. Presenters include both local speakers from the University of Alberta and visitors from other institutions. Topics can be related in any way to artificial intelligence, from foundational theoretical work to innovative applications of AI techniques to new fields and problems.

Abstract:

Reinforcement learning (RL) often assumes rewards are always observable to the agent, but some real-world scenarios challenge this assumption. The monitored MDP (Mon-MDP) framework models interactions where rewards are not always observable. Previous work on Mon-MDPs focused on tabular cases. This work explores Mon-MDPs in non-tabular settings using function approximation (FA) and investigates the challenges involved. FA enables agents to generalize from monitored to unmonitored environment states. However, FA can also cause over-generalization, where agents incorrectly extrapolate rewards. To address this, we propose a cautious learning method that incorporates reward uncertainty to avoid undesirable outcomes.
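To make the over-generalization concern concrete, here is a minimal illustrative sketch. It is not the method presented in the talk. It assumes a linear reward model fitted by ridge regression on monitored samples only, and a lower-confidence-bound penalty as one possible way to act cautiously. The function names, the `ridge` and `beta` parameters, and the toy two-dimensional features are all hypothetical.

```python
import numpy as np

def fit_reward_model(features, rewards, ridge=1.0):
    """Ridge regression using only samples whose reward was observed (monitored)."""
    d = features.shape[1]
    A = features.T @ features + ridge * np.eye(d)
    weights = np.linalg.solve(A, features.T @ rewards)
    return weights, np.linalg.inv(A)

def cautious_reward(phi, weights, A_inv, beta=1.0):
    """Point estimate minus an uncertainty penalty (a lower confidence bound)."""
    mean = phi @ weights
    width = np.sqrt(phi @ A_inv @ phi)  # grows in directions with little data
    return mean - beta * width

# Hypothetical monitored data: only the first feature ever varies.
monitored = np.array([[1.0, 0.0], [0.9, 0.0], [1.1, 0.0]])
observed = np.array([1.0, 0.9, 1.1])
weights, A_inv = fit_reward_model(monitored, observed)

seen_like = np.array([1.0, 0.0])    # resembles the monitored states
unseen_like = np.array([1.0, 1.0])  # second feature was never monitored

# The plain point estimate treats both states identically (over-generalization),
# while the cautious estimate is lower for the unfamiliar state.
print(seen_like @ weights, unseen_like @ weights)
print(cautious_reward(seen_like, weights, A_inv),
      cautious_reward(unseen_like, weights, A_inv))
```

In this toy setup the point estimate cannot distinguish the two states, because the second feature never appeared in monitored data. The uncertainty term is what separates them.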

Presenter Bio:

Montaser Mohammedalamen is a PhD candidate advised by Dr. Michael Bowling, exploring how AI systems can learn in environments where rewards are not always observable. His research focuses on designing autonomous agents that can act cautiously in uncertain scenarios, contributing to advancements in reinforcement learning for partially observable settings. Before starting his PhD, Montaser worked as an AI engineer at SonyAI, where he was part of a team developing multi-agent robotic systems. His work included training agents using self-play and goal-conditioned reinforcement learning, transferring learned behaviors from simulations to real-world settings, and integrating them with vision systems and robot control methods. Montaser is passionate about bridging theoretical research with practical applications to create adaptive and intelligent systems for complex environments.