Rupam Mahmood, Streaming Deep RL, Upper Bound 2025

Published Jul 10, 2025

For years, a "stream barrier" made deep reinforcement learning unstable without replaying large batches of data. In this Upper Bound 2025 session, Amii Fellow and Canada CIFAR AI Chair Rupam Mahmood reveals a recent breakthrough that solves this long-standing problem.

Streaming Deep RL is a paradigm in which an AI agent learns from each experience exactly once, as it happens. This approach mirrors natural intelligence and is orders of magnitude more compute-efficient. The talk covers how novel optimizers and normalization techniques make this possible, unlocking time-based scaling and paving the way for ubiquitous, on-device AI that can continuously adapt to our world without remote clusters.
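To make the "learn from each experience exactly once" idea concrete, here is a minimal sketch in Python. It is not the method presented in the talk: it swaps in plain linear TD(0) value learning with a Welford-style running normalizer as a simple stand-in for streaming updates and online normalization. The names `RunningNorm` and `streaming_td0`, the step size, and the discount factor are all illustrative assumptions.

```python
import numpy as np


class RunningNorm:
    """Welford-style running mean and variance, updated one sample at a time."""

    def __init__(self, dim, eps=1e-8):
        self.n = 0
        self.mean = np.zeros(dim)
        self.m2 = np.zeros(dim)  # running sum of squared deviations
        self.eps = eps

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def normalize(self, x):
        var = self.m2 / max(self.n - 1, 1)  # sample variance once n >= 2
        return (x - self.mean) / np.sqrt(var + self.eps)


def streaming_td0(transitions, dim, alpha=0.05, gamma=0.9):
    """Linear TD(0) over a stream: each transition drives one update, then is dropped.

    `transitions` is any iterable of (obs, reward, next_obs, done) tuples.
    No replay buffer or batch is kept; only the weights and running statistics.
    """
    w = np.zeros(dim)
    norm = RunningNorm(dim)
    for obs, reward, next_obs, done in transitions:
        norm.update(obs)                   # statistics are also learned online
        x = norm.normalize(obs)
        x_next = norm.normalize(next_obs)
        v = w @ x
        v_next = 0.0 if done else w @ x_next
        td_error = reward + gamma * v_next - v
        w += alpha * td_error * x          # single semi-gradient step
    return w, norm
```

Because `transitions` can be a generator, memory use stays constant no matter how long the stream runs, which is the property that makes this style of learning attractive for on-device use.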

Upper Bound 2025 is Amii's annual artificial intelligence conference, held in Edmonton, Alberta, Canada, bringing together researchers, industry, and policymakers. The conference focuses on accelerating AI excellence and innovation for good, emphasizing AI for critical infrastructure, health, industrial operations, responsible AI, and AI literacy.
