The Tea Time Talks 2021: Week Two

The Tea Time Talks are back! Throughout the summer, take in 20-minute talks on early-stage ideas, prospective research and technical topics delivered by students, faculty and guests. Presented by Amii and the RLAI Lab at the University of Alberta, the talks are a relaxed and informal way of hearing leaders in AI discuss future lines of research they may explore.

Watch select talks from the second week of the series now:

Michael Bowling: Hindsight Rationality

Abstract: In this talk, Michael Bowling looks at some of the often-unstated principles common in multiagent learning research, suggesting that they may be responsible for holding us back — and, more importantly, may be holding back more than just multiagent research. In response, he offers an alternative set of principles that leads to the view of hindsight rationality, rooted in online learning and connected to correlated equilibria. He questions the beloved train-then-test paradigm and its focus on evaluating artifacts through a forward-looking lens with comparison to optimal behaviour, replacing them with a single-lifetime setting and a focus on evaluating behaviour through a hindsight lens with comparison to targeted deviations of behaviour. This talk is the culmination of a year-long collaboration that introduces an alternative to Nash equilibria (with papers in AAAI and ICML this year). Michael only cursorily touches on the technical contributions of those papers, instead focusing on the more philosophical principles. View the papers if you want to dig deeper: Hindsight and Sequential Rationality of Correlated Play & Efficient Deviation Types and Learning for Hindsight Rationality in Extensive-Form Games

Patrick Pilarski: Constructivism in Tightly Coupled Human-Machine Interfaces

Abstract: The objectives of Patrick Pilarski's talk are to: 1) Define "constructivism" and "tightly coupled" in the context of human-machine interfaces (specifically the setting of neuroprostheses); 2) Propose that, to reach their full potential, tightly coupled interfaces should be partially or fully constructivist; 3) Give concrete examples of how this perspective leads to beneficial properties in tightly coupled interactions, drawn from his past 10 years of work on constructing predictions and state in upper-limb prosthetic interfaces.

Rupam Mahmood: New Forms of Policy Gradients for Model-free Estimation

Abstract: Policy gradient methods are a natural choice for learning a parameterized policy, especially for continuous actions, in a model-free way. These methods update policy parameters with stochastic gradient descent by estimating the gradient of a policy objective. Many of these methods can be derived from or connected to a well-known policy gradient theorem that writes the true gradient in the form of the gradient of the action likelihood, which is suitable for model-free estimation. In this talk, Rupam Mahmood revisits this theorem and looks for other forms of writing the true gradient that may give rise to new classes of policy gradient methods.
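The classical form the abstract refers to — the likelihood-ratio (score-function) estimator, which weights the gradient of the action's log-likelihood by the observed return — can be sketched as follows. This is a minimal illustration on a one-step continuous-action problem with a Gaussian policy, not code from the talk; the function names, learning rate, and reward function are illustrative assumptions.

```python
import random

def grad_log_gaussian(a, mu, sigma):
    """Gradient of log N(a; mu, sigma^2) with respect to the mean mu."""
    return (a - mu) / sigma**2

def reinforce(mu, sigma, reward_fn, lr=0.01, steps=2000, seed=0):
    """Likelihood-ratio policy gradient on a one-step problem:
    sample an action from the current policy, weight the score
    function by the observed reward, and ascend the estimate."""
    rng = random.Random(seed)
    for _ in range(steps):
        a = rng.gauss(mu, sigma)                      # sample a ~ pi(.|mu)
        g = reward_fn(a) * grad_log_gaussian(a, mu, sigma)  # gradient estimate
        mu += lr * g                                  # stochastic gradient ascent
    return mu

# Reward peaks at a = 2, so the policy mean should move toward 2.
mu = reinforce(mu=0.0, sigma=0.5, reward_fn=lambda a: -(a - 2.0) ** 2)
```

Because this estimator needs only the gradient of the action likelihood, not a model of the environment, it is suitable for model-free learning — which is exactly the property of the standard theorem that the talk asks whether other forms of the true gradient can share.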

Like what you’re learning here? Take a deeper dive into the world of RL with the Reinforcement Learning Specialization, offered by the University of Alberta and Amii. Taught by Martha White and Adam White, this specialization explores how RL solutions help solve real-world problems through trial-and-error interaction, showing learners how to implement a complete RL solution from beginning to end. Enroll in this specialization now!
