News
The Tea Time Talks are back! Throughout the summer, take in 20-minute talks on early-stage ideas, prospective research and technical topics delivered by students, faculty and guests. Presented by Amii and the RLAI Lab at the University of Alberta, the talks are a relaxed and informal way of hearing leaders in AI discuss future lines of research they may explore.
Watch select talks from the second week of the series now:
Abstract: In this talk, Michael Bowling examines some of the often unstated principles common in multiagent learning research, suggesting that they may be holding us back, and, more importantly, may be holding back more than just multiagent research. In response, he offers an alternative set of principles, which leads to the view of hindsight rationality, rooted in online learning and connected to correlated equilibria. He questions the beloved approach of train-then-test and the focus on evaluating artifacts with a future-looking lens and comparison to the optimal, replacing them with a single lifetime and a focus on evaluating behaviour with a hindsight lens and comparison to targeted deviations of behaviour. This talk is the culmination of a year-long collaboration that introduces an alternative to Nash equilibria (with papers at AAAI and ICML this year). Michael only cursorily touches on the technical contributions of those papers, focusing instead on the more philosophical principles. View the papers if you want to dig deeper: Hindsight and Sequential Rationality of Correlated Play & Efficient Deviation Types and Learning for Hindsight Rationality in Extensive-Form Games
Abstract: The objectives of Patrick Pilarski's talk are to: 1) Define "constructivism" and "tightly coupled" in the context of human-machine interfaces (specifically the setting of neuroprostheses); 2) Propose that for maximum potential, tightly coupled interfaces should be partially or fully constructivist; 3) Give concrete examples of how this perspective leads to beneficial properties in tightly coupled interactions, drawn from his past 10 years of work on constructing predictions and state in upper-limb prosthetic interfaces.
Abstract: Policy gradient methods are a natural choice for learning a parameterized policy, especially for continuous actions, in a model-free way. These methods update policy parameters with stochastic gradient descent by estimating the gradient of a policy objective. Many of these methods can be derived from, or connected to, the well-known policy gradient theorem, which writes the true gradient in terms of the gradient of the action likelihood, a form suitable for model-free estimation. In this talk, Rupam Mahmood revisits this theorem and looks for other ways of writing the true gradient that may give rise to new classes of policy gradient methods.
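For reference, the classical form of the policy gradient theorem mentioned in the abstract can be written as follows. The notation here is a common textbook convention rather than anything drawn from the talk itself, with $d^{\pi_\theta}$ denoting the on-policy state distribution and $q^{\pi_\theta}$ the action-value function:

```latex
\nabla_\theta J(\theta) \;=\;
\mathbb{E}_{s \sim d^{\pi_\theta},\; a \sim \pi_\theta(\cdot \mid s)}
\left[ \nabla_\theta \log \pi_\theta(a \mid s)\; q^{\pi_\theta}(s, a) \right]
```

The $\nabla_\theta \log \pi_\theta(a \mid s)$ factor is the "gradient of the action likelihood" the abstract refers to: it can be estimated from sampled states and actions alone, without a model of the environment, which is what makes this form suitable for model-free learning.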
Like what you’re learning here? Take a deeper dive into the world of RL with the Reinforcement Learning Specialization, offered by the University of Alberta and Amii. Taught by Martha White and Adam White, this specialization explores how RL solutions help solve real-world problems through trial-and-error interaction, showing learners how to implement a complete RL solution from beginning to end. Enroll in this specialization now!
Apr 8th 2024
Amii Fellows share tips on how to make the most of your conference experience.
Mar 26th 2024
In this month's episode, Alona talks about how ChatGPT changed the public’s perception of what AI language models can do, instantly making most previous benchmarks seem out of date, and the excitement and intensity of working in a fast-moving field like AI.
Mar 18th 2024
Google.org announces new research grants to support critical AI research in Canada focused on areas such as sustainability and the responsible development of AI. The grants will provide a total of $2.7 million in funding to Amii, the Canadian Institute for Advanced Research (CIFAR) and the International Center of Expertise of Montreal on AI (CEIMIA).