Now that the 2020 Tea Time Talks are on YouTube, you can always have time for tea with Amii and the RLAI Lab! Hosted by Amii's Chief Scientific Advisor, Dr. Richard S. Sutton, these 20-minute talks on technical topics are delivered by students, faculty and guests. The talks are a relaxed and informal way of hearing leaders in AI discuss future lines of research they may explore, with topics ranging from ideas starting to take root to fully finished projects.
Week eleven of the Tea Time Talks features:
Many supervised learning algorithms are designed to operate under i.i.d. sampling. When those algorithms are applied to problems with nonstationary sampling, they can misbehave -- which is not surprising if one takes time to understand the conditions under which an algorithm's behaviour is (or is not) guaranteed. Dynamical systems analysis offers us some tools to extend those guarantees to certain kinds of nonstationary sampling. This talk exemplifies these ideas in a simple setting: optimizing linear regression models with SGD+momentum under periodic simple nonstationarity.
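The setting described in the abstract can be sketched in a few lines. Everything below (the data distribution, the alternating target weights, the period, and the step-size and momentum values) is a hypothetical choice for illustration, not the experiments from the talk:

```python
import numpy as np

rng = np.random.default_rng(0)

def true_weights(t, period=200):
    # Periodic simple nonstationarity: the regression target alternates
    # between two weight vectors every `period` steps.
    return np.array([1.0, -1.0]) if (t // period) % 2 == 0 else np.array([-1.0, 1.0])

w = np.zeros(2)          # model weights
v = np.zeros(2)          # momentum buffer
lr, beta = 0.02, 0.9     # step size and momentum coefficient

losses = []
for t in range(1000):
    x = rng.normal(size=2)
    y = x @ true_weights(t)          # label drawn from the drifting target
    err = x @ w - y                  # prediction error
    grad = err * x                   # gradient of 0.5 * err**2
    v = beta * v + grad              # heavy-ball momentum update
    w = w - lr * v
    losses.append(0.5 * err ** 2)
```

Tracking `losses` over time makes the talk's point visible: the loss spikes at every switch of the target and then decays, and the dynamics of that decay depend on the interaction between the momentum parameters and the drift period.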
Multi-step greedy policies have been used extensively in model-based reinforcement learning (RL), both when a model of the environment is available (for example, in the game of Go) and when it is learned. In this talk, Manan presents a paper he co-authored that explores the benefits of multi-step greedy policies in model-free RL, employed via two multi-step dynamic programming algorithms: $\kappa$-Policy Iteration ($\kappa$-PI) and $\kappa$-Value Iteration ($\kappa$-VI). These methods iteratively compute the next policy ($\kappa$-PI) or value function ($\kappa$-VI) by solving a surrogate decision problem with a shaped reward and a smaller discount factor. The authors derive model-free RL algorithms based on $\kappa$-PI and $\kappa$-VI in which the surrogate problem can be solved by any discrete- or continuous-action RL method, such as DQN or TRPO. They also identify the importance of the hyper-parameter $\kappa$, which controls the extent to which the surrogate problem is solved, and suggest a way to set it. When evaluated on a range of Atari and MuJoCo benchmark tasks, their results indicate that for the right range of $\kappa$, these algorithms outperform DQN and TRPO. This shows that multi-step greedy algorithms are general enough to be applied on top of existing RL algorithms and can significantly improve their performance.
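A minimal tabular sketch of the $\kappa$-VI idea described above: each outer iteration solves a surrogate MDP whose reward is shaped by the current value function, $r + \gamma(1-\kappa)V(s')$, and whose discount factor is shrunk from $\gamma$ to $\gamma\kappa$. The toy MDP, its numbers, and the iteration counts below are hypothetical choices for illustration, not from the paper (which solves the surrogate with DQN or TRPO rather than exact value iteration):

```python
import numpy as np

# Toy 2-state, 2-action MDP (hypothetical transition and reward numbers).
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.7, 0.3], [0.05, 0.95]]])   # P[s, a, s']
R = np.array([[1.0, 0.0], [0.0, 2.0]])       # R[s, a]
gamma, kappa = 0.95, 0.5

def solve_surrogate(V, iters=300):
    # Solve the surrogate MDP to (near) optimality by plain value iteration:
    # shaped reward r + gamma*(1-kappa)*V(s'), smaller discount gamma*kappa.
    R_shaped = R + gamma * (1 - kappa) * (P @ V)   # R_shaped[s, a]
    W = np.zeros_like(V)
    for _ in range(iters):
        W = (R_shaped + gamma * kappa * (P @ W)).max(axis=1)
    return W

V = np.zeros(2)
for _ in range(200):          # kappa-VI outer loop
    V = solve_surrogate(V)
```

Note the two extremes: with $\kappa = 0$ the surrogate's discount is zero and each outer step reduces to one standard Bellman backup, while with $\kappa = 1$ the surrogate is the original MDP and a single outer step solves it.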
Robin shares some highlights and lessons learned from a year of interviewing RL researchers on the TalkRL podcast. Additionally, he dives deep into a Pommerman agent he designed.
In 1990, Scott E. Fahlman and Christian Lebiere proposed a constructive neural network architecture, cascade-correlation, as an alternative to training deep neural networks with fixed architectures using backpropagation. Despite showing promising results and spurring several follow-up papers, cascade-correlation is not popular in the deep learning community. In this talk, Juan explores why cascade-correlation fell out of favour, presenting empirical results that demonstrate its performance under several settings and in different domains. He discusses disadvantages of cascade-correlation reported in the literature, as well as extensions that have been proposed to address each of them. He concludes by arguing why cascade-correlation is still worth caring about.
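A rough sketch of the cascade-correlation recipe that the talk examines: repeatedly retrain the output weights, train a candidate hidden unit to maximize the correlation (here, covariance) between its output and the network's residual error, then freeze that unit and add its output as a new input feature for all later units. The data, unit count, and training loop below are hypothetical simplifications for illustration (a single candidate per step, plain gradient ascent, least-squares outputs), not Fahlman and Lebiere's full algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

# Toy regression task (hypothetical data for illustration).
X = rng.normal(size=(200, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2

def fit_outputs(F, y):
    # Retrain only the output weights: least squares on the current features.
    w, *_ = np.linalg.lstsq(F, y, rcond=None)
    return w

F = np.hstack([X, np.ones((200, 1))])    # inputs + bias as initial features
for unit in range(5):                    # grow the network one unit at a time
    w = fit_outputs(F, y)
    resid = y - F @ w                    # residual error of the current network
    c = rng.normal(scale=0.1, size=F.shape[1])
    for _ in range(500):                 # train candidate to track the residual
        h = sigmoid(F @ c)
        cov = (h - h.mean()) @ (resid - resid.mean())
        # gradient of the covariance w.r.t. the candidate's input weights
        grad = F.T @ ((resid - resid.mean()) * h * (1 - h))
        c += 0.01 * np.sign(cov) * grad  # ascend on |covariance|
    F = np.hstack([F, sigmoid(F @ c)[:, None]])  # freeze unit; add as feature

w = fit_outputs(F, y)
mse = np.mean((y - F @ w) ** 2)
```

Because each frozen unit sees the raw inputs plus all previously frozen units, the network deepens by one layer per unit — the "cascade" — while only ever training one small set of weights at a time.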
The Tea Time Talks have now concluded for the year, but stay tuned as we will be uploading the final talks next week. In the meantime, you can rewatch or catch up on previous talks on our YouTube playlist.
Nov 25th 2020
On November 19, participants gathered digitally for the AI Meetup, featuring Nirav Raiyani (Research Engineer - Process & Machine Learning at Ntwist) and Jasmine Wang (Founder & CEO of copysmith.ai).
Nov 23rd 2020
Masaf Dawood & Sindhu Adini from SpringML and Chloe Tottem from Google Cloud presented “Compute and Data at Your Fingertips” at the AI Seminar on Nov 13, 2020.
Nov 19th 2020
The Alberta Machine Intelligence Institute (Amii) has launched applications for Supply Chain AI West, an eight-month accelerator focused on empowering startups and early-stage founders to incorporate artificial intelligence (AI) technologies toward creating AI-powered supply chains.
Looking to build AI capacity? Need a speaker at your event?
Get involved in Alberta's growing AI ecosystem! Speaker, sponsorship, and letter of support requests welcome.
Curious about study options under one of our researchers? Want more information on training opportunities?