
# The Tea Time Talks 2020: Week Eleven

Now that the 2020 Tea Time Talks are on YouTube, you can always have time for tea with Amii and the RLAI Lab! Hosted by Amii's Chief Scientific Advisor Dr. Richard S. Sutton, these 20-minute talks on technical topics are delivered by students, faculty and guests. The talks are a relaxed and informal way of hearing leaders in AI discuss future lines of research they may explore, with topics ranging from ideas just starting to take root to fully finished projects.

Week eleven of the Tea Time Talks features:

#### Kirby Banman: Regression nonstationarities as dynamical systems

Many supervised learning algorithms are designed to operate under i.i.d. sampling. When those algorithms are applied to problems with nonstationary sampling, they can misbehave -- which is not surprising if one takes time to understand the conditions under which an algorithm's behaviour is (or is not) guaranteed. Dynamical systems analysis offers us some tools to extend those guarantees to certain kinds of nonstationary sampling. This talk exemplifies these ideas in a simple setting: optimizing linear regression models with SGD+momentum under periodic simple nonstationarity.
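To make the setting concrete, here is a minimal sketch (my own illustration, not code from the talk) of SGD with momentum tracking a scalar linear-regression target whose optimal weight drifts periodically. The update is a linear recurrence in the weight, so its behaviour under this kind of nonstationarity can be analyzed as a forced linear dynamical system; all constants below are illustrative assumptions.

```python
import numpy as np

# Scalar linear regression y = w* x with a periodically drifting optimum:
# w*(t) = sin(2*pi*t / PERIOD). Inputs are fixed at x = 1 for simplicity.
PERIOD, STEPS = 100, 1000
ALPHA, BETA = 0.1, 0.5  # step size and momentum (illustrative choices)

w, v = 0.0, 0.0
history = []
for t in range(STEPS):
    w_star = np.sin(2 * np.pi * t / PERIOD)  # drifting target weight
    grad = w - w_star                        # d/dw of 0.5 * (w - w_star)^2
    v = BETA * v - ALPHA * grad              # momentum: a linear recurrence
    w = w + v
    history.append(w)

history = np.array(history)
# The coupled (w, v) recurrence is linear, so stability can be read off its
# eigenvalues; for these ALPHA/BETA the iterates stay bounded and track w*.
print(np.max(np.abs(history)))
```

For these values the characteristic roots have modulus below one, so the weight stays bounded and follows the drifting target with a lag; unstable choices of step size and momentum would instead amplify the periodic forcing.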

#### Manan Tomar: Multi-step Greedy Reinforcement Learning Algorithms

Multi-step greedy policies have been used extensively in model-based reinforcement learning (RL), both when a model of the environment is available (for example, in the game of Go) and when it is learned. In this talk, Manan presents a paper he co-authored that explores the benefits of multi-step greedy policies in model-free RL, when employed via two multi-step dynamic programming algorithms: $\kappa$-Policy Iteration ($\kappa$-PI) and $\kappa$-Value Iteration ($\kappa$-VI). These methods iteratively compute the next policy ($\kappa$-PI) or value function ($\kappa$-VI) by solving a surrogate decision problem with a shaped reward and a smaller discount factor. The authors derive model-free RL algorithms based on $\kappa$-PI and $\kappa$-VI in which the surrogate problem can be solved by any discrete- or continuous-action RL method, such as DQN and TRPO. They also identify the importance of a hyperparameter that controls the extent to which the surrogate problem is solved, and suggest a way to set it. When evaluated on a range of Atari and MuJoCo benchmark tasks, their results indicate that for the right range of $\kappa$, their algorithms outperform DQN and TRPO. This suggests that these multi-step greedy algorithms are general enough to be layered on top of existing RL algorithms and can significantly improve their performance.
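As a rough illustration of the surrogate construction, here is a tabular $\kappa$-VI sketch on a small random MDP (my own toy version, not the paper's code): each outer step solves a surrogate MDP whose reward is shaped by the current value estimate, $r(s,a) + (1-\kappa)\gamma\,\mathbb{E}[V(s')]$, under the smaller discount factor $\kappa\gamma$, and the fixed point matches ordinary value iteration.

```python
import numpy as np

rng = np.random.default_rng(0)
S, A, GAMMA, KAPPA = 5, 3, 0.9, 0.5  # illustrative sizes and constants

# Random MDP: P[a, s, s'] transition probabilities, R[s, a] rewards.
P = rng.random((A, S, S))
P /= P.sum(axis=2, keepdims=True)
R = rng.random((S, A))

def value_iteration(P, R, gamma, iters):
    """Standard VI: V <- max_a R[s, a] + gamma * E[V(s')]."""
    V = np.zeros(S)
    for _ in range(iters):
        Q = R + gamma * np.einsum("ast,t->sa", P, V)
        V = Q.max(axis=1)
    return V

def kappa_vi(P, R, gamma, kappa, outer=100, inner=200):
    """kappa-VI: each outer step solves a surrogate MDP with shaped reward
    R + (1 - kappa) * gamma * E[V(s')] and discount kappa * gamma."""
    V = np.zeros(S)
    for _ in range(outer):
        R_shaped = R + (1 - kappa) * gamma * np.einsum("ast,t->sa", P, V)
        # Inner solve of the surrogate problem (here plain VI; the paper
        # plugs in model-free learners such as DQN or TRPO instead).
        W = np.zeros(S)
        for _ in range(inner):
            Q = R_shaped + kappa * gamma * np.einsum("ast,t->sa", P, W)
            W = Q.max(axis=1)
        V = W
    return V

V_star = value_iteration(P, R, GAMMA, iters=1000)
V_kappa = kappa_vi(P, R, GAMMA, KAPPA)
print(np.max(np.abs(V_kappa - V_star)))  # both converge to the same V*
```

With $\kappa = 0$ the surrogate reduces to one step of standard value iteration, and with $\kappa = 1$ a single surrogate solve recovers $V^*$ directly; intermediate values trade inner-solve effort against the number of outer iterations, which is the role of the hyperparameter discussed in the talk.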

#### Robin Chauhan: TalkRL and Other Projects

Robin shares highlights and lessons from a year of interviewing RL researchers on the TalkRL podcast. He also takes a deep dive into a Pommerman agent he designed.

#### Juan Fernando Hernandez Garcia: The Cascade-Correlation Learning Architecture: The Forgotten Network

In 1990, Scott E. Fahlman and Christian Lebiere proposed a constructive neural network architecture -- cascade-correlation -- as an alternative to training deep neural networks with fixed architectures using backpropagation. Despite promising results and several follow-up papers, cascade-correlation never caught on in the deep learning community. In this talk, Juan explores why it fell out of favour, presenting empirical results that demonstrate its performance under several settings and in different domains. He discusses the disadvantages of cascade-correlation identified in the literature, as well as extensions that have been proposed to address each of them. He concludes by arguing why cascade-correlation is still worth caring about.
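For readers unfamiliar with the architecture, here is a minimal sketch of the cascade-correlation idea (a simplified toy version of mine, not Fahlman and Lebiere's exact algorithm): hidden units are added one at a time, each trained to correlate with the current residual error and then frozen, with the output weights refit by least squares after every addition.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression task: y = x1 * x2, which a purely linear model cannot fit.
X = rng.uniform(-1, 1, size=(200, 2))
y = X[:, 0] * X[:, 1]

def fit_outputs(F, y):
    """Refit the output weights by least squares on the current features."""
    Fb = np.hstack([F, np.ones((len(F), 1))])  # append a bias column
    w, *_ = np.linalg.lstsq(Fb, y, rcond=None)
    return w, Fb @ w

def train_candidate(F, resid, steps=500, lr=0.5):
    """Train one tanh candidate unit to maximize its covariance with the
    residual error (a simplified stand-in for the original correlation
    objective); the unit is frozen after training."""
    w = rng.normal(scale=0.5, size=F.shape[1])
    r = resid - resid.mean()
    for _ in range(steps):
        h = np.tanh(F @ w)
        cov = (h - h.mean()) @ r / len(F)
        # Gradient ascent on |cov| w.r.t. the candidate's input weights.
        dh = np.sign(cov) * (1 - h**2) * r / len(F)
        w += lr * (F.T @ dh)
    return w

F = X.copy()                      # features visible to the output layer
w_out, pred = fit_outputs(F, y)
initial_loss = np.mean((pred - y) ** 2)

for _ in range(5):                # grow the network one hidden unit at a time
    resid = y - pred
    w_h = train_candidate(F, resid)
    h = np.tanh(F @ w_h)          # frozen unit; its output is a new feature
    F = np.hstack([F, h[:, None]])  # cascaded: later units see earlier ones
    w_out, pred = fit_outputs(F, y)

final_loss = np.mean((pred - y) ** 2)
print(initial_loss, final_loss)
```

Because each new unit receives all inputs plus the outputs of every previously frozen unit, the network deepens as it grows, and refitting the output layer on an enlarged feature set can never increase the training loss.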

The Tea Time Talks have now concluded for the year, but stay tuned as we will be uploading the final talks next week. In the meantime, you can rewatch or catch up on previous talks on our YouTube playlist.

Looking to build AI capacity? Need a speaker at your event?

### Connect with the community

Get involved in Alberta's growing AI ecosystem! Speaker, sponsorship, and letter of support requests welcome.