Research

## AI Seminar Series 2020: Mark Schmidt on Faster Algorithms for Deep Learning

The Artificial Intelligence (AI) Seminar is a weekly meeting at the University of Alberta where researchers interested in AI can share their research. Presenters include both local speakers from the University of Alberta and visitors from other institutions. Topics related in any way to artificial intelligence, from foundational theoretical work to innovative applications of AI techniques to new fields and problems, are explored.

On May 22, 2020, Mark Schmidt, Canada CIFAR AI Chair at Amii, Associate Professor at the University of British Columbia, Canada Research Chair and Alfred P. Sloan Fellow, presented “Faster Algorithms for Deep Learning?”.

In his talk, Schmidt explains stochastic gradient descent (SGD), a popular method for training deep learning models that converges slowly because of the variance in its gradient approximation. Schmidt then discusses several algorithms that can speed up training, depending on whether the model is under- or over-parameterized.
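Schmidt's proposed algorithms are not reproduced here, but the mechanism he starts from can be sketched in a few lines (a generic illustration, not code from the talk): SGD replaces the full gradient with an estimate computed from one randomly drawn example, so each step is cheap but noisy, and that noise is what slows convergence.

```python
import numpy as np

# Generic SGD sketch on a least-squares problem: loss = mean((x_i . w - y_i)^2).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
true_w = np.arange(1.0, 6.0)
y = X @ true_w                              # noiseless targets

w = np.zeros(5)
lr = 0.01
for _ in range(5000):
    i = rng.integers(len(X))                # sample one example
    grad = 2.0 * (X[i] @ w - y[i]) * X[i]   # unbiased but high-variance gradient estimate
    w -= lr * grad                          # noisy step toward the minimizer
```

Despite the noise, the iterates approach `true_w`; variance-reduction and acceleration methods of the kind discussed in the talk aim to get there in fewer steps.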

Watch his full presentation below:

Research

## Amii researcher co-awarded $2.5 million to study literacy in a digital and multicultural world

Amii Fellow and Canada CIFAR AI Chair Alona Fyshe has been named a co-recipient of a $2.5 million grant to study literacy in a digital and multicultural world. The project, Ensuring Full Literacy in a Multicultural and Digital World, is led by Janet Werker of the University of British Columbia with Dr. Fyshe acting as Co-Director. The multidisciplinary team of experts brings together researchers from the disciplines of psychology, computing science, linguistics, and anthropology, among others.

Over the next seven years, the project will study literacy across a variety of backdrops including language acquisition and development, bilingualism, differences in culture, and the emergence and use of new technologies such as reading on tablets or learning to read with an app.

“The majority of our current computer language models are trained from sources representing skilled language use that is comparable to a fully-fluent adult,” explains Dr. Fyshe, who is an Assistant Professor in the Faculty of Science at the University of Alberta. “But how should we train models to represent a person who is not fluent in a language, or who is acquiring their first language? My role in this grant is to create computer models to explore how language is learned in early life and in certain instances where reading ability is hindered.”

The research has implications for the study of the brain and also for possible treatments in situations where language acquisition is not proceeding normally. Additionally, by improving understanding of how people become skilled readers, the work may also inspire new methods for training machine learning algorithms for language tasks (the project has collaborators from software companies who make language learning apps) and for training machine learning models in more general settings.

Dr. Fyshe and her team will work to create two new brain imaging datasets that will allow researchers to study representations of single-word meaning in infants and to contrast brain activity during reading between typical comprehenders and poor comprehenders. In this case, poor comprehenders are children who can read and understand single words but have trouble putting words together to understand sentences, paragraphs and beyond.

Researchers will then examine these brain imaging datasets and compare them to current computer models of language meaning to better understand the processes of language acquisition and development.

The funding, which is a Social Sciences and Humanities Research Council (SSHRC) of Canada Partnership Grant, comes as part of the Government of Canada’s recent investment of $75 million in social sciences and humanities research, which has been awarded to more than 1,600 researchers from over 60 universities across Canada.

“I’m absolutely over the moon to be joining this stellar team of researchers,” says Dr. Fyshe. “We’ve brought together a truly cross-disciplinary team who are all so passionate about language learning and literacy. It was a truly great experience just writing the grant, and now I can’t wait to get started on this important work.”

A full list of awardees can be found on the SSHRC website.

###### Learn how Amii advances world-leading artificial intelligence and machine learning research: visit our Research page.

Research

## AI Seminar Series 2020: Scott Niekum on Scaling Probabilistically Safe Learning to Robotics

The Artificial Intelligence (AI) Seminar is a weekly meeting at the University of Alberta where researchers interested in AI can share their research. Presenters include both local speakers from the University of Alberta and visitors from other institutions. Topics related in any way to artificial intelligence, from foundational theoretical work to innovative applications of AI techniques to new fields and problems, are explored.

On April 24, 2020, Scott Niekum, Assistant Professor and Director of the Personal Autonomous Robotics Lab (PeARL) in the Department of Computer Science at UT Austin, presented “Scaling Probabilistically Safe Learning to Robotics”.

Niekum’s talk touches on developments that could allow robots to learn through imitation, with an emphasis on ensuring that they act “safely”, which he describes as meeting or exceeding a measure of performance with a probabilistic guarantee of correctness.

Watch his full presentation below:

Research

## AI Seminar Series 2020: Lili Mou on Search-Based Unsupervised Text Generation

The Artificial Intelligence (AI) Seminar is a weekly meeting at the University of Alberta where researchers interested in AI can share their research. Presenters include local speakers from the University of Alberta and visitors from other institutions. Topics related in any way to artificial intelligence, from foundational theoretical work to innovative applications of AI techniques to new fields and problems, are explored.

On April 17, 2020, Dr. Lili Mou — Amii Fellow, Assistant Professor holding the AltaML Professorship in Natural Language Processing at the University of Alberta, and Canada CIFAR AI Chair — presented “Search-Based Unsupervised Text Generation”. His talk features three papers he co-authored which have been accepted at the Annual Conference of the Association for Computational Linguistics (ACL 2020), two of which are currently available here: “Unsupervised Paraphrasing by Simulated Annealing” and “Discrete Optimization for Unsupervised Sentence Summarization with Word-Level Extraction”.

Weaving these papers together, Dr. Mou presents his work on different applications of search-based unsupervised text generation, including paraphrasing, summarization and text simplification. In short: he aims to build text generation agents without training data.

Watch his full presentation below:

Research

## Postdoc Opportunity: Intelligent Robot Learning (IRL) Lab

The Intelligent Robot Learning (IRL) Lab, affiliated with the University of Alberta and the Alberta Machine Intelligence Institute (Amii), has an immediate opening for a postdoctoral fellow to work with Matthew E. Taylor and multiple graduate students.

The goal of this multi-year project is to improve the way humans and reinforcement learning agents interact. In many cases, a human teacher may be able to provide additional guidance to the student agent. The goal of this feedback is typically to improve the agent’s learning (relative to learning without human guidance) while not harming the agent’s eventual performance (ensuring sub-optimal guidance does not hinder asymptotic performance). In particular, the lab considers questions such as:

- When and where is one type of human guidance more useful than another (e.g., a demonstration vs. providing positive and negative feedback)?
- What types of guidance would a human prefer to provide, and how does this interact with their perceived cognitive load?
- When should a computational agent ask for advice?
- Which of these methods, if any, are appropriate when the student is a human and the teacher is an agent?

The ideal candidate should have a PhD with demonstrated research experience in reinforcement learning, human-in-the-loop machine learning, explainability in machine learning, or human-robot interaction, with a strong record of publications. The candidate’s abilities in deep learning, Python, human subject studies, and communication will also be considered. The initial appointment will be for one year, is renewable subject to availability of funding and satisfactory performance, and offers a competitive salary.

The Computing Science department at the University of Alberta is an excellent and supportive environment of over 60 professors, 300 graduate students, and 1,300 undergraduate students. Matt’s research is situated inside the larger Artificial Intelligence group, with over 25 faculty members. Many are exploring issues in reinforcement learning, and others are interested in reinforcement learning interacting with humans.

In addition, Matt is an Amii Fellow; Amii enriches the training environment by supporting dozens of students working in related areas, bringing high-profile visitors to the department, and supporting regular seminars for researchers to present their work.

The University of Alberta and the IRL Lab strive to provide equitable access to opportunities for postdoctoral fellows across disciplines of study and scholarly opportunities, while understanding this may require applying equitable effort; to study and disseminate knowledge about equity, diversity, and inclusivity; and to provide an environment attentive to, and that addresses, barriers to inclusion, access, and success (especially of historically excluded groups). Applicants from traditionally underrepresented groups are particularly encouraged to apply to this position. This position will provide the same family leave as is guaranteed to PhD students at the University.

To find out more or to apply, please send an email to Matt at mtaylor3+PDF2020@ualberta.ca with the subject “PDF2020”. If applying, please include a current CV and contact information for at least two references (cover letter optional). View the full job posting here.

Research

## Tracking Mental Health During the Coronavirus Pandemic

Alona Fyshe (Amii Fellow and Canada CIFAR AI Chair) has teamed up with Rumi Chunara (New York University), and Daniel Lizotte and Brent Davis (Western University) to leverage machine learning and social media to better understand the drivers of mental health during a pandemic. The team is working to develop AI techniques for social media data to understand emerging challenges that affect people during the pandemic and how these challenges impact mental health.

The project was approved under the CIFAR AI Catalyst Grants Program, which is intended to address the COVID-19 pandemic and catalyze new research areas and collaborations in machine learning, providing funding for innovative, high-risk, high-reward ideas and projects.

Over the first months of the pandemic, people have turned to social media in large numbers (Twitter recently reported that active users are up 23%). This expanding social media discourse provides a view into the experiences of people that is not available by other means, especially for marginalized groups, including people with limited access to health care, those with limited socioeconomic means, and undocumented immigrants. Researchers will use this wealth of social media data to understand the acute and long-term drivers of mental health, as well as mitigating factors.

“AI has a key role to play in supporting and enabling the important work of public health experts,” says Dr. Fyshe, who is a professor at the University of Alberta, cross-appointed in the Departments of Computing Science and Psychology. “It’s crucial that we create integrated, multidisciplinary teams as we work to meet challenges presented by the COVID-19 pandemic. AI researchers have a lot to offer, but we should take our cues from epidemiologists and other healthcare specialists to ensure our solutions achieve maximum impact.”

The multidisciplinary project will leverage Dr. Fyshe’s expertise in natural language processing. She is joined by Dr. Chunara, with expertise in machine learning, public health and social media analytics, and by Dr. Lizotte and trainee Brent Davis, who developed a social media analytics framework for public health practitioners.

Using data from Twitter and Reddit, the research team will extract time-varying latent linguistic factors – meaning changes in the topics people discuss over time – that track with either the rise of the pandemic or changes in established measures of mental health from text (Linguistic Inquiry and Word Count, for example). These factors will be associated with user groups that are defined by geotags, hashtags, or subreddit activity.

Interpretability of results will be paramount to support public health practitioners and policymakers as they develop and implement tailored mental health supports for different geographic and social groups. The team will create an online visual analytics system freely available to practitioners that will help them adapt to the changing experiences of the populations they serve. They will also maintain a blog to rapidly disseminate results and retrospectives. The team will work with public health partners throughout the year-long project to ensure the resulting system aligns with the needs of practitioners and the public.

###### This research is based on work supported by the CIFAR AI and COVID Catalyst Grants. Learn more about Amii’s other projects under the CIFAR AI and Catalyst Grants at amii.ca/cifar-catalyst-grants-coronavirus/

Research

## CIFAR AI and COVID Catalyst Grants: Managing the Pandemic with AI

Five Amii Fellows were awarded research grants as part of CIFAR’s AI and COVID Catalyst Grants initiative. The CIFAR AI Catalyst Grants Program is intended to catalyze new research areas and collaborations in machine learning, providing funding for innovative, high-risk, high-reward ideas and projects.

Amii researchers will collaborate across four projects, each in a distinct area of pandemic management: enabling drug discovery, creating a virtual data lab, detecting and monitoring illness, and tracking mental health. Learn more about the projects below. You can also find out more about how Amii is lending our expertise in the global fight against COVID-19 at amii.ca/covid-19

###### Tracking Mental Health During the Coronavirus Pandemic

Alona Fyshe, Amii Fellow and Canada CIFAR AI Chair, has joined with a team of researchers from Western University and NYU in order to leverage machine learning and social media to better understand the drivers of mental health, both acute and long-term, as well as mitigating factors.

With a recent uptick in social media use, the expanding discourse provides a view into the experiences of people that is not available by other means, especially for marginalized groups (e.g. people with limited access to health care, lower socioeconomic status, and undocumented immigrants). With findings disseminated through an online visual analytics and rapid reporting system, the team will develop AI techniques for social media data to understand: 1) emerging challenges that affect people during the pandemic; and 2) how these challenges impact mental health.

Collaborators include Alona Fyshe (Canada CIFAR AI Chair, CIFAR Learning in Machines & Brains program, Amii, University of Alberta), Daniel Lizotte (Western University), Rumi Chunara and Brent Davis.

Learn more: amii.ca/mental-health-coronavirus

###### Accelerating Small Molecule Drug Discovery

Amii Fellow Matthew E. Taylor joins a collaborative team of researchers from Mila and 99andBeyond (a company that develops technological platforms for drug discovery) to accelerate the discovery of safe and effective small-molecule treatments against COVID-19 and to mitigate other future outbreaks. The team will apply recent advancements in machine learning to repurpose small molecules with proven safety (phase I trials) and identify novel candidates as anti-COVID-19 therapeutics. The project seeks to advance the field of reinforcement learning in small-molecule drug discovery, retain and monetize made-in-Canada IP, and most crucially, openly publish anti-COVID-19 candidates associated with promising in-vitro data for the benefit of the research community.

Collaborators include Sarath Chandar (Canada CIFAR AI Chair, Mila, Polytechnique Montréal), Matthew E. Taylor (Amii, University of Alberta), Sai Krishna (99andBeyond) and Karam Thomas (99andBeyond).

###### Guarding At-Risk Demographics with AI (GuARD-AI)

GuARD-AI brings together Amii Fellows Randy Goebel and Martha White (also a Canada CIFAR AI Chair) with health informatics researchers, health information leaders and data ethics researchers to prototype a virtual data laboratory. The project aims to identify at-risk populations, predict disease course at an individual level, predict disease spread across an entire health system, and analyze insights gained to refine future use of virtual healthcare delivery models in crisis scenarios.

The project is working to develop best practices for health analytics in situations where time is of the essence and action-based decisions can be supported by extracting value from highly dynamic and time-sensitive data. GuARD-AI will help with immediate pandemic-related challenges facing Alberta and Canada and contribute to future adaptable systems for reacting to time-sensitive outbreaks.

Collaborators include Daniel C. Baumgart (University of Alberta), Geoffrey Rockwell (University of Alberta), Martha White (Canada CIFAR AI Chair, University of Alberta, Amii), Randy Goebel (Amii, University of Alberta), Robert Hayward (Chief Medical Information Officer, Alberta Health Services), Shy Amlani (Virtual Health), Jonathan Choy (Virtual Health), Sara Webster (Virtual Health), and Sarah Hall (Virtual Health).

###### Detecting and Monitoring Pneumonia in COVID-19 Patients

Severe illness and death in COVID-19 patients is most often due to progression of the disease to an interstitial pneumonia resembling acute respiratory distress syndrome (ARDS), which requires hospitalization. Amii Fellow Russ Greiner joined with MEDO.ai, a machine learning diagnostics company, and health experts from New York state to produce a diagnostic tool that applies machine learning to ultrasound scans to automatically determine which patients have pneumonia.

Ultrasound, rather than computerized tomography (CT), is proposed due to the portability of the technology (reducing transmission risk) and because it does not carry the same risk of exposure to radiation. Detecting changes to a patient’s condition before the patient requires emergent intervention may allow for the provision of supportive care earlier and more effectively. The final system, which researchers anticipate will outperform the average human reader (and potentially even exceed the performance of experts), will be integrated into ultrasound scanners to produce a tool that can be used effectively by a healthcare worker – even one with limited training.

Collaborators include Kumaradevan Punithakumar (University of Alberta), Russell Greiner (University of Alberta, Amii), Jacob Jaremko (University of Alberta), Nathaniel Meuser-Herr (Upstate Health Care Center, NY), and Dornoosh Zonoobi (MEDO.ai).

###### This research is based on work supported by the CIFAR AI and COVID Catalyst Grants. Amii is lending our expertise to high-impact initiatives in the global fight against COVID-19 – learn more at amii.ca/covid-19

Research

## Amii at AAMAS 2020: Accepted Papers

Amii is proud to feature the work of our researchers at the 19th International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS). Amii supports cutting-edge research by leveraging scientific advancement into industry adoption, enabling our world-leading researchers to focus on solving tough problems while our teams translate knowledge, talent and technology – creating an integrated system that allows both research and industry to thrive.

Such cutting-edge research is currently being featured at AAMAS, running online this year from May 9 to 13. AAMAS is a globally renowned scientific conference for research in autonomous agents and multi-agent systems. Agents, entities that can interact with their environment or other agents, are an increasingly important field of artificial intelligence.

“Agents can learn, reason about others, adopt norms, and interact with humans in both virtual and physical settings,” explains Matthew E. Taylor, Amii Fellow at the University of Alberta, in a recent blog post. “This field includes contributions to many areas across artificial intelligence, including game theory, machine learning, robotics, human-agent interaction, modeling, and social choice.”

Accepted papers from Amii researchers cover a range of topics including: the interaction of online neural network training and interference in reinforcement learning; the introduction of deep anticipatory networks, which enable an agent to take actions to reduce its uncertainty without performing explicit belief inference; and multi-agent deep reinforcement learning.

Learn more below about how Amii Fellows and researchers – professors and graduate students at the University of Alberta – are contributing to this year’s proceedings:

Solving Zero-Sum Imperfect Information Games Using Alternative Link Functions: An Analysis of $f$-Regression Counterfactual Regret Minimization

Dustin Morrill and Ryan D’Orazio (Amii researchers), James Wright and Michael Bowling (Amii Fellows)

Abstract: Function approximation is a powerful approach for structuring large decision problems that has facilitated great achievements in the areas of reinforcement learning and game playing. Regression counterfactual regret minimization (RCFR) is a flexible and simple algorithm for approximately solving imperfect information games with policies parameterized by a normalized rectified linear unit (ReLU). In contrast, the more conventional softmax parameterization is standard in the field of reinforcement learning and has a regret bound with a better dependence on the number of actions in the tabular case. We derive approximation error-aware regret bounds for $(\Phi, f)$-regret matching, which applies to a general class of link functions and regret objectives.
These bounds recover a tighter bound for RCFR and provide a theoretical justification for RCFR implementations with alternative policy parameterizations ($f$-RCFR), including softmax. We provide exploitability bounds for $f$-RCFR with the polynomial and exponential link functions in zero-sum imperfect information games, and examine empirically how the link function interacts with the severity of the approximation to determine exploitability performance in practice. Although a ReLU parameterized policy is typically the best choice, a softmax parameterization can perform as well or better in settings that require aggressive approximation.

Multi Type Mean Field Reinforcement Learning

Sriram Ganapathi Subramanian, Pascal Poupart, Matthew E. Taylor and Nidhi Hegde (Amii Fellows)

Abstract: Mean field theory provides an effective way of scaling multiagent reinforcement learning algorithms to environments with many agents that can be abstracted by a virtual mean agent. In this paper, we extend mean field multiagent algorithms to multiple types. The types enable the relaxation of a core assumption in mean field games, which is that all agents in the environment are playing almost similar strategies and have the same goal. We conduct experiments on three different testbeds for the field of many agent reinforcement learning, based on the standard MAgent framework. We consider two different kinds of mean field games: a) Games where agents belong to predefined types that are known a priori and b) Games where the type of each agent is unknown and therefore must be learned based on observations. We introduce new algorithms for each type of game and demonstrate their superior performance over state-of-the-art algorithms that assume that all agents belong to the same type and other baseline algorithms in the MAgent framework.

Maximizing Information Gain via Prediction Rewards

Yash Satsangi, Sungsu Lim (Amii researcher), Shimon Whiteson, Frans Oliehoek, Martha White (Amii Fellow)

Abstract: Information gathering in a partially observable environment can be formulated as a reinforcement learning (RL) problem where the reward depends on the agent’s uncertainty. For example, the reward can be the negative entropy of the agent’s belief over an unknown (or hidden) variable. Typically, the rewards of an RL agent are defined as a function of the state-action pairs and not as a function of the belief of the agent; this hinders the direct application of deep RL methods for such tasks. This paper tackles the challenge of using belief-based rewards for a deep RL agent, by offering a simple insight that maximizing any convex function of the belief of the agent can be approximated by instead maximizing a prediction reward: a reward based on prediction accuracy. In particular, we derive the exact error between negative entropy and the expected prediction reward. This insight provides theoretical motivation for several fields using prediction rewards—namely visual attention, question answering systems, and intrinsic motivation—and highlights their connection to the usually distinct fields of active perception, active sensing, and sensor placement. Based on this insight we present deep anticipatory networks (DANs), which enable an agent to take actions to reduce its uncertainty without performing explicit belief inference. We present two applications of DANs: building a sensor selection system for tracking people in a shopping mall and learning discrete models of attention on fashion MNIST and MNIST digit classification.
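The abstract's central insight can be illustrated with a toy sketch (an illustration only, not the paper's code): both the negative entropy of a belief and a simple prediction reward, here the expected 0/1 accuracy of guessing the most likely hidden value, increase as the belief becomes more certain, which is what lets prediction rewards stand in for belief-based convex rewards.

```python
import numpy as np

def neg_entropy(belief):
    # Negative entropy of a discrete belief: a convex, belief-based reward.
    b = np.asarray(belief, dtype=float)
    return float(np.sum(b * np.log(b + 1e-12)))

def expected_prediction_reward(belief):
    # Predict the argmax of the belief; expected 0/1 accuracy is the max probability.
    return float(np.max(belief))

uncertain = [0.25, 0.25, 0.25, 0.25]
confident = [0.97, 0.01, 0.01, 0.01]

# Both objectives prefer the more certain belief.
assert neg_entropy(confident) > neg_entropy(uncertain)
assert expected_prediction_reward(confident) > expected_prediction_reward(uncertain)
```

The paper quantifies exactly how far apart these two objectives can be; this sketch only shows that they move in the same direction.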

Improving Performance in Reinforcement Learning by Breaking Generalization in Neural Networks

Sina Ghiassian and Banafsheh Rafiee (Amii researchers), Yat Long Lo (visitor), Adam White (Amii Fellow)

Abstract: Reinforcement learning systems require good representations to work well. For decades practical success in reinforcement learning was limited to small domains. Deep reinforcement learning systems, on the other hand, are scalable, not dependent on domain specific prior knowledge and have been successfully used to play Atari, in 3D navigation from pixels, and to control high degree of freedom robots. Unfortunately, the performance of deep reinforcement learning systems is sensitive to hyper-parameter settings and architecture choices. Even well tuned systems exhibit significant instability both within a trial and across experiment replications. In practice, significant expertise and trial and error are usually required to achieve good performance. One potential source of the problem is known as catastrophic interference: when later training decreases performance by overriding previous learning. Interestingly, the powerful generalization that makes Neural Networks (NN) so effective in batch supervised learning might explain the challenges when applying them in reinforcement learning tasks. In this paper, we explore how online NN training and interference interact in reinforcement learning. We find that simply re-mapping the input observations to a high-dimensional space improves learning speed and parameter sensitivity. We also show this preprocessing reduces interference in prediction tasks. More practically, we provide a simple approach to NN training that is easy to implement, and requires little additional computation. We demonstrate that our approach improves performance in both prediction and control with an extensive batch of experiments in classic control domains.
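One common way to realize the re-mapping the abstract describes is a sparse binned encoding of the observation; the sketch below is a generic illustration of that idea, not the authors' exact mapping. Each input dimension is discretized and encoded one-hot, so nearby inputs share fewer active features, which limits the generalization (and hence interference) between them.

```python
import numpy as np

def expand(obs, low, high, bins=16):
    # Map a low-dimensional observation to a sparse high-dimensional binary vector:
    # each dimension activates exactly one of its `bins` features.
    obs = np.asarray(obs, dtype=float)
    idx = np.clip(((obs - low) / (high - low) * bins).astype(int), 0, bins - 1)
    out = np.zeros(len(obs) * bins)
    out[np.arange(len(obs)) * bins + idx] = 1.0
    return out

# A 2-D observation becomes a sparse 32-D feature vector with two active entries.
x = expand([0.1, -0.4], low=-1.0, high=1.0, bins=16)
assert x.shape == (32,) and x.sum() == 2.0
```

A network trained on such features updates fewer shared weights per observation, which is the mechanism behind the reduced interference the paper reports.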

One Extended Abstract co-authored by an Amii Fellow has also been accepted for publication on the JAAMAS Track:

A Very Condensed Survey and Critique of Multiagent Deep Reinforcement Learning

Pablo Hernandez-Leal, Bilal Kartal, Matthew E. Taylor (Amii Fellow)

Research

## Amii Researchers dig deep at renowned international conference ICLR 2020

Amii supports cutting-edge research by leveraging scientific advancement into industry adoption, enabling our world-leading researchers to focus on solving tough problems while our teams translate knowledge, talent and technology – creating an integrated system that allows both research and industry to thrive.

Such cutting-edge research is currently being featured at the Eighth International Conference on Learning Representations (ICLR), running online this year from April 26 to May 1. ICLR is the premier gathering of professionals dedicated to advancing the branch of AI called representation learning, also referred to as deep learning. The conference is globally renowned for presenting and publishing cutting-edge research on all aspects of deep learning used in the fields of AI, statistics and data science, as well as important application areas such as machine vision, computational biology, speech recognition, text understanding, gaming, and robotics.

Accepted papers from Amii researchers cover a range of topics including the reduction of overestimation bias in Q-learning, training RNNs more effectively by reformulating the training objective, and the reduction of selection bias when estimating treatment effects from observational data.

Learn more below about how Amii Fellows and researchers – professors and students at the University of Alberta – are contributing to this year’s proceedings.

###### Several papers co-authored by Amii Fellows and students have been accepted for publication by ICLR in 2020:

Maxmin Q-learning: Controlling the Estimation Bias of Q-learning

Qingfeng Lan and Yangchen Pan (Amii students), Alona Fyshe and Martha White (Amii Fellows)

Abstract: Q-learning suffers from overestimation bias, because it approximates the maximum action value using the maximum estimated action value. Algorithms have been proposed to reduce overestimation bias, but we lack an understanding of how bias interacts with performance, and the extent to which existing algorithms mitigate bias. In this paper, we 1) highlight that the effect of overestimation bias on learning efficiency is environment-dependent; 2) propose a generalization of Q-learning, called \emph{Maxmin Q-learning}, which provides a parameter to flexibly control bias; 3) show theoretically that there exists a parameter choice for Maxmin Q-learning that leads to unbiased estimation with a lower approximation variance than Q-learning; and 4) prove the convergence of our algorithm in the tabular case, as well as convergence of several previous Q-learning variants, using a novel Generalized Q-learning framework. We empirically verify that our algorithm better controls estimation bias in toy environments, and that it achieves superior performance on several benchmark problems.
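A minimal sketch of the Maxmin idea as described in the abstract (illustrative, not the authors' code): maintain several independent action-value estimates and evaluate with the minimum across estimates of each action's value. With one estimate this reduces to Q-learning's max; more estimates push the bias from over- toward under-estimation, which is the control parameter the abstract refers to.

```python
import numpy as np

# Bandit-style illustration: every action is worth exactly 0, but noisy rewards
# make individual estimates fluctuate, so a plain max is biased upward.
rng = np.random.default_rng(0)
n_estimates, n_actions = 4, 10
Q = np.zeros((n_estimates, n_actions))

for _ in range(10000):
    a = rng.integers(n_actions)
    r = rng.normal(0.0, 1.0)          # zero-mean noisy reward
    i = rng.integers(n_estimates)     # update one randomly chosen estimate
    Q[i, a] += 0.1 * (r - Q[i, a])

max_single = Q[0].max()               # Q-learning-style max over one estimate
maxmin = Q.min(axis=0).max()          # Maxmin: max over the elementwise minimum
```

Since the elementwise minimum never exceeds any single estimate, the Maxmin value is never larger than the single-estimate max, and the number of estimates tunes how aggressively overestimation is suppressed.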

Learning Disentangled Representations for CounterFactual Regression

Negar Hassanpour (Amii student), Russell Greiner (Amii Fellow)

Abstract: We consider the challenge of estimating treatment effects from observational data; and point out that, in general, only some factors based on the observed covariates X contribute to selection of the treatment T, and only some to determining the outcomes Y. We model this by considering three underlying sources of {X, T, Y} and show that explicitly modeling these sources offers great insight to guide designing models that better handle selection bias. This paper is an attempt to conceptualize this line of thought and provide a path to explore it further.

In this work, we propose an algorithm to (1) identify disentangled representations of the above-mentioned underlying factors from any given observational dataset D and (2) leverage this knowledge to reduce, as well as account for, the negative impact of selection bias on estimating the treatment effects from D. Our empirical results show that the proposed method achieves state-of-the-art performance in both individual and population based evaluation measures.

Progressive Memory Banks for Incremental Domain Adaptation

Nabiha Asghar, Lili Mou (Amii Fellow), Kira A. Selby, Kevin D. Pantasdo, Pascal Poupart, Xin Jiang

Abstract: This paper addresses the problem of incremental domain adaptation (IDA) in natural language processing (NLP). We assume each domain comes one after another, and that we could only access data in the current domain. The goal of IDA is to build a unified model performing well on all the domains that we have encountered. We adopt the recurrent neural network (RNN) widely used in NLP, but augment it with a directly parameterized memory bank, which is retrieved by an attention mechanism at each step of RNN transition. The memory bank provides a natural way of performing IDA: when adapting our model to a new domain, we progressively add new slots to the memory bank, which increases the number of parameters, and thus the model capacity. We learn the new memory slots and fine-tune existing parameters by back-propagation. Experimental results show that our approach achieves significantly better performance than fine-tuning alone. Compared with expanding hidden states, our approach is more robust for old domains, shown by both empirical and theoretical results. Our model also outperforms previous work on IDA including elastic weight consolidation and progressive neural networks in the experiments.
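The memory-bank mechanism can be sketched in a few lines of numpy. This is an illustrative toy (dot-product attention over slot vectors, with fresh slots appended for a new domain), not the paper's exact architecture; all sizes and names here are invented:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

d = 16                                  # hidden / memory dimension
rng = np.random.default_rng(0)

# Memory bank: one row per slot. The RNN's hidden state acts as a
# query; an attention read returns a convex combination of slots.
memory = rng.normal(size=(10, d))       # 10 slots for domain 1

def read(memory, query):
    scores = memory @ query             # dot-product attention
    return softmax(scores) @ memory     # weighted sum of slots

h = rng.normal(size=d)                  # a hidden state (the query)
context = read(memory, h)

# Incremental domain adaptation: on a new domain, append fresh
# slots (new parameters, hence new capacity) while the old slots
# stay in place and can still be fine-tuned.
new_slots = rng.normal(size=(5, d))
memory2 = np.vstack([memory, new_slots])
context2 = read(memory2, h)
print(context.shape, context2.shape)
```

Because old slots are kept rather than overwritten, the read for old-domain inputs degrades gracefully, which is the robustness the abstract contrasts with expanding hidden states.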

Training Recurrent Neural Networks Online by Learning Explicit State Variables

Somjit Nath (Amii alum), Vincent Liu, Alan Chan, Xin Li (Amii students), Adam White and Martha White (Amii Fellows)

Abstract: Recurrent neural networks (RNNs) allow an agent to construct a state-representation from a stream of experience, which is essential in partially observable problems. However, there are two primary issues one must overcome when training an RNN: the sensitivity of the learning algorithm’s performance to truncation length and long training times. There are a variety of strategies to improve training in RNNs, most notably Backprop Through Time (BPTT) and Real-Time Recurrent Learning. These strategies, however, are typically computationally expensive and focus computation on computing gradients back in time. In this work, we reformulate the RNN training objective to explicitly learn state vectors; this breaks the dependence across time and so avoids the need to estimate gradients far back in time. We show that for a fixed buffer of data, our algorithm—called Fixed Point Propagation (FPP)—is sound: it converges to a stationary point of the new objective. We investigate the empirical performance of our online FPP algorithm, particularly in terms of computation compared to truncated BPTT with varying truncation levels.
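The reformulation can be illustrated with a toy sketch: treat every hidden state as an explicit free variable and minimize the fixed-point residual between consecutive states, so each gradient term only touches adjacent time steps. The numpy example below fixes the cell weights and updates only the states with hand-derived gradients; it illustrates the general objective, not the authors' FPP implementation (a real version would use autodiff and also update the weights):

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 20, 4
x = rng.normal(size=(T, d))            # input stream

# A simple tanh "RNN" cell with fixed weights, for illustration.
W = rng.normal(size=(d, d)) * 0.3
U = rng.normal(size=(d, d)) * 0.3
def f(s, xt):
    return np.tanh(s @ W + xt @ U)

# Explicit state variables: instead of backpropagating through
# time, penalize the residual ||s_{t+1} - f(s_t, x_t)||^2. Each
# term couples only neighbouring states, so no gradient flows far
# back in time.
s = rng.normal(size=(T, d)) * 0.1

def objective(s):
    resid = s[1:] - f(s[:-1], x[:-1])
    return (resid ** 2).sum()

lr = 0.05
loss0 = objective(s)
for _ in range(200):
    resid = s[1:] - f(s[:-1], x[:-1])
    pre = s[:-1] @ W + x[:-1] @ U
    g = np.zeros_like(s)
    g[1:] += 2 * resid                                     # d/d s_{t+1}
    g[:-1] += -2 * (resid * (1 - np.tanh(pre) ** 2)) @ W.T  # d/d s_t
    s = s - lr * g
print(loss0, objective(s))
```

The loss drops toward zero without any gradient ever being propagated across more than one time step, which is the computational advantage over truncated BPTT.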

Frequency-based Search-control in Dyna

Yangchen Pan, Jincheng Mei (Amii students) and Amir-massoud Farahmand (Amii alum)

Abstract: Model-based reinforcement learning has been empirically demonstrated as a successful strategy to improve sample efficiency. In particular, Dyna is an elegant model-based architecture integrating learning and planning that provides great flexibility in how a model is used. One of the most important components in Dyna is called search-control, which refers to the process of generating state or state-action pairs from which we query the model to acquire simulated experiences. Search-control is critical in improving learning efficiency. In this work, we propose a simple and novel search-control strategy by searching high frequency regions of the value function. Our main intuition is built on the Shannon sampling theorem from signal processing, which indicates that a high frequency signal requires more samples to reconstruct. We empirically show that a high frequency function is more difficult to approximate. This suggests a search-control strategy: we should use states from high frequency regions of the value function to query the model to acquire more samples. We develop a simple strategy to locally measure the frequency of a function by gradient and Hessian norms, and provide theoretical justification for this approach. We then apply our strategy to search-control in Dyna, and conduct experiments to show its properties and effectiveness on benchmark domains.
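The intuition behind the gradient/Hessian frequency proxy is easy to reproduce numerically. In this toy numpy example (illustrative only, not the paper's method), a function with a low-frequency half and a high-frequency half shows larger derivative magnitudes exactly in the high-frequency region:

```python
import numpy as np

# A 1-D "value function": low frequency on [0, 1), high on [1, 2].
x = np.linspace(0.0, 2.0, 4001)
f = np.where(x < 1.0, np.sin(2 * np.pi * x), np.sin(10 * np.pi * x))

# Local frequency proxies: first- and second-derivative magnitudes.
# For sin(kx), |f'| scales with k and |f''| with k^2, so both norms
# grow with the local frequency.
g = np.gradient(f, x)          # numerical first derivative
h = np.gradient(g, x)          # numerical second derivative

low, high = x < 1.0, x >= 1.0
print("mean |f'|  low vs high:", np.abs(g)[low].mean(), np.abs(g)[high].mean())
print("mean |f''| low vs high:", np.abs(h)[low].mean(), np.abs(h)[high].mean())
```

Under the paper's strategy, states like those in the right half (large gradient and Hessian norms) would be preferred when querying the model for simulated experience.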

###### In addition, Amii is also organizing three socials throughout the conference:

Amii Chief Scientific Advisor Dr. Richard Sutton will host a session on what he calls The Bitter Lesson of AI research, that “general methods that leverage computation are ultimately the most effective, and by a large margin” and “[t]he eventual success is tinged with bitterness, and often incompletely digested, because it is success over a favored, human-centric approach.”

The RL Mixer brings together researchers interested in reinforcement learning for a sequence of randomly formed small group discussions. Participants will get opportunities to discuss a wide variety of topics with new people through Zoom breakout rooms, with 30 minutes per group discussion.

The Amii Fellows Meet & Greet is a chance to meet and engage with Amii Fellows in conversations relevant to their research areas and experience.

Research

## RL Theory Seminars: bringing theory to the forefront

Amii Fellow and Canada CIFAR AI Chair Csaba Szepesvári has teamed up with Gergely Neu and Ciara Pike-Burke, researchers at the Pompeu Fabra University (Barcelona), to launch the RL Theory Seminar series. This series of online talks is aimed at increasing knowledge-sharing in the rapidly growing field of reinforcement learning.

In recent years, reinforcement learning (RL) has grown to be one of the most active research areas in machine learning. Recent empirical successes have triggered a new wave of theoretical research in RL, and with so many new directions opening, it has become challenging to keep up on new developments. In response to this challenge, Szepesvári and colleagues launched the RL Theory Seminars.

“As someone who is keen on advancing the field, I’m thrilled to see the growing excitement around reinforcement learning,” says Szepesvári. “AI researchers around the world are taking notice of what we in the RL community have known all along – that these theories will be essential if we hope to tackle a large number of practical applications, ranging from problems in artificial intelligence to operations research or control engineering. RL theory provides the foundations we need to think through these problems.”

Amii CEO Cam Linke, also an RL researcher, recently echoed Szepesvári’s sentiments: “Reinforcement learning is the new wave of AI. More and more, we’re seeing companies exploring applications of reinforcement learning for process control and autonomous decision making. We’ve only just begun to scratch the surface of the value that can be created, and business leaders are starting to take notice.”

The popularity of the field is plain to those paying attention. Recently, Amii Chief Scientific Advisor Richard S. Sutton (also an Amii Fellow and Distinguished Research Scientist with DeepMind) released the second edition of Reinforcement Learning: An Introduction, the foundational textbook he co-authored with Andrew G. Barto. Szepesvári has also seen success with Bandit Algorithms, the course and online book he created with Tor Lattimore.

Additionally, last year Amii and the University of Alberta launched the Reinforcement Learning Specialization on Coursera, taught by Martha White and Adam White, who are both Amii Fellows and Canada CIFAR AI Chairs in addition to being lab mates of Sutton and Szepesvári in UAlberta’s RLAI Lab.

The RL Theory Seminars, which are planned to run every Tuesday at 5:00 PM UTC, help researchers, practitioners and enthusiasts keep up with the pace of progress the field has seen over the past years. Each 50-minute talk (plus questions) will focus on the latest advances in theoretical reinforcement learning, taught by some of the world’s leading experts in the field.

At a time when in-person interactions are limited, the seminars provide a platform for researchers to get together and discuss the latest work in RL theory. Organizers aim to provide a balanced view of contemporary RL theory and invite speakers covering a broad range of topics. Seminars begin on April 28 and are currently scheduled into mid-June.

Research

## Recommended Reading: Computers Already Learn From Us. But Can They Teach Themselves?

The world has achieved significant successes by applying supervised learning algorithms to business problems. With these systems often requiring large investments to incorporate human knowledge, researchers and practitioners are increasingly turning their attention to self-learning algorithms.

Richard S. Sutton, Chief Scientific Advisor at Amii, recently took the opportunity to sit down with Craig S. Smith of the New York Times as part of their Artificial Intelligence special report. In the interview, Sutton highlights the key value of reinforcement learning in enabling the creation of AI systems that learn and act autonomously. From the article:

“Reinforcement learning in computer science, pioneered by Richard Sutton, now at the University of Alberta in Canada, is modeled after reward-driven learning in the brain: Think of a rat learning to push a lever to receive a pellet of food. The strategy has been developed to teach computer systems to take actions.

Set a goal, and a reinforcement learning system will work toward that goal through trial and error until it is consistently receiving a reward. Humans do this all the time. ‘Reinforcement is an obvious idea if you study psychology,’ Dr. Sutton said.

A more inclusive term for the future of A.I., he said, is ‘predictive learning,’ meaning systems that not only recognize patterns but also predict outcomes and choose a course of action. ‘Everybody agrees we need predictive learning, but we disagree about how to get there,’ Dr. Sutton said. ‘Some people think we get there with extensions of supervised learning ideas; others think we get there with extensions of reinforcement learning ideas.’”

Cam Linke, CEO of Amii and an AI researcher in his own right, echoes Sutton’s sentiments on the increasing importance of reinforcement learning.

“Reinforcement learning is the new wave of AI,” says Linke, whose own reinforcement learning research focuses on AI adapting behaviours to improve self-learning. “More and more, we’re seeing companies exploring applications of reinforcement learning for process control and autonomous decision making. We’ve only just begun to scratch the surface of the value that can be created, and business leaders are starting to take notice.”

Learn more about reinforcement learning in the Reinforcement Learning Specialization offered by the University of Alberta and Amii on Coursera, developed and delivered by Martha White and Adam White, both former students of Sutton’s as well as Fellows and Canada CIFAR AI Chairs with Amii.

Research

## Meet Ana, the AI companion counteracting elder loneliness

Osmar Zaïane, Amii Fellow at the University of Alberta, is leading a team to improve elder care using machine intelligence.

Canada is facing an aging population. It’s expected that by 2030, 23% of Canadians will be seniors. Additionally, according to Statistics Canada, as many as 1.4 million seniors report feeling lonely. These stats indicate a looming impact on individuals, families and society at large, as feelings of loneliness are associated with higher levels of mental and physical health problems.

Enter the Automated Nursing Agent or Ana, a conversational software agent (i.e. chatbot) designed to converse with elderly individuals living at home. The model, which was explored in a recently published study, aims to ease loneliness in seniors using emotionally intelligent conversations.

“When an elderly person tells you something that’s sad, it’s important to respond with empathy,” said Zaïane in an interview with Folio. “That requires that the device first understand the emotion that is expressed. We can do that by converting the speech to text and looking at the words that are used. In this study, we looked at the next step: having the program express emotions—like surprise, sadness, happiness—in its response.”

Working as both a personal assistant and a digital companion, Ana will build a knowledge base of personalized facts and memories (such as important people, places, activities and prescriptions), carry on engaging conversations that express and respond to emotions, and also answer impersonal questions from sources on the Internet. This will give Ana the ability to not only fulfill social needs, but also assist with simple home healthcare needs such as prescription reminders.

The team is currently working on improving their limited prototype.

This story was featured in the AICan Bulletin. Subscribe to the bi-monthly email publication to keep up to date on AI in Canada.

Research

## Greiner Lab seeking Postdoctoral Fellows

The Greiner Lab, within the Alberta Machine Intelligence Institute (Amii) at the University of Alberta, is seeking strong researchers to hire as Postdoctoral Fellows.

Postdoctoral Fellows will help the Greiner Lab (along with medical researchers and clinicians) address a number of interesting and important medical-informatics tasks (including learning models for managing, screening, diagnosis, and/or prognosis) related to:

using technologies that include:

• Metabolic profiles
• Microarray / NGS / SNP / CNV, etc.
• Clinical data
• Images, scans, etc.

as well as foundational topics such as:

The ideal candidate has:

• A PhD in Computer Science or a closely related field
• A research record related to Machine Learning / Artificial Intelligence (e.g., first-author papers at ICML, NeurIPS, UAI, AAAI, IJCAI)
• Experience working on medical projects (e.g., papers in medical/biological journals)

To apply, please email the following to rgreiner@ualberta.ca and use ‘PDF Medical Informatics 2019’ as the subject of your email:

• A cover letter, specifying which projects most interest you and indicating why you feel that you qualify (optionally, summarize how you would work on each such project)
• Your CV, including a description of any previous research or industrial jobs you have held
• Names and email addresses of at least 2 references

Research

Principal Investigators
Patrick M. Pilarski, Richard S. Sutton

###### Intelligent Artificial Limbs & Biomedical Devices

The Adaptive Prosthetics Program, a collaboration between Amii and the BLINC Lab, is an interdisciplinary initiative focused on real-time machine learning methods for assistive rehabilitation and intelligent artificial limbs. Through the development of fundamental algorithms and the translation of methodology into clinical benefit, the program seeks to increase patients’ ability to customize and control assistive biomedical devices.

The Adaptive Prosthetics Program explores fundamental and applied methods for real-time prediction, adaptive control and direct human-machine interaction. Technologies developed through the program include the Bento Arm and the HANDi Hand, both of which have open-sourced hardware and software through the BLINCdev community.

“Development of the Bento Arm: an Improved Robotic Arm for Myoelectric Training and Research” published at MEC ’14: Myoelectric Controls Symposium

“Development of the HANDi Hand: an Inexpensive, Multi-Articulating, Sensorized Hand for Machine Learning Research in Myoelectric Control” published at MEC ’17: Myoelectric Controls Symposium

Research

Principal Investigator
Michael Bowling

###### Atari 2600 Platform for General Artificial Intelligence Development

The Arcade Learning Environment (ALE) is a software framework designed to facilitate the development and testing of general AI agents. ALE was created as both a challenge problem for AI researchers and a method for developing and evaluating innovations and technologies.

Historically, many AI advancements have been developed and tested through games (e.g. Checkers, Chess, Poker and, most recently, Go), which offer controlled, well-understood environments with easily defined measures for success. Games also give researchers a concrete and relatable way to demonstrate artificial intelligence to a broad audience.

The Arcade Learning Environment, powered by the Stella Atari Emulator, provides an interface to hundreds of Atari 2600 games. These diverse game environments are complex enough to be challenging for an AI agent to solve yet simple enough to enable progress.

ALE is available to researchers and hobbyists alike with Atari now being used by groups like DeepMind to develop and test their deep reinforcement learning methodologies.

“The Arcade Learning Environment: An Evaluation Platform for General Agents” published in the Journal of Artificial Intelligence Research

Research

## Meerkat

Principal Investigators
Randy Goebel, Osmar Zaïane

###### Social Network Analysis & Visualization

Meerkat is an automated Social Network Analysis (SNA) tool used to analyze, visualize and interpret large or complex networks of information, allowing users to examine patterns and investigate relational dynamics.

The application uses information about the interactions between a set of objects (or nodes) within a network and lets the user employ different algorithms to automatically identify meaningful connections or highlight the most influential or central nodes in different ways.

Network analysis features include:

• Automated community detection and analysis
• Interactive visualization using general, community and metric-based layouts
• Filtration and extraction of useful data
• Dynamic analysis of network changes over time

Meerkat also provides tools for text mining, including polarity and emotion analysis, which give users the ability to examine text for positive and negative sentiments and a range of basic emotions. Meerkat ED, a version of the program that has been tailored specifically for educational environments, allows instructors to evaluate student activities in online discussion forums.

Research

## DeepStack

Principal Investigator
Michael Bowling

###### Problem we’re trying to solve

For several years, AI researchers have had a number of different techniques for predicting and planning optimal actions in situations of perfect information (where all actors have the same, full knowledge of the world). Techniques have been lacking for dealing with imperfect-information situations (where some actors lack information that others hold). DeepStack seeks to successfully apply, for the first time, theoretical techniques from perfect-information games to situations with imperfect information.

###### How will this help someone / an industry?

For computing scientists and AI researchers, DeepStack represents a foundational step forward in dealing with issues around predicting optimal actions in the face of ambiguity and uncertainty. The theoretical advancements demonstrated in DeepStack will open new avenues of research for scientists interested in building, and planning with, models of unknown, complex dynamic systems.

###### Type of MI used

Reinforcement learning, Deep learning

Research

## Diagnosing Tuberculosis

Principal Investigator:
Yutaka Yasui

###### Problem we’re trying to solve

Inexpensive, timely and accurate diagnosis of potential cases of tuberculosis is of critical importance in regions of the world where resources are limited. Standard methods of diagnosis are often too expensive or resource-intensive to be deployed in the very regions where tuberculosis is a significant problem. This leads to poor patient outcomes due to delayed treatment and undiagnosed illness.

###### How will this help someone / an industry?

The machine learning component of this work developed a new automated diagnostic method based on image analysis, enabling diagnosis that is more efficient and lower cost than standard methods and, because it is automated, poses a lower biohazard risk to the technicians processing samples.

###### Partners

TB/HIV Research Foundation (Thailand)

Research

## FMRI-based Diagnosis & Treatment

Principal Investigator:
Russ Greiner

###### Problem we’re trying to solve

Current methods of diagnosing neurological and psychological disorders often rely on a clinical psychiatrist’s subjective assessment of a patient’s symptoms. These assessments can differ between psychiatrists, leading to different treatment recommendations. We aim to provide psychiatrists with tools that offer objective criteria for diagnosis and for assessing symptom severity, giving them data-driven methodologies for evaluating patients.

###### How will this help someone / an industry?

Our goal with this project is to use machine learning techniques to produce clinical tools that could assist medical doctors in providing faster, more effective treatment for neurological and psychiatric illnesses. We are exploring the use of brain imaging to diagnose mental disorders earlier and more accurately, to predict symptoms and their severity, and to predict which combination of drug therapies will work best for a given patient.

IBM Research

Research

## Patient-Specific Survival Prediction

Principal Investigator:
Russ Greiner

###### Problem we’re trying to solve

Prognostic modeling is an integral component in the treatment and management of patients. Currently being developed for the field of oncology, PSSP predicts individual survival distributions for patients from their electronic health record, significantly reducing the prediction error compared to the standard approach of using only the cancer site and stage.

###### How will this help someone / an industry?

More accurate survival time prediction can improve medical decision making (for example, by deciding whether a treatment option is cost-effective based on its added survival time, or by helping determine when a patient should be referred for end-of-life care). The tool can be used more generally for any task that involves predicting a life-cycle (customer churn, diagnosing machine faults, etc.).

Research

## Legal Reasoning

Principal Investigator:
Randy Goebel

###### Problem we’re trying to solve

A great deal of human effort goes into preparing for (or rendering a verdict in) a legal proceeding. The tools we are developing for information extraction and visualization in the legal domain extract legal concepts from text, identify chains of legal reasoning and answer questions by using textual entailment. Developing automated tools for information extraction and knowledge discovery will reduce the amount of effort and time needed for legal matters.

###### How will this help someone / an industry?

This project seeks to develop techniques for legal case reasoning, legal summarization and legal question answering in order to allow law practitioners to redirect their attention to tasks that require creativity or more complex reasoning.

###### Partners

National Institute for Informatics (Japan)

###### Type of MI used

Natural Language Processing, Information Extraction.

Research

## Ana – Automated Nursing Agent

Principal Investigator:
Osmar Zaïane

###### Problem we’re trying to solve

Ana, a conversational software agent (i.e. chatbot), is designed to converse with the elderly living at home to answer general questions and remind them of specific events. Ana is able to extract from conversations named entities (i.e. places, people, prescriptions, recipe names, etc.) as well as relationships (i.e. family ties, professions, activities, temporal events, etc.). Ana extracts information from text obtained from a speech-to-text converter; from this, it builds a personalized knowledge base that allows it to answer personal questions. Ana can also answer impersonal questions from sources on the Internet.

###### How will this help someone / an industry?

In addition to developing a speech interface for human-machine interaction, Ana seeks to improve elderly home care by providing a personal assistant and a digital companion. Ana helps with social needs (through questions and answers) and assists with simple home healthcare needs (i.e. prescription reminders).

###### Type of MI used

Information Extraction, Natural Language Processing.

Research

Principal Investigator:
Michael Bowling

###### Problem we’re trying to solve

The Computer Curling Research Group focuses on deep analytics for the sport of curling, spanning player and system analysis, and on creating tools that translate AI-discovered insights into improvements in human decision making.

###### How will this help someone / an industry?

The ultimate goal of this project is to develop tools and models that enable player/team assessment, strategic game modeling, and analytics for broadcast television.

###### Type of MI used

Deep learning, Search and Planning.

Research

## Intelligent Diabetes Management

Principal Investigator:
Russ Greiner

###### Problem we’re trying to solve

The current method of determining insulin dosages requires a patient to manually track their insulin levels multiple times a day, collect data over a certain period of time, present that data to a diabetologist, and have their dosage adjusted after weeks of using the wrong dose. This approach delays treatment optimization, depends on the patient’s commitment to tracking data, and requires a diabetologist to personally evaluate each case.

###### How will this help someone / an industry?

Machine learning is able to use patient data to adjust insulin levels in real time, making their treatment personalized, more accurate, and more affordable. It also increases the capacity for diabetologists to see more patients and help more people.

###### Partners

Alberta Diabetes Institute; a top-rated diabetologist from Alberta

###### Type of MI used

Reinforcement learning