Arcade Learning Environment

Principal Investigator
Michael Bowling

Atari 2600 Platform for General Artificial Intelligence Development

The Arcade Learning Environment (ALE) is a software framework designed to facilitate the development and testing of general AI agents. ALE was created as both a challenge problem for AI researchers and a method for developing and evaluating innovations and technologies.

Historically, many AI advancements have been developed and tested through games (e.g. checkers, chess, poker and, most recently, Go), which offer controlled, well-understood environments with easily defined measures for success. Games also give researchers a concrete and relatable way to demonstrate artificial intelligence to a broad audience.

The Arcade Learning Environment, powered by the Stella Atari Emulator, provides an interface to hundreds of Atari 2600 games. These diverse game environments are complex enough to be challenging for an AI agent to solve yet simple enough to enable progress.
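
The kind of interaction such an interface supports can be sketched as a simple agent-environment loop. The example below is a minimal illustration only: ToyEnvironment is a hypothetical stand-in, not ALE's actual API, and the method names (reset, legal_actions, act, game_over) and the trivial scoring rule are assumptions made for this sketch.

```python
import random


class ToyEnvironment:
    """Hypothetical stand-in for an emulated game (not ALE's real API)."""

    def __init__(self, max_steps=20):
        self.max_steps = max_steps
        self.steps = 0

    def reset(self):
        # Start a new episode.
        self.steps = 0

    def legal_actions(self):
        # Two made-up actions; a real game would expose joystick inputs.
        return [0, 1]

    def act(self, action):
        # Apply the action and return a reward: action 1 scores one point.
        self.steps += 1
        return 1 if action == 1 else 0

    def game_over(self):
        # The episode ends after a fixed number of steps.
        return self.steps >= self.max_steps


def run_episode(env, policy):
    """Play one episode with the given policy and return the total reward."""
    env.reset()
    total = 0
    while not env.game_over():
        action = policy(env.legal_actions())
        total += env.act(action)
    return total


def random_policy(actions):
    # Baseline agent: pick any legal action uniformly at random.
    return random.choice(actions)


if __name__ == "__main__":
    print(run_episode(ToyEnvironment(), random_policy))
```

Because the agent only sees legal actions, rewards and an end-of-game signal, the same loop could in principle be pointed at many different games, which is the property that makes a shared interface useful for evaluating general agents.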

ALE is available to researchers and hobbyists alike, and its Atari games are now used by groups such as Google DeepMind to develop and test deep reinforcement learning methods.

See Also

“The Arcade Learning Environment: An Evaluation Platform for General Agents” published in the Journal of Artificial Intelligence Research



DeepStack

Principal Investigator
Michael Bowling

Problem we’re trying to solve

For several years, AI researchers have had a number of techniques for predicting and planning optimal actions in situations of perfect information (where all actors have the same, full knowledge of the world). Techniques have been lacking for imperfect information situations (where actors lack certain information, or hold information that others do not). DeepStack seeks to apply, for the first time, theoretical techniques from perfect information games to situations with imperfect information.

How will this help someone / an industry?

For computing scientists and AI researchers, DeepStack represents a foundational step forward in dealing with issues around predicting optimal actions in the face of ambiguity and uncertainty. The theoretical advancements demonstrated in DeepStack will open new avenues of research for scientists interested in building, and planning with, models of unknown, complex dynamic systems.

Type of MI used

Reinforcement learning, Deep learning


Advanced Analytics for Curling

Principal Investigator
Michael Bowling

Problem we’re trying to solve

The Computer Curling Research Group focuses on deep analytics for the sport of curling, covering both player analysis and system analysis, and on creating tools that translate AI-discovered insights into improvements in human decision making.

How will this help someone / an industry?

The ultimate goal of this project is to develop tools and models that enable player/team assessment, strategic game modeling, and analytics for broadcast television.

Type of MI used

Deep learning, Search and Planning.

UAlberta Expertise Brings DeepMind Lab to Edmonton

In a historic move for the AI community, one of the world’s leading AI research companies, DeepMind, will open its first international research base outside the United Kingdom later this month. The lab will be based in Edmonton and have close ties to the University of Alberta, a research-intensive university with an illustrious record of AI research excellence.

The new lab, to be called DeepMind Alberta, demonstrates DeepMind’s commitment to accelerating Alberta’s and Canada’s AI research community. It also signals the strength of ties between the University of Alberta and one of the world’s leading AI companies. Having been acquired by Google in 2014, DeepMind is now part of Alphabet. DeepMind is on a scientific mission to push the boundaries of AI, developing programs that can learn to solve complex problems without being taught how. DeepMind Alberta will open with 10 employees.

The DeepMind Alberta team will be led by UAlberta computing science professors Richard Sutton, Michael Bowling, and Patrick Pilarski. All three, who will remain with the Alberta Machine Intelligence Institute at UAlberta, will also continue teaching and supervising graduate students at the university to further foster the Canadian AI talent pipeline and grow the country’s technology ecosystem. The team will be completed by seven more researchers, many of whom were also authors on the influential DeepStack paper published earlier this year in Science.

UAlberta’s connections to DeepMind run deep: roughly a dozen UAlberta alumni already work at the company, some of whom played important roles in DeepMind’s signature reinforcement learning advances in AlphaGo and Atari. In addition, Sutton, one of the world’s most renowned computing scientists, was DeepMind’s first advisor when the company was just a handful of people.

“I first met with Rich—our first ever advisor—seven years ago when DeepMind was just a handful of people with a big idea. He saw our potential and encouraged us from day one. So when we chose to set up our first international AI research office, the obvious choice was his base in Edmonton, in close collaboration with the University of Alberta, which has become a leader in reinforcement learning research thanks to his pioneering work,” said Demis Hassabis, CEO and co-founder of DeepMind. “I am very excited to be working with Rich, Mike, Patrick and their team, together with UAlberta, and I look forward to us making many more scientific breakthroughs together in the years ahead.”

Sutton is excited about the opportunity to combine the strength of DeepMind’s work in reinforcement learning with UAlberta’s academic excellence, all without having to leave Edmonton.

“DeepMind has taken this reinforcement learning approach right from the very beginning, and the University of Alberta is the world’s academic leader in reinforcement learning, so it’s very natural that we should work together,” said Sutton. “And as a bonus, we get to do it without moving.”

Working with Hassabis and the DeepMind team both in London and Edmonton, Sutton, Bowling, and Pilarski will combine their staggering academic strength in reinforcement learning to focus on basic AI research. Reinforcement learning works much the way humans learn: it tries to replicate good outcomes and avoid bad ones based on learned experience.
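
The idea of repeating what worked and avoiding what did not can be illustrated with a very small example. The sketch below is an assumption-laden illustration, not a method described by DeepMind or UAlberta: it uses a textbook epsilon-greedy learner on a made-up "multi-armed bandit" problem, and the function name run_bandit, the reward probabilities and the parameter values are all hypothetical.

```python
import random


def run_bandit(success_probs, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy learning on a made-up bandit problem (illustrative only).

    success_probs[i] is the hidden chance that action i pays a reward of 1.
    Returns the learned value estimates and how often each action was tried.
    """
    rng = random.Random(seed)
    n = len(success_probs)
    estimates = [0.0] * n  # learned average reward of each action
    counts = [0] * n       # number of times each action was chosen

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: occasionally try an action at random.
            action = rng.randrange(n)
        else:
            # Exploit: repeat the action with the best outcomes so far.
            action = max(range(n), key=lambda a: estimates[a])

        reward = 1.0 if rng.random() < success_probs[action] else 0.0

        # Update the running average of rewards for the chosen action.
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]

    return estimates, counts


if __name__ == "__main__":
    estimates, counts = run_bandit([0.2, 0.8])
    print(estimates, counts)
```

With these assumed probabilities, the learner ends up choosing the second action far more often than the first, because its experience shows that action paying off more reliably.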

The DeepMind Alberta announcement is the latest in a slate of AI-related successes for UAlberta. The recent major funding infusion via the federal government’s Pan-Canadian Artificial Intelligence Strategy strengthens the Alberta government’s 15-year investment of more than $40 million. DeepMind Alberta is a further signal that industry is taking notice of UAlberta and its boundary-pushing research.

About the Researchers

A professor in the Department of Computing Science in the University of Alberta’s Faculty of Science, Michael Bowling is best known for his research in poker, most notably for two milestone discoveries, both published in Science: Cepheus in 2015, which solved heads-up limit Texas hold’em, followed by DeepStack in late 2016, which achieves professional-level play in heads-up no-limit Texas hold’em.

Patrick Pilarski is the Canada Research Chair in Machine Intelligence for Rehabilitation and an assistant professor in the Department of Medicine (Division of Physical Medicine and Rehabilitation). His research interests include reinforcement learning, real-time machine learning, human-machine interaction, rehabilitation technology, and assistive robotics.

A professor in the Department of Computing Science in the University of Alberta’s Faculty of Science, Richard Sutton is world-renowned for his foundational research in reinforcement learning (he literally wrote the textbook), in which machines learn based on their environment. His landmark work has developed the area of temporal difference learning, which uses the future as a source of information for predictions, and also explores off-policy learning, or learning from actions not taken.

University of Alberta computing science professors and artificial intelligence researchers (L to R) Richard Sutton, Michael Bowling, and Patrick Pilarski are working with DeepMind to open the AI powerhouse company’s first research lab outside the United Kingdom in Edmonton, Canada.
Credit: John Ulan

Skill Trumps Luck: DeepStack the First Computer Program to Outplay Human Professionals at Heads-Up No-Limit Texas Hold’em Poker

EDMONTON (March 2, 2017)—A team of computing scientists from the University of Alberta’s Computer Poker Research Group is once again capturing the world’s collective fascination with artificial intelligence. In a historic result for the flourishing AI research community, the team—which includes researchers from Charles University and Czech Technical University in Prague—has developed an AI system called DeepStack that defeated professional poker players in December 2016.  The landmark findings have just been published in Science, one of the world’s most prestigious peer-reviewed scientific journals.

DeepStack bridges the gap between approaches used for games of perfect information (like those used in checkers, chess, and Go) and those used for imperfect information games, reasoning while it plays and using “intuition” honed through deep learning to reassess its strategy with each decision.

“Poker has been a longstanding challenge problem in artificial intelligence,” says Michael Bowling, professor in the University of Alberta’s Faculty of Science and principal investigator on the study. “It is the quintessential game of imperfect information in the sense that the players don’t have the same information or share the same perspective while they’re playing.”

Don’t let the name fool you: imperfect information games are serious business. These “games” are a general mathematical model that describes how decision-makers interact. Artificial intelligence research has a storied history of using parlour games to study these models, but attention has been focused primarily on perfect information games. “We need new AI techniques that can handle cases where decision-makers have different perspectives,” says Bowling, explaining that developing techniques to solve imperfect information games will have applications well beyond the poker table.

“Think of any real world problem. We all have a slightly different perspective of what’s going on, much like each player only knowing their own cards in a game of poker.” Immediate applications include making robust medical treatment recommendations, strategic defense planning, and negotiation.

This latest discovery builds on an already impressive body of research findings about artificial intelligence and imperfect information games that stretches back to the creation of the University of Alberta’s Computer Poker Research Group in 1996. Bowling, who became the group’s principal investigator in 2006, has led the group to several milestones for artificial intelligence. He and his colleagues developed Polaris in 2008, beating top poker players at heads-up limit Texas hold’em poker. They then went on to solve heads-up limit hold’em with Cepheus, published in 2015 in Science.

DeepStack extends the ability to think about each situation during play—which has been famously successful in games like checkers, chess, and Go—to imperfect information games using a technique called continual re-solving. This allows DeepStack to determine the correct strategy for a particular poker situation without thinking about the entire game by using its “intuition” to evaluate how the game might play out in the near future.

“We train our system to learn the value of situations,” says Bowling. “Each situation itself is a mini poker game. Instead of solving one big poker game, it solves millions of these little poker games, each one helping the system to refine its intuition of how the game of poker works. And this intuition is the fuel behind how DeepStack plays the full game.”

Thinking about each situation as it arises is important for complex problems like heads-up no-limit hold’em, which has vastly more unique situations than there are atoms in the universe, largely due to players’ ability to wager different amounts including the dramatic “all-in.” Despite the game’s complexity, DeepStack takes action at human speed—with an average of only three seconds of “thinking” time—and can run on a simple gaming laptop using an Nvidia GPU for computation.

To test the approach, DeepStack played in December 2016 against a pool of professional poker players recruited by the International Federation of Poker. Thirty-three players from 17 countries were recruited, with each asked to play a 3,000-hand match over a period of four weeks. DeepStack beat each of the 11 players who finished their match, with only one outside the margin of statistical significance, making it the first computer program to beat professional players in heads-up no-limit Texas hold’em poker.

“DeepStack: Expert-Level Artificial Intelligence in No-Limit Poker” was published online by the journal Science on Thursday, March 2, 2017.