Csaba Szepesvári

Fellow & Canada CIFAR AI Chair

Academic Affiliations

Professor – University of Alberta (Computing Science); Principal Investigator – Reinforcement Learning & Artificial Intelligence Lab (University of Alberta)

Industry and Research Affiliations

Senior Staff Research Scientist & Team Lead of the Foundations team – DeepMind

Areas of Expertise

Reinforcement learning; online learning algorithms; exploration-exploitation dilemma; control of uncertain systems; planning, estimation and filtering problems in stochastic environments; stochastic approximation algorithms; Monte-Carlo estimation; Markov Decision Processes; learning theory

Theoretical foundations of reinforcement learning

Csaba Szepesvári works on reinforcement learning theory, creating and analyzing algorithms that learn efficiently and effectively while interacting with their environments in a sequential manner. He is particularly interested in problems in which a machine continuously interacts with its environment while trying to autonomously discover a good way of interacting with it. These interactive online learning problems are studied in various disciplines: in control theory under the name "dual control", and within machine learning itself in the area of reinforcement learning. Specific research topics include computationally efficient and effective online learning and planning in large Markov Decision Processes, or with batch data; new algorithms for multicriteria reinforcement learning; efficient optimization and planning algorithms; and policy performance certificates.

Csaba is a Fellow and Canada CIFAR AI Chair at Amii and a Professor in the Computing Science Department of the University of Alberta. He is a Senior Staff Research Scientist at DeepMind in Edmonton, AB, where he leads the Foundations team. He is an Associate Editor of Mathematics of Operations Research and an Action Editor of the Journal of Machine Learning Research. Csaba is a Senior Member of the Institute of Electrical and Electronics Engineers and a member of the American Association for Artificial Intelligence. Csaba’s publications have received awards and accolades from top conferences such as the International Conference on Machine Learning (ICML), the Conference on Uncertainty in Artificial Intelligence (UAI), and the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD), where he received the Test of Time award in 2016. Csaba has co-authored more than 225 publications, including a book on Bandit Algorithms, which was released in the summer of 2020.