Upper Bound Diary: Reinforcement Learning is ready for the spotlight

Published

May 20, 2026

I’m well aware that I’m pretty lucky.

As the Science Communicator here at Amii, my job basically boils down to learning as much as I can about the really fascinating advances in AI sciences being done by our researchers and staff scientists. For much of the year, it’s kind of like drinking from a firehose of really interesting new things to learn.

But every year during Upper Bound, that firehose becomes more like a geyser. A Yellowstone Park-level burst of stuff to learn about. Hundreds of sessions on AI, and only time to take in a few of them.

When I actually sat down to make a schedule of what to see, my first picks were some of the sessions on reinforcement learning that were slated for the first day.

RL is something we talk a lot about here at Amii. And more and more, that conversation is about the impact it can have when applied to real-world problems.

RL in the Real World

That’s why I started my morning off by popping in to see Adam White’s talk, “Reinforcement Learning in the Real World.” In addition to being an engaging speaker and just a really nice guy, Adam is the person you want to talk to about real-world RL. He’s an Amii Fellow, a Canada CIFAR AI Chair, and Principal Investigator at the Reinforcement Learning and Artificial Intelligence Lab at the University of Alberta.

Adam, along with Amii Fellow and Canada CIFAR AI Chair Martha White, is one of the co-founders of RL Core Technologies, a startup that is showing just how AI can be used in industrial settings, specifically water treatment.

Being a big geek for both RL and infrastructure stuff, I’ve been following the work of RL Core closely for a while. So, I’ve heard Adam talk a few times about how he thinks RL is the next step forward for industries like utilities and manufacturing. But this was the first time I’ve heard him say that real-world RL is also the next big thing for advancing the science itself.

“RL needs real-world problems - it will lead to better RL algorithms,” he told the crowd that had packed the room.

He argues that there have been a lot of really exciting advancements made in reinforcement learning in labs and simulations. But the real world is a different beast entirely - as he notes, real life is messier and data is harder to come by. As well, a lot of the RL models that are used in labs are overspecialized. They can be endlessly tuned and adjusted to be very successful on very specific problems or to perform well on very specific tests, but they don’t generalize as well as they should - something he called “the greatest sin of our field.”

When he talks about real-world deployment, he makes it sound like the ML equivalent of a blast furnace. A high-temperature environment, but one that will forge stronger RL models and approaches that can have a real impact in a dynamic, unpredictable world.

Then he shared some lessons from three real-world RL projects he’s been involved in, all of them showing different sides of the challenges that he was talking about, and the real benefit that RL can have. In addition to the water treatment work, he’s got a few other really interesting applications, things like predicting solar storms with satellite data, and measuring how different lighting conditions affect plant growth. Both of those are in early stages, but they showed off a couple more of the exciting ways that reinforcement learning is tackling real-world problems.

Learning in an ever-changing world

A couple of hours later, I was able to pop in on another short session with another real-world RL example. This time, the speaker was Soumya Ranjan Sahoo, a machine learning researcher at NTWIST, a software company servicing mining and manufacturing companies. He was showcasing the success his company has seen with combining RL with other machine learning technologies in job shop scheduling.

He explains that for a company manufacturing complicated parts, scheduling is a massive consideration. A part will likely need to be worked on by several machines, and each step takes a certain amount of time and has to be done in a certain order. And all those machines usually need to be worked by a person, who might also be needed elsewhere. So, the company has developed ML algorithms to schedule the process, aiming for as little machine downtime and as few conflicts as possible.
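For readers who like to see the shape of the problem, here is a minimal sketch of job shop scheduling in Python. To be clear, this is not NTWIST’s system, which wasn’t shown in that detail. It is a simple greedy rule I’m using for illustration: always run whichever next step can start earliest. The part names, machine names, durations, and the `greedy_schedule` function are all made up, and the sketch ignores the worker-availability side of the problem entirely.

```python
from typing import Dict, List, Tuple

# Hypothetical data model: each operation is (machine name, duration in hours).
Operation = Tuple[str, int]


def greedy_schedule(jobs: Dict[str, List[Operation]]):
    """Schedule operations one at a time, always picking the job whose next
    operation can start earliest. Each job's operations stay in order, and a
    machine only works on one operation at a time.
    Illustrative only: workers and breakdowns are not modelled."""
    machine_free: Dict[str, int] = {}           # machine -> time it is next free
    job_free = {name: 0 for name in jobs}       # job -> time its last step ends
    next_op = {name: 0 for name in jobs}        # job -> index of its next step
    schedule = []                               # (job, machine, start, end)
    remaining = sum(len(ops) for ops in jobs.values())

    while remaining:
        best = None
        for name, ops in jobs.items():
            i = next_op[name]
            if i == len(ops):
                continue  # this job is already finished
            machine, duration = ops[i]
            # A step can start once the previous step and the machine are done.
            start = max(job_free[name], machine_free.get(machine, 0))
            if best is None or start < best[0]:
                best = (start, name, machine, duration)

        start, name, machine, duration = best
        end = start + duration
        schedule.append((name, machine, start, end))
        job_free[name] = end
        machine_free[machine] = end
        next_op[name] += 1
        remaining -= 1

    makespan = max(end for _, _, _, end in schedule) if schedule else 0
    return schedule, makespan


# Two made-up parts that need the same two machines in opposite orders.
example_jobs = {
    "part_a": [("lathe", 3), ("drill", 2)],
    "part_b": [("drill", 2), ("lathe", 4)],
}
example_schedule, example_makespan = greedy_schedule(example_jobs)
for row in example_schedule:
    print(row)
print("total time:", example_makespan)
```

Even in this toy version, the two parts end up waiting on each other’s machines. Scale that up to dozens of machines, add people and breakdowns, and it is easy to see why a fixed rule like this one runs out of road.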

But, as Adam White said earlier in the day, the real world is messy. NTWIST’s system is pretrained on a vast dataset of successful schedules, and it works well if everything stays the same. But nothing ever stays the same. Machines break down, workers change shifts and take holidays, or a million other small things conspire to change the plan. That’s where the reinforcement learning side of things comes in. It is able to learn continually, adjusting for unexpected delays and keeping things on track when the situation changes. Adding the RL component made their models work five times better than they had without it, Sahoo says.

Using this hybrid approach, he says they have seen some pretty amazing results, far better than if they had used a purely pretrained or purely RL solution. In one anonymized case study, he talked about a drill head manufacturer they had worked with that was able to cut its manufacturing lead time from around 56 days down to just 12, which seemed like quite a result.

Those were just a couple of the RL sessions that I was able to catch on the first day of Upper Bound. But even that was enough to show that reinforcement learning is no longer an abstract, someday technology. Both researchers and companies are putting it to use in actual, real-world applications. And that’s not only solving problems today, but it is also teaching us so much about reinforcement learning itself, getting us closer to even more exciting applications. A technology of both today and tomorrow.
