Offline Reinforcement Learning for VENUS Control

The VENUS ion source at LBNL is the primary injector at the 88-Inch Cyclotron and serves as the prototype for DOE’s Facility for Rare Isotope Beams (FRIB). After two decades, it remains one of the highest-performing ion sources in the world. Ions produced by VENUS are used for, e.g., heavy element synthesis and space radiation testing.

Operating VENUS and similar ion sources typically requires tedious manual tuning and optimization of 10–20 settings, which is burdensome and takes up valuable time from human physicists. Here, the machine learning approach of reinforcement learning has the advantage that it can learn the rules from scratch by itself, using just a “reward” signal, e.g. the current of the ion beam. But it has rarely been attempted with a major scientific instrument: reinforcement learning takes valuable instrument time away from users, and because it “freely” explores the settings, it can potentially damage the instrument.

For the first time on an ion source, reinforcement learning has been performed on VENUS. This new development was made possible by a novel offline variant of reinforcement learning that does not require access to VENUS, but instead learns from a logged database of past control settings and how VENUS reacted to them. The machine learning effort at VENUS logged about 1000 hours of O⁷⁺ data in 2022 and 2023, during which the computer set two settings at random: the voltage of the biased disk (an aluminum disk that empirically improves the source performance) and the amount of injected oxygen. Using this data, offline reinforcement learning deduced an algorithm in the form of a neural network that, when applied to the 2022 and 2023 VENUS data, would have called for the biased disk to be set to a voltage of 55 V.
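As a rough illustration of learning from logged data alone, the sketch below uses a much simpler stand-in than the neural network described above: a one-step (bandit-style) simplification in which a quadratic reward model is fitted by least squares to logged (settings, beam current) pairs, and the setting with the highest predicted current is then selected. It is not the method used on VENUS. The log is synthetic, and the response surface, setting ranges, noise level, and all function names are assumptions made purely for illustration.

```python
import numpy as np


def make_synthetic_log(n=2000, seed=0):
    """Generate a made-up log of random settings and the resulting beam current.

    Hypothetical data only: the ranges and the response surface below are
    invented for illustration and are not VENUS measurements.
    """
    rng = np.random.default_rng(seed)
    voltage = rng.uniform(0.0, 100.0, n)  # assumed biased-disk voltage range (V)
    gas = rng.uniform(0.0, 1.0, n)        # assumed normalized gas setting
    current = (
        100.0
        - 0.02 * (voltage - 50.0) ** 2
        - 40.0 * (gas - 0.6) ** 2
        + rng.normal(0.0, 2.0, n)         # measurement noise
    )
    return voltage, gas, current


def features(voltage, gas):
    """Quadratic feature map; voltage is rescaled to roughly [0, 1]."""
    v = np.asarray(voltage, dtype=float) / 100.0
    g = np.asarray(gas, dtype=float)
    return np.column_stack([np.ones_like(v), v, g, v * v, g * g, v * g])


def fit_reward_model(voltage, gas, current):
    """Least-squares fit of beam current as a function of the two settings."""
    weights, *_ = np.linalg.lstsq(features(voltage, gas), current, rcond=None)
    return weights


def greedy_action(weights, v_grid, g_grid):
    """Return the grid point with the highest predicted beam current."""
    V, G = np.meshgrid(v_grid, g_grid, indexing="ij")
    predicted = features(V.ravel(), G.ravel()) @ weights
    best = np.argmax(predicted)
    return V.ravel()[best], G.ravel()[best]


# Learn only from the log; the "instrument" is never queried during fitting.
voltage, gas, current = make_synthetic_log()
w = fit_reward_model(voltage, gas, current)
best_v, best_g = greedy_action(w, np.linspace(0.0, 100.0, 101), np.linspace(0.0, 1.0, 101))
print(f"suggested voltage: {best_v:.1f} V, suggested gas setting: {best_g:.2f}")
```

The candidate grid is restricted to the range covered by the logged settings, since a model fitted offline has no evidence about settings it never saw. A full offline reinforcement learning method would also account for how the source state evolves over time, which this one-step sketch ignores.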

During a weekend in 2025, the trained neural network was run on the real VENUS. It set the biased disk stably at 57 V and, over the course of one hour, maintained VENUS at a stable beam output. This first result of its kind provides a path toward applying reinforcement learning effectively at an operational ion source while keeping interference with physics operation to a minimum. Members of the 88-Inch Cyclotron and Applied Nuclear Physics contributed to this work: Yue Shi Lai, Heather Crawford, Marco Salathe, Damon Todd.

Training progress, as monitored by the biased disk voltage over recorded VENUS states. The blue curve shows the biased disk voltage that the training is feeding the neural network at a given step (RL episode), whereas the red curve is the action the NN intends to take based on a past VENUS state. Credit: Courtesy of YSL
