In this review article, we explore the application of reinforcement learning (RL) at the different levels of hierarchical chemical process control, where RL can improve the efficiency and robustness of chemical process operations. RL algorithms are well suited to sequential decision making; since chemical process control requires decisions to be made continuously, RL is a natural fit thanks to its ability to handle dynamic, nonlinear, and uncertain environments. Reinforcement learning has already shown great potential in solving complex tasks, making it a promising approach for the challenges of chemical process control.
We investigate the potential of reinforcement learning compared to traditional control methods. We present advanced multi-agent RL structures, which can tackle large-scale chemical processes beyond the capabilities of a single agent. We introduce CRISP-RL (CRoss-Industry Standard Process for the development of Reinforcement Learning applications), a paradigm for developing, deploying, and maintaining reinforcement learning projects that provides a methodology for handling and solving complex RL tasks. Finally, we describe the current challenges and future directions for the integration of reinforcement learning into chemical process control.
Looking toward future applications of artificial intelligence, namely reinforcement learning (RL), we develop a resilience-based explainable RL agent that decides when to activate mitigation systems. The applied reinforcement learning algorithm is Deep Q-learning, and the reward function is the resilience metric. We investigate two explainable reinforcement learning methods: the decision tree as a policy-explaining method and the Shapley value as a state-explaining method.
The policy can be visualized in the agent’s state space using a decision tree for better understanding. We compare the agent’s decision boundary with the runaway boundaries defined by runaway criteria, namely the divergence criterion and the modified dynamic condition. The Shapley value quantifies the contribution of the state variables to the behavior of the agent over time. The results show that the decisions of the artificial agent in a resilience-based mitigation system can be explained and presented in a transparent way.
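As an illustration of how such post-hoc explanations can be produced, the following Python sketch fits a shallow decision-tree surrogate to the greedy policy of a Q-function and computes Shapley values with the shap package. The Q-function, the state variables (temperature, temperature gradient, concentration), and all numeric values are placeholders chosen for the example, not the trained network or reactor model used in this work.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
import shap

rng = np.random.default_rng(0)

# Placeholder for a trained Q-network: maps states (T, dT/dt, concentration)
# to Q-values of two actions (0 = no intervention, 1 = intervene).
def q_values(states):
    t, dtdt, c = states[:, 0], states[:, 1], states[:, 2]
    q_intervene = 0.02 * (t - 380.0) + 5.0 * dtdt + 0.5 * c
    return np.column_stack([np.zeros_like(q_intervene), q_intervene])

# Sampled reactor states standing in for trajectories of the trained agent.
states = np.column_stack([
    rng.uniform(300, 450, 5000),   # temperature [K]
    rng.uniform(-1, 3, 5000),      # temperature gradient [K/s]
    rng.uniform(0, 2, 5000),       # reactant concentration [mol/L]
])
actions = np.argmax(q_values(states), axis=1)

# Policy-explaining method: a shallow decision tree as a surrogate of the
# policy, whose splits approximate the agent's decision boundary.
tree = DecisionTreeClassifier(max_depth=3).fit(states, actions)

# State-explaining method: Shapley values of the Q-value difference, showing
# each state variable's contribution to the intervention decision.
explainer = shap.KernelExplainer(
    lambda x: q_values(x)[:, 1] - q_values(x)[:, 0], shap.sample(states, 100)
)
shap_values = explainer.shap_values(states[:50])
```

The fitted tree can then be plotted over the state space and compared with the runaway boundaries, while the Shapley values can be tracked along a trajectory to show which variables drive the agent's decisions over time.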
We analyzed a special class of graph traversal problems in which the distances are stochastic and the agent can cover only a limited range in one go. We showed that both constrained shortest Hamiltonian pathfinding problems and disassembly line balancing problems belong to the class of constrained shortest pathfinding problems, which can be represented as mixed-integer optimization problems. Reinforcement learning (RL) methods have proven their efficiency on many complex problems. However, researchers have found that learning time increases drastically as the state and action spaces grow. In continuous cases, approximation techniques are used, but these methods have several limitations in mixed-integer search spaces. We present the Q-table compression method, a multistep approach combining dimension reduction, state fusion, and space compression techniques that projects a mixed-integer optimization problem onto a discrete one. The RL agent is then trained with an extended Q-value-based method to deliver a human-interpretable model for optimal action selection. Our approach was tested on selected constrained stochastic graph traversal use cases, and the results are compared with a simple grid-based discretization method.
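A minimal sketch of the underlying idea, projecting a mixed-integer state onto a discrete, table-friendly one and updating a Q-table over it, is given below. It assumes a hypothetical traversal problem with a node index and a continuous remaining-range variable; the bin edges and hyperparameters are illustrative, and the full compression method additionally involves dimension-reduction and state-fusion steps not shown here.

```python
import numpy as np
from collections import defaultdict

# Illustrative grid-based discretization (the baseline the compression method
# is compared against): bin edges for the remaining travel range are assumed.
remaining_range_bins = np.linspace(0.0, 100.0, 11)

def encode_state(node_id, remaining_range):
    """Fuse a discrete node index with a binned continuous variable."""
    range_bin = int(np.digitize(remaining_range, remaining_range_bins))
    return (node_id, range_bin)

# Tabular Q-learning update over the resulting discrete state space.
Q = defaultdict(float)
alpha, gamma = 0.1, 0.95

def q_update(state, action, reward, next_state, next_actions):
    best_next = max((Q[(next_state, a)] for a in next_actions), default=0.0)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])

# Usage example with placeholder values:
s = encode_state(node_id=3, remaining_range=42.5)
s_next = encode_state(node_id=7, remaining_range=17.0)
q_update(s, action=7, reward=-12.3, next_state=s_next, next_actions=[1, 4, 9])
```

Because the learned table is indexed by human-readable state tuples, the resulting greedy policy can be inspected directly, which is what makes the Q-value-based model interpretable.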
Exothermic reactions carried out in batch reactors require careful operation because inadequate conditions can lead to thermal runaway and, in the worst case, an explosion. Therefore, a well-designed intervention action is necessary to avoid undesired events. For this problem, we propose resilience-based reinforcement learning, in which the artificial agent decides whether or not to intervene based on the current state of the system. One of our goals is to design resilient systems, i.e., systems that can recover after a disruption. Therefore, we developed a resilience calculation method for reactors in which dynamic predictive time to failure and time to recovery are used for a better resilience evaluation. Moreover, if the process state is outside the design parameters, we do not include the adaptation and recovery phases in the calculation. We suggest using Deep Q-learning to learn when to intervene in the system to avoid catastrophic events and propose the resilience metric as the reward function of the learning process. The results show that the proposed methodology is applicable to the development of resilience-based mitigation systems, and the agent can effectively distinguish between normal and hazardous states.
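To make the role of the reward concrete, the sketch below shows one possible resilience-style reward built from predicted time to failure and time to recovery; the specific formula, weights, and horizon are assumptions for illustration and do not reproduce the exact resilience metric developed in this work. The returned value would be attached to each transition used in the Deep Q-learning update.

```python
def resilience_reward(perf, perf_nominal, t_to_failure, t_to_recovery,
                      within_design_limits, horizon=600.0):
    """Illustrative resilience-style reward (assumed form, not the exact metric).

    perf / perf_nominal   -- current vs. nominal process performance
    t_to_failure          -- dynamically predicted time until runaway [s]
    t_to_recovery         -- predicted time needed to return to normal [s]
    within_design_limits  -- False once the state leaves the design envelope
    horizon               -- normalization horizon [s], an assumed constant
    """
    absorption = min(perf / perf_nominal, 1.0)
    survivability = min(t_to_failure / horizon, 1.0)
    if not within_design_limits:
        # Outside the design parameters the adaptation/recovery phases are not
        # credited, mirroring the rule described above.
        return 0.5 * (absorption + survivability)
    recoverability = horizon / (horizon + t_to_recovery)
    return (absorption + survivability + recoverability) / 3.0

# Usage example with placeholder process values:
r = resilience_reward(perf=0.8, perf_nominal=1.0, t_to_failure=120.0,
                      t_to_recovery=300.0, within_design_limits=True)
```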
The operation of semi-batch reactors requires caution because the fed reagents can accumulate, leading to hazardous situations due to the loss of controllability. This work aims to develop a method that explores the optimal operational strategy of semi-batch reactors. Since reinforcement learning (RL) is an efficient tool for finding optimal strategies, we tested the applicability of this concept. We developed a problem-specific RL-based solution for the optimal control of semi-batch reactors in their different operating phases. The RL controller varies the feeding rate directly in the feeding phase, while in the mixing phase it acts as the master in a cascade control structure. The RL controllers were trained with different neural network architectures to identify the most suitable one. The developed RL-based controllers performed very well and were able to keep the temperature at the desired setpoint in the investigated system. The results confirm the benefit of the proposed problem-specific RL controller.
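The following Python sketch illustrates how such a phase-dependent controller can be wired together: in the feeding phase the agent's action directly sets the feed rate, while in the mixing phase it supplies the setpoint of an inner (slave) PI loop acting on the jacket. The agent interface, PI gains, and physical limits are assumptions for the example rather than the tuned controllers reported here.

```python
MAX_FEED_RATE = 0.02          # [kg/s], assumed upper bound on the feed rate
T_MIN, T_MAX = 290.0, 360.0   # [K], assumed range of the jacket setpoint

class PI:
    """Simple PI controller used as the inner (slave) loop of the cascade."""
    def __init__(self, kp=2.0, ki=0.05, dt=1.0):
        self.kp, self.ki, self.dt, self.integral = kp, ki, dt, 0.0

    def update(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt
        return self.kp * error + self.ki * self.integral   # coolant flow signal

class CascadeRLController:
    """Illustrative wiring of an RL agent into the two operating phases."""
    def __init__(self, agent, pid):
        self.agent = agent   # maps an observation to a continuous action in [0, 1]
        self.pid = pid       # inner controller manipulating the coolant flow

    def control_step(self, obs, phase):
        action = self.agent.act(obs)
        if phase == "feeding":
            # Direct manipulation: the agent's action scales the reagent feed.
            feed_rate = action * MAX_FEED_RATE
            jacket_setpoint = obs["jacket_temperature"]    # setpoint unchanged
        else:
            # Mixing phase: the agent acts as the master of the cascade loop.
            feed_rate = 0.0
            jacket_setpoint = T_MIN + action * (T_MAX - T_MIN)
        coolant_flow = self.pid.update(jacket_setpoint, obs["jacket_temperature"])
        return feed_rate, coolant_flow

# Usage with a stub agent that always outputs a mid-range action:
class StubAgent:
    def act(self, obs):
        return 0.5

controller = CascadeRLController(StubAgent(), PI())
feed, coolant = controller.control_step(
    {"jacket_temperature": 310.0, "reactor_temperature": 335.0}, phase="feeding"
)
```

In training, the stub agent would be replaced by the neural-network policy, and the reward would penalize deviations of the reactor temperature from the desired setpoint.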