A central challenge in integrating massive distributed energy resources (DERs) into the distribution grid is voltage control: rooftop solar and EV charging can cause large and hard-to-predict voltage violations in a feeder.
Existing control algorithms, often designed to handle slow variations using discrete devices (e.g., tap-changing transformers) or to implement simple control laws (e.g., droop control), become inadequate for future grids with the rapid fluctuations and complex dynamics introduced by DERs. At the same time, the growing volume of data from sensing and monitoring devices creates opportunities for learning-based control to address these challenges. However, the lack of performance guarantees hinders practical deployment.
Learning with stability guarantees. We develop reinforcement learning (RL)-based controllers for distribution grid voltage control with stability guarantees. The key structure we identify is monotonicity: controllers are parameterized by what we call "monotone neural networks," which guarantee transient stability by design and are trained through RL for performance optimization. We further combine the RL controller with a gradient-based controller to obtain steady-state optimality. Our stability-constrained RL framework with monotone neural network controllers generalizes to power system frequency control and to broader networked systems with passivity.
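To make the monotone-network idea concrete, here is a minimal sketch (names and architecture hypothetical, not the paper's exact construction): a single-input network whose weights are kept nonnegative via a softplus reparameterization. Composing nonnegative-weight linear maps with monotone nondecreasing activations yields a map that is monotone by construction, for any parameter values, which is the property that lets such policies be trained freely by RL.

```python
import numpy as np

def softplus(z):
    # Smooth reparameterization that keeps effective weights nonnegative.
    return np.log1p(np.exp(z))

class MonotoneNet:
    """Toy one-input monotone network: nonnegative weights + tanh activations."""
    def __init__(self, hidden=16, seed=0):
        rng = np.random.default_rng(seed)
        # Unconstrained parameters; softplus maps them to weights >= 0.
        self.w1 = rng.normal(size=(hidden,))
        self.b1 = rng.normal(size=(hidden,))
        self.w2 = rng.normal(size=(hidden,))

    def __call__(self, x):
        h = np.tanh(softplus(self.w1) * x + self.b1)  # each unit monotone in x
        return np.dot(softplus(self.w2), h)           # nonnegative combination

net = MonotoneNet()
xs = np.linspace(-3.0, 3.0, 200)
ys = np.array([net(x) for x in xs])
assert np.all(np.diff(ys) >= 0)  # monotone nondecreasing on the grid
```

A voltage controller would then apply, e.g., u = -f(v - v_ref) with f monotone as above, so the control action opposes the voltage deviation regardless of the learned parameters.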
Formulate the power system control problem in an RL framework (state, action, reward, transition dynamics): minimize the accumulated control cost subject to the system dynamics.
Identify the stability constraints via Lyapunov stability theory. Specifically, one can construct a Lyapunov function manually or leverage our recently developed generative AI tool [C5].
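As a hedged illustration of the classical baseline that manual or generative constructions build on: for a linear system $\dot{x} = Ax$ with Hurwitz $A$, a quadratic Lyapunov function $V(x) = x^\top P x$ is obtained by solving the Lyapunov equation $A^\top P + P A = -Q$ for some $Q \succ 0$ (the system matrix below is illustrative only).

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Example Hurwitz system matrix (eigenvalues -1 and -2).
A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])
Q = np.eye(2)

# solve_continuous_lyapunov(a, q) solves a @ X + X @ a^T = q,
# so passing a = A^T and q = -Q solves A^T P + P A = -Q.
P = solve_continuous_lyapunov(A.T, -Q)

# V(x) = x^T P x is a valid Lyapunov function: P symmetric positive definite.
assert np.allclose(P, P.T)
assert np.all(np.linalg.eigvalsh(P) > 0)

# Decrease condition: dV/dt = x^T (A^T P + P A) x = -x^T Q x < 0 for x != 0.
x = np.array([1.0, -0.5])
dV = x @ (A.T @ P + P @ A) @ x
assert dV < 0
```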
Incorporate the stability constraint into RL: 1) for distribution grid voltage control and networked systems with equilibrium-independent passivity (EIP), we identify a class of stabilizing policy structures, monotone policies, that guarantee stability by design and allow standard RL training; 2) for general nonlinear systems, one can incorporate the stability constraints derived in step (2) into the formulation of step (1) and solve the resulting constrained RL problem.
Train the controller parameters via reinforcement learning algorithms (e.g., DDPG, actor-critic methods).
Journal Papers
[J1] Jie Feng, Wenqi Cui, Jorge Cortes, and Yuanyuan Shi, “Bridging Transient and Steady-State Performance in Voltage Control: A Reinforcement Learning Approach with Safe Gradient Flow”, IEEE Control Systems Letters, 2023.
[J2] Jie Feng, Yuanyuan Shi, Guannan Qu, Steven Low, Anima Anandkumar, and Adam Wierman, “Stability Constrained Reinforcement Learning for Real-Time Voltage Control in Distribution Systems”, IEEE Transactions on Control of Network Systems, vol. 11, no. 3, pp. 1370-1381, 2024.
[J3] Chris Yeh, Jing Yu, Yuanyuan Shi, and Adam Wierman, "Online Learning for Robust Voltage Control Under Uncertain Grid Topology," in IEEE Transactions on Smart Grid, vol. 15, no. 5, pp. 4754-4764, 2024.
[J4] Jie Feng, Manasa Muralidharan, Rodrigo Henriquez-Auba, Patricia Hidalgo-Gonzalez, and Yuanyuan Shi, "Stability-Constrained Learning for Frequency Regulation in Power Grids With Variable Inertia," in IEEE Control Systems Letters, vol. 8, pp. 994-999, 2024.
[J5] Jie Feng, Wenqi Cui, Jorge Cortés, and Yuanyuan Shi, "Online Event-Triggered Switching for Frequency Control in Power Grids With Variable Inertia", in IEEE Transactions on Power Systems, vol. 40, no. 4, pp. 3347-3360, 2025.
[J6] Zhenyi Yuan, Jie Feng, Yuanyuan Shi, and Jorge Cortés, "Stability Constrained Voltage Control in Distribution Grids with Arbitrary Communication Infrastructure", under review, 2025.
Refereed Conference Papers
[C1] Yuanyuan Shi, Guannan Qu, Steven Low, Anima Anandkumar, and Adam Wierman, “Stability Constrained Reinforcement Learning for Real-Time Voltage Control", in American Control Conference (ACC), 2022.
[C2] Chris Yeh, Jing Yu, Yuanyuan Shi, and Adam Wierman, "Robust Online Voltage Control with an Unknown Grid Topology", in ACM International Conference on Future Energy Systems (ACM e-Energy), 2022. (Best Paper Finalist)
[C3] Sahin Lale, Yuanyuan Shi, Guannan Qu, Kamyar Azizzadenesheli, Adam Wierman, and Anima Anandkumar, "KCRL: Krasovskii-Constrained Reinforcement Learning with Guaranteed Stability in Nonlinear Dynamical Systems", in IEEE Conference on Decision and Control (CDC), 2023.
[C4] Wenqi Cui, Yan Jiang, Baosen Zhang, and Yuanyuan Shi, "Structured Neural-PI Control for Networked Systems: Stability and Steady-State Optimality Guarantees", in Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS), 2023.
[C5] Haohan Zou, Jie Feng, Hao Zhao, and Yuanyuan Shi, "Analytical Lyapunov Function Discovery: An RL-based Generative Approach", in Forty-Second International Conference on Machine Learning (ICML), 2025.
Work by other authors
[A1] W. Cui, Y. Jiang and B. Zhang, "Reinforcement Learning for Optimal Primary Frequency Control: A Lyapunov Approach," in IEEE Transactions on Power Systems, vol. 38, no. 2, pp. 1676-1688, 2023.
[A2] W. Cui, J. Li, and B. Zhang, "Decentralized Safe Reinforcement Learning for Inverter-based Voltage Control", Electric Power Systems Research, Volume 211, 2022.
[A3] W. Cui, and B. Zhang, "Lyapunov-Regularized Reinforcement Learning for Power System Transient Stability", IEEE Control Systems Letters, vol. 6, pp. 974-979, 2022.
[A4] Z. Yuan, C. Zhao, and J. Cortés, "Reinforcement Learning for Distributed Transient Frequency Control with Stability and Safety Guarantees", Systems & Control Letters, Volume 185, 2024.
[A5] Z. Yuan, G. Cavraro, M. K. Singh and J. Cortés, "Learning Provably Stable Local Volt/Var Controllers for Efficient Network Operation," in IEEE Transactions on Power Systems, vol. 39, no. 1, pp. 2066-2079, 2024.
[A6] G. Cavraro, Z. Yuan, M. K. Singh and J. Cortés, "Learning Local Volt/Var Controllers Towards Efficient Network Operation with Stability Guarantees," IEEE Conference on Decision and Control (CDC), 2022.
Stability-constrained RL for distribution grid voltage control
The top figure shows that an unconstrained RL controller can perform well during training, but once actually deployed, the system becomes unstable, as shown in the bottom figure.
The possibility of instability, created by the controllers themselves, is a deal-breaker for real-world deployment!
Hardware-in-the-loop Validation via the UCSD DERConnect Testbed
UCSD DERConnect testbed: IEEE 13-bus simulation with an RTDS simulator.
Stability-constrained voltage control under high/low voltage disturbance.
Applications to broader networked systems with passivity and power system frequency control
Figure: (a) The controller aims to improve transient performance after disturbances while guaranteeing stabilization to the desired steady-state output $\Bar{y}$. (b) Two examples of physical systems with output-tracking tasks. Exp 1: an inverted pendulum tracks a required inverted angle $\Bar{y}$. Exp 2: a vehicle platoon tracks the setpoint velocity. (c) We provide end-to-end stability and output-tracking guarantees by enforcing a stabilizing PI structure in the design of the neural networks. The key structure is strictly monotone functions, which are parameterized by the gradient of strictly convex neural networks. (d) The transient performance is optimized by training the neural networks.
NSF ECCS-2200692 and ECCS-2442689
Early-Career Faculty Acceleration Award