Energy Storage Science and Technology

Cooperative primary frequency modulation control method for distributed energy storage based on reinforcement learning-model predictive control (RL-MPC)

Qian Ma1, Liang Xiao1, Bing Cheng2, Qin Gao1, Chunxiao Liu1, Yihua Zhu3, Chengxiang Li3

  1. China Southern Power Grid Power Dispatching and Control Center, Guangzhou 510530, Guangdong, China
  2. Hainan Power Grid Company Limited, Haikou 570203, Hainan, China
  3. State Key Laboratory of HVDC (Electric Power Research Institute of China Southern Power Grid Company Limited), Guangzhou 510663, China
  • Received: 2025-03-27 Revised: 2025-04-30
  • Funding:
    Science and Technology Project of China Southern Power Grid Company Limited (ZDKJXM20222007)

Abstract:

To improve the frequency characteristics of the distribution network and make full use of the fast response of distributed energy storage systems, a cooperative primary frequency control method for distributed energy storage based on reinforcement learning-model predictive control (RL-MPC) is proposed. First, a primary frequency control model with grid-connected distributed energy storage is established from the frequency response characteristics, state of charge (SOC), and power control strategy of the storage units. Then a hierarchical hybrid control architecture is constructed. In the upper layer, deep reinforcement learning (DRL), implemented as a Deep Q-Network (DQN), dynamically optimizes the MPC weight matrix while sensing in real time the frequency deviation, its rate of change, and the entropy of the SOC distribution across storage units. In the lower layer, distributed MPC solves the output sequences of the multi-node storage units over a receding horizon, and a graph attention network (GAT) adaptively optimizes the communication topology, which reduces the computational complexity of cooperative control and improves the generalization of the control policy. Finally, Matlab/Simulink simulations verify that the proposed method effectively improves the primary frequency response speed and control accuracy of distributed energy storage and enhances power system stability.

Key words: Distributed energy storage, Primary frequency regulation, Model predictive control, Reinforcement learning, Graph attention network

CLC number:
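
The abstract names three quantities sensed by the upper layer: frequency deviation, its rate of change, and the entropy of the SOC distribution. The abstract does not define that entropy, so the sketch below assumes one plausible reading: the Shannon entropy of each unit's share of the total SOC, normalized to [0, 1]. The function names `soc_distribution_entropy` and `upper_layer_state` are hypothetical and are not taken from the paper.

```python
import math


def soc_distribution_entropy(soc, eps=1e-12):
    """Normalized Shannon entropy of SOC shares across storage units.

    Assumption (not from the paper): each unit's share is its SOC divided
    by the sum of all SOCs. Returns 1.0 when all units hold equal SOC and
    approaches 0.0 when the stored energy sits in a single unit.
    """
    n = len(soc)
    total = sum(soc)
    if n < 2 or total <= eps:
        return 0.0
    shares = [s / total for s in soc]
    # Terms with zero share contribute nothing to the entropy.
    entropy = -sum(p * math.log(p) for p in shares if p > eps)
    # Dividing by log(n) maps the result onto [0, 1]; max() avoids -0.0.
    return max(0.0, entropy / math.log(n))


def upper_layer_state(delta_f, rocof, soc):
    """Assemble a hypothetical upper-layer observation tuple:
    (frequency deviation, rate of change of frequency, SOC entropy)."""
    return (delta_f, rocof, soc_distribution_entropy(soc))


balanced = soc_distribution_entropy([0.5, 0.5, 0.5, 0.5])
skewed = soc_distribution_entropy([0.9, 0.1, 0.1, 0.1])
state = upper_layer_state(-0.05, -0.2, [0.9, 0.1, 0.1, 0.1])
```

Under this assumed definition, a lower entropy signals unevenly distributed SOC, which an agent tuning the MPC weights could use as a cue to weight SOC balancing more heavily relative to frequency tracking.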