Energy Storage Science and Technology ›› 2025, Vol. 14 ›› Issue (8): 3138-3148. doi: 10.19799/j.cnki.2095-4239.2025.0296

• Energy Storage System and Engineering •

Cooperative primary frequency modulation control method for distributed energy storage based on reinforcement learning-model predictive control (RL-MPC)

Qian MA1, Liang XIAO1, Bing CHENG2, Qin GAO1, Chunxiao LIU1, Yihua ZHU3, Chengxiang LI3

  1. China Southern Power Grid Power Dispatching and Control Center, Guangzhou 510530, Guangdong, China
  2. Hainan Power Grid Company Limited, Haikou 570203, Hainan, China
  3. State Key Laboratory of HVDC (Electric Power Research Institute of China Southern Power Grid Company Limited), Guangzhou 510663, Guangdong, China
  • Received: 2025-03-27 Revised: 2025-04-30 Online: 2025-08-28 Published: 2025-08-18
  • Contact: Qian MA, E-mail: maqian@csg.cn
  • About the author: Qian MA (born 1978), male, Ph.D., professor-level senior engineer. Research interests: power grid operation planning, grid-connected operation of new energy, system operation risk prevention and control, and planning and operation analysis. E-mail: maqian@csg.cn
  • Supported by:
    Science and Technology Project of China Southern Power Grid Co., Ltd. (ZDKJXM20222007)

Abstract:

To improve the frequency characteristics of distribution networks and fully exploit the fast response of distributed energy storage systems (DESSs), a cooperative primary frequency control method based on reinforcement learning-model predictive control (RL-MPC) is proposed. First, a primary frequency control model with grid-connected DESSs is established from their frequency response characteristics, state of charge (SOC), and power control strategies. Then, a hierarchical hybrid control architecture is constructed. The upper layer uses deep reinforcement learning (DRL), in the form of a deep Q-network (DQN), to dynamically optimize the MPC weight matrix while sensing the frequency deviation, its rate of change, and the entropy of the SOC distribution across storage units in real time. The lower layer uses distributed MPC to solve, over a rolling horizon, for the output sequences of the multi-node storage units, and introduces a graph attention network (GAT) to adaptively optimize the communication topology. This reduces the computational complexity of coordinated DESS control and improves the generalization of the control strategy. Finally, Matlab/Simulink simulations verify that the proposed method effectively improves the primary frequency response speed and control accuracy of DESSs and enhances power system stability.

Key words: distributed energy storage, frequency modulation, model predictive control, reinforcement learning, graph attention network
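The abstract names the SOC distribution entropy as one of the quantities sensed by the upper layer but does not give its definition here. As a minimal sketch, assuming it is the normalized Shannon entropy of each unit's share of the total SOC, the upper-layer observation could be assembled as follows. The function names `soc_distribution_entropy` and `upper_layer_state` are hypothetical and are not taken from the paper.

```python
import math


def soc_distribution_entropy(soc_values):
    """Normalized Shannon entropy of SOC shares across storage units.

    Assumed definition (not from the paper): each unit's share is its SOC
    divided by the summed SOC. The result is close to 1.0 when all units
    hold equal SOC and close to 0.0 when one unit holds all of it.
    """
    n = len(soc_values)
    total = sum(soc_values)
    if n < 2 or total <= 0:
        return 0.0
    shares = [s / total for s in soc_values]
    # Zero shares contribute nothing to the entropy and are skipped.
    entropy = -sum(p * math.log(p) for p in shares if p > 0)
    return entropy / math.log(n)


def upper_layer_state(freq_dev_hz, rocof_hz_per_s, soc_values):
    """Bundle the three quantities the abstract lists as upper-layer inputs."""
    return (freq_dev_hz, rocof_hz_per_s, soc_distribution_entropy(soc_values))


if __name__ == "__main__":
    # Illustrative values only: a 0.05 Hz dip with three unevenly charged units.
    print(upper_layer_state(-0.05, -0.2, [0.8, 0.5, 0.2]))
```

Under this assumed definition, a low entropy value signals that the stored energy is unevenly spread across units, which is the kind of imbalance a weight-tuning agent might respond to by penalizing SOC deviation more heavily.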

CLC Number: