Energy Storage Science and Technology ›› 2025, Vol. 14 ›› Issue (5): 1982-1990. doi: 10.19799/j.cnki.2095-4239.2024.1130

• Energy Storage Systems and Engineering •

  • Author bio: SA Rengaowa (1976—), female, bachelor's degree, senior engineer, engaged in research on power systems and mechatronics, E-mail: 1764861384@qq.com
  • Funding:
    Science and Technology Project of Inner Mongolia Power (Group) Co., Ltd. (2024-4-59)

A variable-parameter PID active power control strategy of inertial flywheel based on reinforcement learning

Rengaowa SA1(), Chaohui WU1, Zelong NI2, Yue ZHANG1, Xinjian JIANG2(), Jianyu TIAN2   

  1. Inner Mongolia Electric Power (Group) Co., Ltd., Inner Mongolia Electric Power Economic and Technological Research Institute Branch, Hohhot 010000, Inner Mongolia Autonomous Region, China
    2.Tsinghua University, Beijing 100084, China
  • Received:2024-11-27 Revised:2024-12-02 Online:2025-05-28 Published:2025-05-21
  • Contact: Xinjian JIANG E-mail:1764861384@qq.com;jiangxj@mail.tsinghua.edu.cn


Abstract:

This study investigates an inertial flywheel system based on an electromagnetic coupler. The topology and operating principle of the system are first introduced, the advantages of using an electromagnetic coupler are highlighted, and a mathematical model of the system is then developed. The traditional fixed-parameter proportional-integral-derivative (PID) control mode exhibits significant output power fluctuations when the active power command changes abruptly. To address this limitation, this study proposes a variable-parameter PID active power control strategy for inertial flywheels based on reinforcement learning (RL). The proposed method uses a model-free RL algorithm to train a neural-network RL agent that dynamically adjusts the PID parameters. The network takes four inputs: the active power deviation, its time derivative, the rotational speed, and the rotational acceleration, and outputs the three parameters P, I, and D, so the PID parameters adapt as the system state changes. To verify the feasibility of the control strategy and its performance advantages, it was compared with the traditional fixed-parameter PID control method on the MATLAB/Simulink simulation platform. The simulation results demonstrate that the P and I parameters of the variable-parameter PID strategy change markedly when the system receives an active power adjustment instruction, producing corresponding adjustments in the output torque reference; as a result, the overshoot and fluctuation of the system power output are reduced and the dynamic response performance is improved.
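The variable-gain idea described in the abstract can be illustrated outside Simulink. The sketch below is not the authors' implementation: the plant model, gain-scheduling rule, and all numeric values are hypothetical stand-ins. It shows only the structural point that a PID loop can accept new (P, I, D) gains at every control step, the role played by the trained RL agent in the paper, here replaced by a hand-written `gain_policy` function, while tracking a step in the active power setpoint through a first-order plant.

```python
class VariableGainPID:
    """PID controller whose gains may change at every control step.
    In the paper the gains come from a trained RL agent; here they are
    supplied by a hand-written stand-in policy (illustrative only)."""

    def __init__(self):
        self.integral = 0.0
        self.prev_error = None

    def step(self, error, gains, dt):
        kp, ki, kd = gains
        self.integral += error * dt
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return kp * error + ki * self.integral + kd * derivative


def gain_policy(error):
    """Stand-in for the RL agent: raise the proportional gain when the
    power error is large, relax it near the setpoint. The values are
    arbitrary and chosen only to keep the toy loop stable."""
    kp = 2.0 + 3.0 * min(abs(error), 1.0)
    return kp, 5.0, 0.001  # (kp, ki, kd)


def simulate(steps=2000, dt=1e-3, setpoint=1.0, tau=0.05):
    """Track a step in the per-unit power setpoint through a first-order
    plant dy/dt = (u - y) / tau (a crude stand-in for the real flywheel
    and coupler dynamics)."""
    pid = VariableGainPID()
    y = 0.0
    for _ in range(steps):
        error = setpoint - y
        u = pid.step(error, gain_policy(error), dt)
        y += dt * (u - y) / tau  # forward-Euler plant update
    return y


final = simulate()  # per-unit output power after 2 s of simulated time
```

In the paper the mapping from system state to gains is learned offline by the RL algorithm rather than hand-coded, and the controller output is a torque reference for the electromagnetic coupler rather than a generic plant input.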

Key words: flywheel energy storage, electromagnetic coupler, reinforcement learning, vector control, PID control

CLC number: