Energy Storage Science and Technology (储能科学与技术)

A variable-parameter PID active power control strategy for an inertia flywheel based on reinforcement learning

RenGaoWa SA1(), ChaoHui WU1, ZeLong NI2(), Yue ZHANG1, XinJian JIANG2, JianYu TIAN2

  1. Inner Mongolia Electric Power (Group) Co., Ltd., Inner Mongolia Electric Power Economic and Technological Research Institute Branch, Hohhot 010000, Inner Mongolia Autonomous Region, China
  2. Tsinghua University, Beijing 100084, China
  • Received: 2024-11-27  Revised: 2024-12-02  Online: 2024-12-20
  • Corresponding author: ZeLong NI, E-mail: 1764861384@qq.com; nizl22@mails.tsinghua.edu.cn
  • About the first author: RenGaoWa SA (1976—), female, bachelor's degree, senior engineer; research interests: power systems and mechatronics. E-mail: 1764861384@qq.com
  • Funding: Science and Technology Project of Inner Mongolia Electric Power (Group) Co., Ltd. (2024-4-59)

Abstract:

In this paper, an inertia flywheel system based on an electromagnetic coupler is studied. The topology and operating principle of the system are introduced first, the advantages of using an electromagnetic coupler are explained, and a mathematical model of the system is then established. Because the output power under conventional fixed-parameter PID control fluctuates strongly when the active power command changes abruptly, a variable-parameter PID active power control strategy based on reinforcement learning is proposed. In this strategy, the PID parameters are produced by an RL Agent, a neural network trained with a reinforcement learning algorithm that requires no model reference. The inputs of the network are the active power error, the derivative of the active power, the rotational speed, and the derivative of the rotational speed, and its outputs are the three parameters P, I, and D, so the PID parameters change whenever the system state changes. To verify the feasibility of the proposed strategy and its performance advantages, it is compared with conventional fixed-parameter PID control on the MATLAB/Simulink simulation platform. The simulation results show that the P and I parameters of the variable-parameter PID strategy change markedly when the system receives an active power adjustment command, which changes the reference value of the output torque; as a result, the overshoot and fluctuation of the output power are reduced and the dynamic response is improved.

Key words: flywheel energy storage, electromagnetic coupler, reinforcement learning, vector control, PID control
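
To make the control structure described in the abstract concrete, the following is a minimal Python/NumPy sketch of a variable-parameter PID loop in which a small neural network maps the four state inputs (active power error, its derivative, rotational speed, and its derivative) to the gains Kp, Ki, and Kd at every control step. It is only an illustration under assumed names and values: the randomly initialized network stands in for the trained RL Agent, a first-order lag stands in for the flywheel and electromagnetic coupler, and the gain ranges, class names (GainPolicy, VariableGainPID), and plant constants do not come from the paper.

```python
import numpy as np


class GainPolicy:
    """Tiny stand-in for the trained RL Agent: maps the 4-dimensional state
    (active power error, derivative of the error, rotational speed, derivative
    of the speed) to the PID gains (Kp, Ki, Kd). The weights are random here;
    in the paper they are learned by a reinforcement learning algorithm."""

    def __init__(self, hidden=16, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.1, (hidden, 4))
        self.b1 = np.zeros(hidden)
        self.W2 = rng.normal(0.0, 0.1, (3, hidden))
        self.b2 = np.zeros(3)
        # Illustrative gain ranges, chosen only to keep this toy loop stable.
        self.g_min = np.array([0.2, 0.1, 0.0])
        self.g_max = np.array([3.0, 2.0, 0.05])

    def gains(self, state):
        h = np.tanh(self.W1 @ state + self.b1)
        z = 1.0 / (1.0 + np.exp(-(self.W2 @ h + self.b2)))  # squash to (0, 1)
        return self.g_min + (self.g_max - self.g_min) * z


class VariableGainPID:
    """PID controller whose gains are refreshed from the policy at every step,
    so the control law adapts as the system state changes."""

    def __init__(self, policy, dt):
        self.policy, self.dt = policy, dt
        self.integral = 0.0
        self.prev_err = 0.0
        self.last_gains = (0.0, 0.0, 0.0)

    def step(self, p_ref, p_meas, speed, prev_speed):
        err = p_ref - p_meas
        d_err = (err - self.prev_err) / self.dt
        d_speed = (speed - prev_speed) / self.dt
        kp, ki, kd = self.policy.gains(np.array([err, d_err, speed, d_speed]))
        self.last_gains = (float(kp), float(ki), float(kd))
        self.integral += err * self.dt
        self.prev_err = err
        # Torque reference from the PID law with the state-scheduled gains.
        return kp * err + ki * self.integral + kd * d_err


# Toy closed loop: a first-order lag stands in for the flywheel/coupler plant.
dt = 1e-3
pid = VariableGainPID(GainPolicy(), dt)
p_meas, speed, prev_speed = 0.0, 1.0, 1.0
for k in range(2000):
    p_ref = 1.0 if k * dt >= 0.5 else 0.0        # step in the active power command
    torque_ref = pid.step(p_ref, p_meas, speed, prev_speed)
    prev_speed = speed
    speed -= dt * 0.02 * p_meas                  # speed droops while discharging
    p_meas += dt * (torque_ref - p_meas)         # simplified power response (tau = 1 s)
    if k in (499, 501, 1999):
        kp, ki, kd = pid.last_gains
        print(f"t={(k + 1) * dt:.3f} s  Kp={kp:.3f}  Ki={ki:.3f}  Kd={kd:.3f}  P={p_meas:.3f}")
```

The printed samples before and after the power step show the gains shifting with the measured state, which is the signal flow the paper describes; the reported overshoot and dynamic-response results come from the authors' trained agent and Simulink model, not from this sketch.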
