

Multi-agent DRL-based data-driven approach for PEVs charging/discharging scheduling in smart grid
Institution:
1. Department of Automation, University of Science and Technology of China, Hefei, 230027, China
2. College of Electrical and Information Engineering, Hunan University, Changsha, 410082, China
1. School of Computer Science and Technology, Huaiyin Normal University, Huaian 223300, Jiangsu, China
2. School of Mathematics, Southeast University, Nanjing 210096, China
3. Yonsei Frontier Lab, Yonsei University, Seoul 03722, South Korea
4. School of Mathematical Science, Huaiyin Normal University, Huaian 223300, Jiangsu, China
1. AnHui Province Key Laboratory of Special Heavy Load Robot, Anhui University of Technology, Ma'anshan 243002, PR China
2. School of Automation and Electrical Engineering, Linyi University, Linyi 276005, PR China
3. School of Information Science and Engineering, Chengdu University, Chengdu 610106, PR China
Abstract: This paper studies the charging/discharging scheduling problem of plug-in electric vehicles (PEVs) in a smart grid, considering users' satisfaction with the state of charge (SoC) and the degradation cost of batteries. The objective is to collectively determine the energy usage patterns of all participating PEVs so as to minimize their total energy cost while meeting the charging needs of PEV owners. The challenges are threefold: 1) the randomness of electricity prices and PEVs' commuting behavior; 2) the unknown dynamics model of the SoC; and 3) a large solution space; together, these make it difficult to develop a model-based optimization algorithm directly. To this end, we first reformulate the energy cost minimization problem as a Markov game with unknown transition probabilities. A multi-agent deep reinforcement learning (DRL)-based data-driven approach is then developed to solve this Markov game. Specifically, the proposed approach consists of two networks: an extreme learning machine (ELM)-based feedforward neural network (NN) that predicts the uncertain electricity prices and PEVs' commuting behavior, and a Q network that approximates the optimal action-value function. Finally, comparisons with three benchmark solutions show that the proposed algorithm not only adaptively determines the optimal charging/discharging policy through an online learning process, but also yields a lower energy cost in an unknown market environment.
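The abstract's first component is an ELM-based feedforward NN for predicting uncertain quantities such as the electricity price. As a rough illustration of the ELM idea (not the paper's actual model or data), the sketch below fixes random hidden-layer weights and fits only the output weights by least squares, which is what makes ELM training fast; the price signal and lag features here are synthetic assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

class ELM:
    """Extreme learning machine: random fixed hidden layer, least-squares output layer."""

    def __init__(self, n_in, n_hidden):
        self.W = rng.normal(size=(n_in, n_hidden))  # random input weights, never trained
        self.b = rng.normal(size=n_hidden)          # random hidden biases, never trained
        self.beta = None                            # output weights, fit in closed form

    def _hidden(self, X):
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        # Only the output weights are learned, via ordinary least squares.
        H = self._hidden(X)
        self.beta, *_ = np.linalg.lstsq(H, y, rcond=None)
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta

# Toy task: predict the next price from the last 3 prices of a synthetic signal.
prices = np.sin(np.linspace(0, 20, 200)) + 2.0
X = np.stack([prices[i:i + 3] for i in range(len(prices) - 3)])
y = prices[3:]

elm = ELM(n_in=3, n_hidden=50).fit(X, y)
mse = float(np.mean((elm.predict(X) - y) ** 2))
print(mse)  # small training error on the synthetic signal
```

In the paper's pipeline, such predictions would feed the state of the DRL agents; here the point is only that ELM training reduces to one linear solve.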
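The second component is a Q network that approximates the optimal action-value function of the Markov game. A minimal stand-in, assuming a toy single-agent MDP rather than the paper's multi-agent deep Q network, is tabular Q-learning over discretized SoC levels with charge/idle/discharge actions; the price curve, reward terms, and SoC-target penalty below are illustrative assumptions mimicking the energy-cost and user-satisfaction objectives.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy MDP: state = battery SoC level in {0..4}; actions = discharge/idle/charge.
N_SOC = 5
ACTIONS = [-1, 0, 1]
Q = np.zeros((N_SOC, len(ACTIONS)))         # tabular action-value function
alpha, gamma, eps = 0.1, 0.95, 0.1          # learning rate, discount, exploration
target_soc = 4                              # owner's desired SoC at departure

def step(soc, a, price):
    """Apply a charge/discharge action; reward is negative energy cost."""
    nxt = int(np.clip(soc + a, 0, N_SOC - 1))
    r = -price * a                          # pay to charge, earn to discharge
    return nxt, r

for episode in range(2000):
    soc = int(rng.integers(0, N_SOC))
    for t in range(24):                     # 24 hourly scheduling slots
        price = 1.0 + 0.5 * np.sin(2 * np.pi * t / 24)   # toy price curve
        # epsilon-greedy action selection
        ai = int(rng.integers(len(ACTIONS))) if rng.random() < eps else int(Q[soc].argmax())
        nxt, r = step(soc, ACTIONS[ai], price)
        if t == 23:
            # Penalty for missing the departure SoC target: a stand-in for
            # the user-satisfaction term in the scheduling objective.
            r -= 5.0 * abs(target_soc - nxt)
        # Q-learning temporal-difference update
        Q[soc, ai] += alpha * (r + gamma * Q[nxt].max() - Q[soc, ai])
        soc = nxt

print(Q.shape)
```

The paper replaces the table with a deep Q network and couples many PEV agents through the Markov game; this sketch only shows the value-update mechanics a single agent would run.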
This article has been indexed in ScienceDirect and other databases.