A general motion control framework for an autonomous underwater vehicle through deep reinforcement learning and disturbance observers
Institution:1. College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin, Heilongjiang 150001, China; 2. Qingdao Innovation and Development Center of Harbin Engineering University, Qingdao, Shandong 266000, China
Abstract: This paper investigates the application of deep reinforcement learning (RL) to the motion control of an autonomous underwater vehicle (AUV), and proposes a novel general motion control framework that separates training from deployment. First, the state space, action space, and reward function are customized while preserving generality across various motion control tasks. Next, to efficiently learn the optimal motion control policy when the AUV model is imprecise and unknown external disturbances are present, a virtual AUV model composed of the known and determinate terms of the actual AUV is put forward, and a simulation training method is developed on this basis. Then, in the given deployment method, three independent extended state observers (ESOs) are designed to handle the unknown terms in different directions, and the final controller is obtained by compensating the ESOs' estimates into the output of the optimal motion control policy obtained through simulation training. Finally, soft actor-critic is chosen as the deep RL algorithm of the framework, and the generality and effectiveness of the proposed method are verified in four different AUV motion control tasks.
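The deployment step described above — adding an ESO's disturbance estimate to the output of the trained policy — can be sketched in a minimal single-degree-of-freedom example. All names, gains, and the simple proportional "policy" below are illustrative assumptions standing in for the paper's trained soft actor-critic policy and its three per-axis ESOs; this is not the paper's implementation.

```python
# Minimal 1-DOF sketch of ESO-based disturbance compensation.
# The proportional "policy" stands in for the trained RL policy; gains
# and the plant model v_dot = b0*u + d are illustrative assumptions.

class LinearESO:
    """Second-order linear extended state observer for v_dot = b0*u + d:
    z1 estimates the state v, z2 the total (unknown) disturbance d."""
    def __init__(self, b0, beta1, beta2, dt):
        self.b0, self.beta1, self.beta2, self.dt = b0, beta1, beta2, dt
        self.z1 = 0.0   # state estimate
        self.z2 = 0.0   # total-disturbance estimate

    def update(self, v_meas, u):
        e = v_meas - self.z1                                    # estimation error
        self.z1 += self.dt * (self.z2 + self.b0 * u + self.beta1 * e)
        self.z2 += self.dt * self.beta2 * e

def policy(v, v_ref):
    return 2.0 * (v_ref - v)   # stand-in for the trained policy's action

b0, dt = 1.0, 0.001
eso = LinearESO(b0, beta1=60.0, beta2=900.0, dt=dt)   # observer bandwidth ~30 rad/s
v, v_ref, d_true = 0.0, 1.0, 0.5                      # constant unknown disturbance
for _ in range(5000):                                 # 5 s of simulation (Euler)
    u = policy(v, v_ref) - eso.z2 / b0   # compensate estimated disturbance
    v += dt * (b0 * u + d_true)          # plant step with true disturbance
    eso.update(v, u)
print(round(eso.z2, 2), round(v, 2))     # ≈ 0.5 1.0
```

Compensating the estimate through the control channel in this way means the policy only ever sees the disturbance-free dynamics, which is what allows a policy trained on the virtual (nominal) model to be deployed without retraining.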
This article is indexed by ScienceDirect and other databases.