Federated reinforcement learning approach for detecting uncertain deceptive target using autonomous dual UAV system
Institution:1. Department of Management Science and Engineering, School of Economics and Management, University of Science and Technology Beijing, Beijing 100083, China;2. School of Management and Engineering, Capital University of Economics and Business, Beijing 100070, China;1. Post-Doctoral Research Center, China Central Depository & Clearing Co., Ltd., Beijing, China;2. School of Software, Tsinghua University, Beijing, China;3. School of Information, Renmin University of China, Beijing, China;1. Faculty of Social and Political Sciences, Institute of Social Sciences, Life Course and Social Inequality Research Centre, Lausanne University, Switzerland;2. Institute of Computational Linguistics, Zurich University, Switzerland;1. School of Digital Economy & Trade, Wenzhou Polytechnic, China;2. School of Finance, Shanghai University of Finance and Economics, China;3. Department of Aviation Services and Management, China University of Science and Technology, Taipei, Taiwan;4. School of Accounting, Guizhou University of Finance and Economics, Guiyang 550025, China;5. School of Public Economics and Administration, Shanghai University of Finance and Economics, China
Abstract: This paper develops a cooperative federated reinforcement learning (RL) strategy that enables two unmanned aerial vehicles (UAVs) to cooperate in learning and predicting the movements of an intelligent deceptive target in a given search area. The proposed strategy allows the UAVs to cooperate autonomously by exchanging their gained experience, which maximizes target detection performance and accelerates learning while maintaining privacy. Specifically, we consider a monitoring model that includes a search area, a charging station, two cooperative UAVs, an intelligent, deceptive, uncertain moving target, and a fake (false) target. Each UAV is equipped with a limited-capacity rechargeable battery and a communication unit for exchanging the gained experience. The problem of maximizing the detection probability of the uncertain deceptive target using cooperative UAVs is mathematically modeled as a search-benefit maximization problem, which is then reformulated as a Markov decision process (MDP) because of the uncertain nature of the problem. Because there is no prior information on the targets’ movement, a cooperative RL approach is used to tackle the problem. The proposed cooperative RL-based algorithm is a distributed collaborative mechanism that enables the two UAVs, i.e., the agents, to interact individually with the operating environment and maximize their cumulative rewards by converging to a shared policy while preserving privacy. Simulation results indicate that a cooperative RL-based dual UAV system can noticeably improve the target detection probability and overall detection performance, and accelerate the learning speed.
Keywords: Cooperative learning; Federated learning; Artificial intelligence; Emerging UAV; Indoor environment
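
The abstract describes two agents that learn locally and converge to a shared policy without exchanging raw experience. The sketch below illustrates that general idea with tabular Q-learning and a simple element-wise average of the two agents' Q-tables. Everything in it is an assumption made for illustration and is not taken from the paper: the toy one-dimensional search area, the reward of 1 for reaching a fixed target cell, the averaging rule, the hyperparameters, and all function names (`make_q`, `q_update`, `federated_average`, `local_episode`, `train`). The battery, charging station, deceptive target, and fake target from the monitoring model are not represented.

```python
import random


def make_q(n_states, n_actions):
    """Create a zero-initialised Q-table as a list of rows."""
    return [[0.0] * n_actions for _ in range(n_states)]


def q_update(q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update to a local table, in place."""
    td_target = r + gamma * max(q[s_next])
    q[s][a] += alpha * (td_target - q[s][a])


def federated_average(tables):
    """Element-wise mean of the agents' Q-tables.

    Only the tables are combined; the trajectories that produced them
    stay with each agent (hypothetical aggregation rule).
    """
    n = len(tables)
    n_states = len(tables[0])
    n_actions = len(tables[0][0])
    return [
        [sum(t[s][a] for t in tables) / n for a in range(n_actions)]
        for s in range(n_states)
    ]


def local_episode(q, target_cell, rng, n_cells=5, steps=20, epsilon=0.2):
    """Toy 1-D search: action 0 moves left, action 1 moves right.

    The agent earns reward 1.0 whenever its next cell is target_cell.
    """
    s = rng.randrange(n_cells)
    for _ in range(steps):
        if rng.random() < epsilon:
            a = rng.randrange(2)  # explore
        else:
            a = max(range(2), key=lambda x: q[s][x])  # exploit
        s_next = max(0, min(n_cells - 1, s + (1 if a == 1 else -1)))
        r = 1.0 if s_next == target_cell else 0.0
        q_update(q, s, a, r, s_next)
        s = s_next


def train(rounds=50, seed=0, n_agents=2, n_cells=5):
    """Alternate local learning and table averaging for a few rounds."""
    rng = random.Random(seed)
    shared = make_q(n_cells, 2)
    for _ in range(rounds):
        # Each agent starts the round from a copy of the shared table.
        local = [[row[:] for row in shared] for _ in range(n_agents)]
        for q in local:
            local_episode(q, target_cell=n_cells - 1, rng=rng, n_cells=n_cells)
        shared = federated_average(local)
    return shared


if __name__ == "__main__":
    for state, row in enumerate(train()):
        print(state, [round(v, 3) for v in row])
```

Averaging tables each round is only one possible aggregation choice; it is used here because it keeps the sketch short and makes the privacy idea visible, since each agent's visited states and rewards never leave `local_episode`.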
This article is indexed in ScienceDirect and other databases.