TY - GEN
T1 - Collision-Aware UAV Trajectories for Data Collection via Reinforcement Learning
AU - Wang, Xueyuan
AU - Gursoy, M. Cenk
AU - Erpek, Tugba
AU - Sagduyu, Yalin E.
N1 - Publisher Copyright:
© 2021 IEEE.
PY - 2021
Y1 - 2021
N2 - Unmanned aerial vehicles (UAVs) are expected to be an integral part of wireless networks, and determining collision-free trajectories in multi-UAV non-cooperative scenarios is a challenging task. In this paper, we consider a path planning optimization problem to maximize the data collected from multiple Internet of Things (IoT) nodes under realistic constraints. The considered multi-UAV non-cooperative scenarios involve a random number of other UAVs in addition to the typical UAV, and the UAVs do not communicate with each other. We translate the problem into a Markov decision process (MDP). A dueling double deep Q-network (D3QN) is proposed to learn the decision-making policy for the typical UAV, without any prior knowledge of the environment (e.g., the channel propagation model and the locations of obstacles) or of the other UAVs (e.g., their missions, movements, and policies). Numerical results demonstrate that real-time navigation can be performed efficiently with a high success rate, a high data collection rate, and a low collision rate.
AB - Unmanned aerial vehicles (UAVs) are expected to be an integral part of wireless networks, and determining collision-free trajectories in multi-UAV non-cooperative scenarios is a challenging task. In this paper, we consider a path planning optimization problem to maximize the data collected from multiple Internet of Things (IoT) nodes under realistic constraints. The considered multi-UAV non-cooperative scenarios involve a random number of other UAVs in addition to the typical UAV, and the UAVs do not communicate with each other. We translate the problem into a Markov decision process (MDP). A dueling double deep Q-network (D3QN) is proposed to learn the decision-making policy for the typical UAV, without any prior knowledge of the environment (e.g., the channel propagation model and the locations of obstacles) or of the other UAVs (e.g., their missions, movements, and policies). Numerical results demonstrate that real-time navigation can be performed efficiently with a high success rate, a high data collection rate, and a low collision rate.
KW - Data collection
KW - collision avoidance
KW - deep reinforcement learning
KW - multi-UAV scenarios
KW - path planning
UR - http://www.scopus.com/inward/record.url?scp=85127272055&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85127272055&partnerID=8YFLogxK
U2 - 10.1109/GLOBECOM46510.2021.9686015
DO - 10.1109/GLOBECOM46510.2021.9686015
M3 - Conference contribution
AN - SCOPUS:85127272055
T3 - 2021 IEEE Global Communications Conference, GLOBECOM 2021 - Proceedings
BT - 2021 IEEE Global Communications Conference, GLOBECOM 2021 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2021 IEEE Global Communications Conference, GLOBECOM 2021
Y2 - 7 December 2021 through 11 December 2021
ER -