TY - GEN
T1 - Resource Allocation for Multi-target Radar Tracking via Constrained Deep Reinforcement Learning
AU - Lu, Ziyang
AU - Gursoy, M. Cenk
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - In this work, we propose a constrained deep reinforcement learning (CDRL) based approach to address resource allocation for multi-target tracking in a radar system. In the proposed CDRL algorithm, both the parameters of the deep Q-network (DQN) and the dual variable are learned simultaneously. The proposed CDRL framework consists of two components, namely online CDRL and offline CDRL. Training a DQN in a deep reinforcement learning algorithm typically requires a large amount of data, which may not be available in a target tracking task due to the scarcity of measurements. We address this challenge by proposing an offline CDRL framework, in which the algorithm evolves in a virtual environment generated from the current observations and prior knowledge of the environment. Simulation results show that both offline CDRL and online CDRL are critical: offline CDRL provides additional training data to stabilize the learning process, while the online component senses changes in the environment and adapts accordingly.
AB - In this work, we propose a constrained deep reinforcement learning (CDRL) based approach to address resource allocation for multi-target tracking in a radar system. In the proposed CDRL algorithm, both the parameters of the deep Q-network (DQN) and the dual variable are learned simultaneously. The proposed CDRL framework consists of two components, namely online CDRL and offline CDRL. Training a DQN in a deep reinforcement learning algorithm typically requires a large amount of data, which may not be available in a target tracking task due to the scarcity of measurements. We address this challenge by proposing an offline CDRL framework, in which the algorithm evolves in a virtual environment generated from the current observations and prior knowledge of the environment. Simulation results show that both offline CDRL and online CDRL are critical: offline CDRL provides additional training data to stabilize the learning process, while the online component senses changes in the environment and adapts accordingly.
UR - http://www.scopus.com/inward/record.url?scp=85178253228&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85178253228&partnerID=8YFLogxK
U2 - 10.1109/PIMRC56721.2023.10293804
DO - 10.1109/PIMRC56721.2023.10293804
M3 - Conference contribution
AN - SCOPUS:85178253228
T3 - IEEE International Symposium on Personal, Indoor and Mobile Radio Communications, PIMRC
BT - 2023 IEEE 34th Annual International Symposium on Personal, Indoor and Mobile Radio Communications
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 34th IEEE Annual International Symposium on Personal, Indoor and Mobile Radio Communications, PIMRC 2023
Y2 - 5 September 2023 through 8 September 2023
ER -