TY - GEN
T1 - Anomaly Detection under Controlled Sensing Using Actor-Critic Reinforcement Learning
AU - Joseph, Geethu
AU - Gursoy, M. Cenk
AU - Varshney, Pramod K.
N1 - Publisher Copyright:
© 2020 IEEE.
PY - 2020/5
Y1 - 2020/5
N2 - We consider the problem of detecting anomalies among a given set of processes using their noisy binary sensor measurements. The noiseless sensor measurement corresponding to a normal process is 0, and the measurement is 1 if the process is anomalous. The decision-making algorithm is assumed to have no knowledge of the number of anomalous processes. The algorithm is allowed to choose a subset of the sensors at each time instant until the confidence level of the decision exceeds the desired value. Our objective is to design a sequential sensor selection policy that dynamically determines which processes to observe at each time and when to terminate the detection algorithm. The selection policy is designed such that the anomalous processes are detected with the desired confidence level while incurring minimum cost, which comprises the detection delay and the sensing cost. We cast this problem as a sequential hypothesis testing problem within the framework of Markov decision processes, and solve it using an actor-critic deep reinforcement learning algorithm. This deep neural network-based algorithm offers a low-complexity solution with good detection accuracy. We also study the effect of statistical dependence between the processes on the algorithm's performance. Through numerical experiments, we show that our algorithm is able to adapt to any unknown statistical dependence pattern of the processes.
AB - We consider the problem of detecting anomalies among a given set of processes using their noisy binary sensor measurements. The noiseless sensor measurement corresponding to a normal process is 0, and the measurement is 1 if the process is anomalous. The decision-making algorithm is assumed to have no knowledge of the number of anomalous processes. The algorithm is allowed to choose a subset of the sensors at each time instant until the confidence level of the decision exceeds the desired value. Our objective is to design a sequential sensor selection policy that dynamically determines which processes to observe at each time and when to terminate the detection algorithm. The selection policy is designed such that the anomalous processes are detected with the desired confidence level while incurring minimum cost, which comprises the detection delay and the sensing cost. We cast this problem as a sequential hypothesis testing problem within the framework of Markov decision processes, and solve it using an actor-critic deep reinforcement learning algorithm. This deep neural network-based algorithm offers a low-complexity solution with good detection accuracy. We also study the effect of statistical dependence between the processes on the algorithm's performance. Through numerical experiments, we show that our algorithm is able to adapt to any unknown statistical dependence pattern of the processes.
KW - Active hypothesis testing
KW - optimal sequential selection
KW - quickest state estimation
KW - reinforcement learning
UR - http://www.scopus.com/inward/record.url?scp=85090393819&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85090393819&partnerID=8YFLogxK
U2 - 10.1109/SPAWC48557.2020.9154275
DO - 10.1109/SPAWC48557.2020.9154275
M3 - Conference contribution
AN - SCOPUS:85090393819
T3 - IEEE Workshop on Signal Processing Advances in Wireless Communications, SPAWC
BT - 2020 IEEE 21st International Workshop on Signal Processing Advances in Wireless Communications, SPAWC 2020
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 21st IEEE International Workshop on Signal Processing Advances in Wireless Communications, SPAWC 2020
Y2 - 26 May 2020 through 29 May 2020
ER -