TY - GEN

T1 - Unsupervised nonparametric anomaly detection

T2 - 2014 52nd Annual Allerton Conference on Communication, Control, and Computing, Allerton 2014

AU - Zou, Shaofeng

AU - Liang, Yingbin

AU - Poor, H. Vincent

AU - Shi, Xinghua

N1 - Publisher Copyright:
© 2014 IEEE.

PY - 2014/1/30

Y1 - 2014/1/30

N2 - An anomaly detection problem is investigated, in which s out of n sequences are anomalous and need to be detected. Each sequence consists of m independent and identically distributed (i.i.d.) samples drawn either from a nominal distribution p or from an anomalous distribution q that is distinct from p. Neither p nor q is known a priori. Two scenarios respectively with s known and unknown are studied. Distribution-free tests are constructed based on the metric of the maximum mean discrepancy (MMD). It is shown that if the value of s is known, as n goes to infinity, the number m of samples in each sequence should be of order O(log n) or larger to guarantee that the constructed test is exponentially consistent. On the other hand, if the value of s is unknown, the number m of samples in each sequence should be of the order strictly greater than O(log n) to guarantee the constructed test is consistent. The computational complexity of all tests are shown to be polynomial. Numerical results are provided to confirm the theoretic characterization of the performance. Further numerical results on both synthetic data sets and real data sets demonstrate that the MMD-based tests outperform or perform as well as other approaches.

AB - An anomaly detection problem is investigated, in which s out of n sequences are anomalous and need to be detected. Each sequence consists of m independent and identically distributed (i.i.d.) samples drawn either from a nominal distribution p or from an anomalous distribution q that is distinct from p. Neither p nor q is known a priori. Two scenarios respectively with s known and unknown are studied. Distribution-free tests are constructed based on the metric of the maximum mean discrepancy (MMD). It is shown that if the value of s is known, as n goes to infinity, the number m of samples in each sequence should be of order O(log n) or larger to guarantee that the constructed test is exponentially consistent. On the other hand, if the value of s is unknown, the number m of samples in each sequence should be of the order strictly greater than O(log n) to guarantee the constructed test is consistent. The computational complexity of all tests are shown to be polynomial. Numerical results are provided to confirm the theoretic characterization of the performance. Further numerical results on both synthetic data sets and real data sets demonstrate that the MMD-based tests outperform or perform as well as other approaches.

UR - http://www.scopus.com/inward/record.url?scp=84946685199&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84946685199&partnerID=8YFLogxK

U2 - 10.1109/ALLERTON.2014.7028541

DO - 10.1109/ALLERTON.2014.7028541

M3 - Conference contribution

AN - SCOPUS:84946685199

T3 - 2014 52nd Annual Allerton Conference on Communication, Control, and Computing, Allerton 2014

SP - 836

EP - 841

BT - 2014 52nd Annual Allerton Conference on Communication, Control, and Computing, Allerton 2014

PB - Institute of Electrical and Electronics Engineers Inc.

Y2 - 30 September 2014 through 3 October 2014

ER -