Online anomaly detection using random forest

Zhiruo Zhao, Kishan G. Mehrotra, Chilukuri K Mohan

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

Abstract

In this paper, we focus on how to use random-forest-based methods to improve the anomaly detection rate for streaming datasets. The key concept in recent work [12] is to build a random forest in which, in any tree and at any internal node, a feature is randomly selected and the associated data space is partitioned in half. However, the model parameters were predefined, and the efficiency of applying this model under various conditions was not discussed. In this paper, we first give a mathematical justification of the required tree height and number of trees by casting the problem as a classical coupon collector problem. Then we design a majority-voting score combination strategy to combine the results from different anomaly detection trees. Finally, we apply feature clustering to group correlated features together in order to find anomalies that are jointly determined by subsets of features.
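
The abstract names two building blocks without giving formulas: the classical coupon collector problem and majority voting. The sketch below illustrates only those textbook ideas and is not the authors' derivation. It assumes, hypothetically, that each random feature selection is a uniform draw from d features, so the expected number of draws before every feature has been selected at least once is d * H_d. The names `expected_draws` and `majority_vote` are illustrative and do not come from the paper.

```python
from fractions import Fraction


def expected_draws(d: int) -> Fraction:
    """Classical coupon collector: expected number of uniform random draws
    needed to see each of d distinct items at least once, i.e. d * H_d."""
    if d < 1:
        raise ValueError("d must be a positive integer")
    harmonic = sum(Fraction(1, k) for k in range(1, d + 1))
    return d * harmonic


def majority_vote(flags) -> bool:
    """Combine per-tree binary anomaly flags: report an anomaly only when
    strictly more than half of the trees flag the point.
    (Assumption: each tree outputs a boolean; the paper's actual scores
    may be defined differently.)"""
    flags = list(flags)
    return 2 * sum(bool(f) for f in flags) > len(flags)


if __name__ == "__main__":
    # With d = 4 features, 4 * (1 + 1/2 + 1/3 + 1/4) = 25/3 draws on average.
    print(expected_draws(4))
    print(majority_vote([True, True, False]))
```

Under this reading, a bound of the form d * H_d gives a rough sense of how many random splits a forest needs in total before every feature is likely to have been used, which is one way a tree height and tree count could be motivated.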

Original language: English (US)
Title of host publication: Recent Trends and Future Technology in Applied Intelligence - 31st International Conference on Industrial Engineering and Other Applications of Applied Intelligent Systems, IEA/AIE 2018, Proceedings
Publisher: Springer Verlag
Pages: 135-147
Number of pages: 13
ISBN (Print): 9783319920573
DOIs: https://doi.org/10.1007/978-3-319-92058-0_13
State: Published - Jan 1 2018
Event: 31st International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems IEA/AIE 2018 - Montreal, Canada
Duration: Jun 25 2018 to Jun 28 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10868 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 31st International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems IEA/AIE 2018
Country: Canada
City: Montreal
Period: 6/25/18 to 6/28/18

Fingerprint

Random Forest
Anomaly Detection
Casting
Majority Voting
Streaming
Justification
Anomaly
Clustering
Internal
Subset
Vertex of a graph
Model

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Zhao, Z., Mehrotra, K. G., & Mohan, C. K. (2018). Online anomaly detection using random forest. In Recent Trends and Future Technology in Applied Intelligence - 31st International Conference on Industrial Engineering and Other Applications of Applied Intelligent Systems, IEA/AIE 2018, Proceedings (pp. 135-147). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 10868 LNAI). Springer Verlag. https://doi.org/10.1007/978-3-319-92058-0_13

@inproceedings{0d9514dca4e14246808d5ea4c87d4277,
title = "Online anomaly detection using random forest",
author = "Zhiruo Zhao and Mehrotra, {Kishan G.} and Mohan, {Chilukuri K}",
year = "2018",
month = "1",
day = "1",
doi = "10.1007/978-3-319-92058-0_13",
language = "English (US)",
isbn = "9783319920573",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
pages = "135--147",
booktitle = "Recent Trends and Future Technology in Applied Intelligence - 31st International Conference on Industrial Engineering and Other Applications of Applied Intelligent Systems, IEA/AIE 2018, Proceedings",

}

TY  - GEN
T1  - Online anomaly detection using random forest
AU  - Zhao, Zhiruo
AU  - Mehrotra, Kishan G.
AU  - Mohan, Chilukuri K
PY  - 2018/1/1
Y1  - 2018/1/1
UR  - http://www.scopus.com/inward/record.url?scp=85049040399&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85049040399&partnerID=8YFLogxK
U2  - 10.1007/978-3-319-92058-0_13
DO  - 10.1007/978-3-319-92058-0_13
M3  - Conference contribution
SN  - 9783319920573
T3  - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP  - 135
EP  - 147
BT  - Recent Trends and Future Technology in Applied Intelligence - 31st International Conference on Industrial Engineering and Other Applications of Applied Intelligent Systems, IEA/AIE 2018, Proceedings
PB  - Springer Verlag
ER  -