Abstract
We propose a new approach for outlier detection, based on a ranking measure that asks whether a point is 'central' for its nearest neighbours. In our notation, a low cumulative rank implies that the point is central. For instance, a point centrally located in a cluster has a relatively low cumulative sum of ranks because it is among the nearest neighbours of its own nearest neighbours, whereas a point at the periphery of a cluster has a high cumulative sum of ranks because its nearest neighbours are closer to each other than they are to the point. Using ranks avoids the problem of calculating density in the neighbourhood of the point, and this improves performance. Our method performs better than several density-based methods on some synthetic data sets as well as on some real data sets.
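The cumulative sum of ranks described above can be illustrated with a minimal sketch. The abstract does not specify the neighbourhood size, the distance metric, or any normalisation, so the choices here (a parameter `k`, Euclidean distance, an unnormalised sum) and the function name `cumulative_rank_scores` are assumptions made for illustration only, not the authors' implementation.

```python
import numpy as np


def cumulative_rank_scores(X, k):
    """For each point p, sum the rank of p in the neighbour ordering of
    each of p's k nearest neighbours (rank 1 = nearest).
    A high score suggests p is peripheral or outlying."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]

    # Pairwise Euclidean distances (assumed metric).
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(D, np.inf)  # a point is never its own neighbour

    # order[q] lists the other points sorted by distance from q.
    order = np.argsort(D, axis=1, kind="stable")

    # rank[q, p] = position of p in q's ordering (1 = nearest).
    rank = np.empty((n, n), dtype=int)
    rows = np.arange(n)[:, None]
    rank[rows, order] = np.arange(1, n + 1)[None, :]

    # Sum the rank of p over p's own k nearest neighbours.
    knn = order[:, :k]
    return np.array([rank[knn[p], p].sum() for p in range(n)])


# Five evenly spaced points and one far-away point.
X = [[0.0], [1.0], [2.0], [3.0], [4.0], [20.0]]
scores = cumulative_rank_scores(X, k=2)
print(scores)  # the last point receives the largest score
```

In this toy example the isolated point is ranked last by both of its nearest neighbours, so its cumulative rank is the highest, while points inside the cluster score low without any density estimate being computed.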
Original language | English (US) |
---|---|
Pages (from-to) | 518-531 |
Number of pages | 14 |
Journal | Journal of Statistical Computation and Simulation |
Volume | 83 |
Issue number | 3 |
State | Published - Mar 2013 |
Keywords
- neighbourhood sets
- outlier detection
- ranking
ASJC Scopus subject areas
- Statistics and Probability
- Modeling and Simulation
- Statistics, Probability and Uncertainty
- Applied Mathematics