Exponentially Consistent K-Means Clustering Algorithm Based on Kolmogrov-Smirnov Test

Tiexing Wang, Donald J. Bucci, Yingbin Liang, Biao Chen, Pramod Kumar Varshney

Research output: Chapter in Book/Report/Conference proceedingConference contribution

2 Citations (Scopus)

Abstract

This paper studies clustering using a Kolmogorov-Smirnov based K-means algorithm. All data sequences are assumed to be generated by unknown continuous distributions. The pairwise KS distances of the distributions are assumed to be lower bounded by a certain positive constant. The convergence analysis of the proposed algorithms and upper bounds on the error probability are provided for both known and unknown number of clusters. More importantly, it is shown that the probability of error decays exponentially as the sample size of each data sequence goes to infinity, and the error exponent is only a function of the pairwise KS distances of the distributions. the analysis is validated by simulation results.

Original languageEnglish (US)
Title of host publication2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages2296-2300
Number of pages5
Volume2018-April
ISBN (Print)9781538646588
DOIs
StatePublished - Sep 10 2018
Event2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Calgary, Canada
Duration: Apr 15 2018Apr 20 2018

Other

Other2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018
CountryCanada
CityCalgary
Period4/15/184/20/18

Fingerprint

Clustering algorithms
Error probability

Keywords

  • Clustering
  • Exponential consistency
  • K-means algorithm
  • Kolmogorov-Smirnov distance
  • Probability of error

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering

Cite this

Wang, T., Bucci, D. J., Liang, Y., Chen, B., & Varshney, P. K. (2018). Exponentially Consistent K-Means Clustering Algorithm Based on Kolmogrov-Smirnov Test. In 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings (Vol. 2018-April, pp. 2296-2300). [8461730] Institute of Electrical and Electronics Engineers Inc.. https://doi.org/10.1109/ICASSP.2018.8461730

Exponentially Consistent K-Means Clustering Algorithm Based on Kolmogrov-Smirnov Test. / Wang, Tiexing; Bucci, Donald J.; Liang, Yingbin; Chen, Biao; Varshney, Pramod Kumar.

2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings. Vol. 2018-April Institute of Electrical and Electronics Engineers Inc., 2018. p. 2296-2300 8461730.

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Wang, T, Bucci, DJ, Liang, Y, Chen, B & Varshney, PK 2018, Exponentially Consistent K-Means Clustering Algorithm Based on Kolmogrov-Smirnov Test. in 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings. vol. 2018-April, 8461730, Institute of Electrical and Electronics Engineers Inc., pp. 2296-2300, 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018, Calgary, Canada, 4/15/18. https://doi.org/10.1109/ICASSP.2018.8461730
Wang T, Bucci DJ, Liang Y, Chen B, Varshney PK. Exponentially Consistent K-Means Clustering Algorithm Based on Kolmogrov-Smirnov Test. In 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings. Vol. 2018-April. Institute of Electrical and Electronics Engineers Inc. 2018. p. 2296-2300. 8461730 https://doi.org/10.1109/ICASSP.2018.8461730
Wang, Tiexing ; Bucci, Donald J. ; Liang, Yingbin ; Chen, Biao ; Varshney, Pramod Kumar. / Exponentially Consistent K-Means Clustering Algorithm Based on Kolmogrov-Smirnov Test. 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings. Vol. 2018-April Institute of Electrical and Electronics Engineers Inc., 2018. pp. 2296-2300
@inproceedings{0aafa2cf7f49489596d0add9c78301ac,
title = "Exponentially Consistent K-Means Clustering Algorithm Based on Kolmogrov-Smirnov Test",
abstract = "This paper studies clustering using a Kolmogorov-Smirnov based K-means algorithm. All data sequences are assumed to be generated by unknown continuous distributions. The pairwise KS distances of the distributions are assumed to be lower bounded by a certain positive constant. The convergence analysis of the proposed algorithms and upper bounds on the error probability are provided for both known and unknown number of clusters. More importantly, it is shown that the probability of error decays exponentially as the sample size of each data sequence goes to infinity, and the error exponent is only a function of the pairwise KS distances of the distributions. the analysis is validated by simulation results.",
keywords = "Clustering, Exponential consistency, K-means algorithm, Kolmogorov-Smirnov distance, Probability of error",
author = "Tiexing Wang and Bucci, {Donald J.} and Yingbin Liang and Biao Chen and Varshney, {Pramod Kumar}",
year = "2018",
month = "9",
day = "10",
doi = "10.1109/ICASSP.2018.8461730",
language = "English (US)",
isbn = "9781538646588",
volume = "2018-April",
pages = "2296--2300",
booktitle = "2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings",
publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - GEN

T1 - Exponentially Consistent K-Means Clustering Algorithm Based on Kolmogrov-Smirnov Test

AU - Wang, Tiexing

AU - Bucci, Donald J.

AU - Liang, Yingbin

AU - Chen, Biao

AU - Varshney, Pramod Kumar

PY - 2018/9/10

Y1 - 2018/9/10

N2 - This paper studies clustering using a Kolmogorov-Smirnov based K-means algorithm. All data sequences are assumed to be generated by unknown continuous distributions. The pairwise KS distances of the distributions are assumed to be lower bounded by a certain positive constant. The convergence analysis of the proposed algorithms and upper bounds on the error probability are provided for both known and unknown number of clusters. More importantly, it is shown that the probability of error decays exponentially as the sample size of each data sequence goes to infinity, and the error exponent is only a function of the pairwise KS distances of the distributions. the analysis is validated by simulation results.

AB - This paper studies clustering using a Kolmogorov-Smirnov based K-means algorithm. All data sequences are assumed to be generated by unknown continuous distributions. The pairwise KS distances of the distributions are assumed to be lower bounded by a certain positive constant. The convergence analysis of the proposed algorithms and upper bounds on the error probability are provided for both known and unknown number of clusters. More importantly, it is shown that the probability of error decays exponentially as the sample size of each data sequence goes to infinity, and the error exponent is only a function of the pairwise KS distances of the distributions. the analysis is validated by simulation results.

KW - Clustering

KW - Exponential consistency

KW - K-means algorithm

KW - Kolmogorov-Smirnov distance

KW - Probability of error

UR - http://www.scopus.com/inward/record.url?scp=85048524455&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85048524455&partnerID=8YFLogxK

U2 - 10.1109/ICASSP.2018.8461730

DO - 10.1109/ICASSP.2018.8461730

M3 - Conference contribution

SN - 9781538646588

VL - 2018-April

SP - 2296

EP - 2300

BT - 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings

PB - Institute of Electrical and Electronics Engineers Inc.

ER -