Clustering of trend data using joinpoint regression models

Heather Hyune-Ju Kim, Jun Luo, Jeankyung Kim, Huann Sheng Chen, Eric J. Feuer

Research output: Contribution to journal (Article)

9 Scopus citations

Abstract

In this paper, we propose methods to cluster groups of two-dimensional data whose mean functions are piecewise linear into several clusters with common characteristics such as the same slopes. To fit segmented line regression models with common features for each possible cluster, we use a restricted least squares method. In implementing the restricted least squares method, we estimate the maximum number of segments in each cluster by using both the permutation test method and the Bayes information criterion method and then propose to use the Bayes information criterion to determine the number of clusters. For a more effective implementation of the clustering algorithm, we propose a measure of the minimum distance worth detecting and illustrate its use in two examples. We summarize simulation results to study properties of the proposed methods and also prove the consistency of the cluster grouping estimated with a given number of clusters. The presentation and examples in this paper focus on the segmented line regression model with the ordered values of the independent variable, which has been the model of interest in cancer trend analysis, but the proposed method can be applied to a general model with design points either ordered or unordered.
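As a rough illustration of the idea in the abstract, the sketch below fits straight lines that share a common slope within each candidate cluster (a restricted least squares fit) and ranks a few candidate groupings with a BIC-type score. It is a simplified sketch under stated assumptions, not the authors' algorithm: every group is a single line with no joinpoints, only three hand-picked partitions are compared, and the score ln(SSE/n) + p ln(n)/n is an assumed form of the criterion. The names `pooled_slope_fit` and `partition_bic` and the toy data are hypothetical.

```python
import math


def pooled_slope_fit(groups):
    """Restricted least squares for straight lines sharing one slope.

    groups: list of (xs, ys) pairs; each group keeps its own intercept.
    Returns (slope, intercepts, sse).
    """
    sxx = sxy = 0.0
    means = []
    for xs, ys in groups:
        xbar = sum(xs) / len(xs)
        ybar = sum(ys) / len(ys)
        means.append((xbar, ybar))
        sxx += sum((x - xbar) ** 2 for x in xs)
        sxy += sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    slope = sxy / sxx  # common slope pooled over all groups in the cluster
    intercepts = [ybar - slope * xbar for xbar, ybar in means]
    sse = sum(
        (y - a - slope * x) ** 2
        for (xs, ys), a in zip(groups, intercepts)
        for x, y in zip(xs, ys)
    )
    return slope, intercepts, sse


def partition_bic(data, partition):
    """BIC-type score for one candidate clustering (lower is better).

    data: dict mapping group name -> (xs, ys).
    partition: list of clusters, each a list of group names.
    Assumed form: ln(SSE / n) + p * ln(n) / n, where p counts one slope
    per cluster plus one intercept per group.
    """
    n = sum(len(xs) for xs, _ in data.values())
    sse = sum(
        pooled_slope_fit([data[name] for name in cluster])[2]
        for cluster in partition
    )
    p = len(partition) + len(data)
    return math.log(sse / n) + p * math.log(n) / n


# Toy data (hypothetical): A and B share a slope of 2, C has slope -1.
xs = list(range(10))
wiggle = [0.1 * (-1) ** i for i in xs]  # deterministic stand-in for noise
data = {
    "A": (xs, [1.0 + 2.0 * x + w for x, w in zip(xs, wiggle)]),
    "B": (xs, [5.0 + 2.0 * x - w for x, w in zip(xs, wiggle)]),
    "C": (xs, [3.0 - 1.0 * x + w for x, w in zip(xs, wiggle)]),
}

# Hand-picked candidate groupings; a full search would enumerate more.
candidates = [
    [["A", "B", "C"]],
    [["A", "B"], ["C"]],
    [["A"], ["B"], ["C"]],
]
best = min(candidates, key=lambda part: partition_bic(data, part))
print(best)
```

The penalty term is what keeps the fully separate grouping from always winning: splitting A and B lowers the residual sum of squares slightly but costs one extra slope parameter.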

Original language: English (US)
Pages (from-to): 4087-4103
Number of pages: 17
Journal: Statistics in Medicine
Volume: 33
Issue number: 23
State: Published - 2014

Keywords

  • Bayes information criterion
  • clustering
  • joinpoint regression
  • minimum distance worth detecting
  • permutation test

ASJC Scopus subject areas

  • Epidemiology
  • Statistics and Probability
