Strangeness-based feature weighting and classification of gene expression profiles

Shao Haifeng, Yu Bei, Joseph Nadeau

Research output: Chapter in Book/Entry/Poem › Conference contribution

1 Scopus citation


Achieving high classification accuracy is a major challenge in the diagnosis of cancer types based on gene expression profiles. These profiles are notoriously noisy in that a large number of genes might be irrelevant to, or only weakly associated with, disease phenotypes such as tumors. Assigning different weights to genes could reduce the influence of those "noisy" signals and thereby improve classification accuracy. We propose an intuitive and simple approach to cancer classification with feature weighting. Our strangeness-based feature weighting method learns weights for different genes based on their classification performance. Genes with large weights can be used as discriminative genes. We demonstrate that our implementation of the k-NN classifier achieved high classification accuracy on two benchmark cancer data sets. In the case of relatively low accuracy, the proposed method could be used as a feature filter. By combining feature weighting with AdaBoost, we achieved higher classification accuracy (100%) than with strangeness-based k-NN alone.
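
The abstract does not define strangeness or the exact weighting rule, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes a commonly used definition in which a sample's strangeness is the sum of distances to its k nearest same-class neighbors divided by the sum of distances to its k nearest other-class neighbors. It also assumes that each gene is scored on its own and receives a weight inversely proportional to its mean strangeness. The function names `strangeness`, `feature_weights`, and `weighted_knn_predict` are hypothetical, and the AdaBoost step is not shown.

```python
import numpy as np

def strangeness(X, y, k=1):
    """Per-sample strangeness (assumed definition): sum of the k nearest
    same-class distances over the sum of the k nearest other-class distances."""
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=2))      # pairwise Euclidean distances
    alpha = np.empty(X.shape[0])
    for i in range(X.shape[0]):
        same = (y == y[i])
        same[i] = False                       # exclude the sample itself
        other = (y != y[i])
        d_same = np.sort(D[i, same])[:k].sum()
        d_other = np.sort(D[i, other])[:k].sum()
        alpha[i] = d_same / (d_other + 1e-12) # small constant avoids division by zero
    return alpha

def feature_weights(X, y, k=1):
    """Assumed weighting rule: score each gene alone; a lower mean
    strangeness gives a larger weight. Weights are normalized to sum to 1."""
    w = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        w[j] = 1.0 / (strangeness(X[:, [j]], y, k).mean() + 1e-12)
    return w / w.sum()

def weighted_knn_predict(X_train, y_train, X_test, w, k=3):
    """k-NN with a feature-weighted Euclidean distance and majority vote."""
    preds = []
    for x in X_test:
        d = np.sqrt((w * (X_train - x) ** 2).sum(axis=1))
        nearest = y_train[np.argsort(d)[:k]]
        labels, counts = np.unique(nearest, return_counts=True)
        preds.append(labels[np.argmax(counts)])
    return np.array(preds)
```

Under these assumptions, a gene whose values separate the classes cleanly gets a low mean strangeness and therefore a large weight, while a gene whose values interleave across classes contributes little to the distance used by k-NN. Thresholding the weights would give the kind of feature filter the abstract mentions.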

Original language: English (US)
Title of host publication: Proceedings of the 23rd Annual ACM Symposium on Applied Computing, SAC'08
Number of pages: 5
State: Published - 2008
Externally published: Yes
Event: 23rd Annual ACM Symposium on Applied Computing, SAC'08 - Fortaleza, Ceara, Brazil
Duration: Mar 16 2008 to Mar 20 2008

Publication series

Name: Proceedings of the ACM Symposium on Applied Computing


Other: 23rd Annual ACM Symposium on Applied Computing, SAC'08
City: Fortaleza, Ceara


Keywords

  • Cancer classification
  • Feature weighting
  • Gene expression
  • Strangeness

ASJC Scopus subject areas

  • Software

