A hybrid approach to increase the informedness of CE-based data using locus-specific thresholding and machine learning

Michael A. Marciano, Victoria R. Williamson, Jonathan D. Adelman

Research output: Contribution to journal › Article › peer-review

10 Scopus citations


The interpretation of genetic profiles requires a robust and reliable method to discriminate true allelic information from noise, regardless of the instrumentation or methods used. Traditionally, static peak detection thresholds (analytical thresholds) have been applied to capillary electrophoresis-generated data to distinguish true allelic peaks from noise. While these rigid thresholds attempt to conservatively account for baseline variability across instrument runs, samples, capillaries, dye channels, injection times, and voltages, they cannot adapt to that variability, leading to a loss of allelic information that exists below the threshold. The method described herein accounts for this variability by jointly minimizing the incorrect detection of non-allelic artifacts (false positives) and the threshold-induced dropout of true allelic information (false negatives). This is accomplished by using a dynamic locus- and sample-specific analytical threshold and a machine learning-derived probabilistic artifact detection model. The system produced an allele detection accuracy of 97.2%, an 11.4% increase from the lowest static threshold (50 RFU), with a low incidence of incorrectly identified artifacts (0.79%). This adaptive method outperformed static thresholds in the retention of allelic information content at minimal cost.
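As a rough illustration of the contrast drawn in the abstract, the sketch below compares a static 50 RFU threshold with a locus-specific threshold derived from baseline noise. The abstract does not state how the dynamic threshold is computed, so the mean-plus-k-standard-deviations rule, the function names, and all sample values here are assumptions chosen for illustration only. The machine learning artifact model is not shown. Informedness is computed using its standard definition (sensitivity + specificity - 1).

```python
from statistics import mean, stdev

def locus_threshold(baseline_rfu, k=3.0):
    """Hypothetical dynamic threshold for one locus in one sample.

    Assumption: threshold = mean + k * SD of the baseline noise (RFU).
    This is a common convention, not necessarily the paper's method.
    """
    return mean(baseline_rfu) + k * stdev(baseline_rfu)

def call_peaks(peak_heights_rfu, threshold):
    """Keep only the peaks whose height meets or exceeds the threshold."""
    return [h for h in peak_heights_rfu if h >= threshold]

def informedness(tp, fn, tn, fp):
    """Informedness (Youden's J): sensitivity + specificity - 1."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1.0

# Made-up values: baseline noise and candidate peak heights at one locus.
baseline = [4, 6, 5, 7, 3, 5, 6, 4]
peaks = [35, 120, 8, 410]

static_calls = call_peaks(peaks, 50.0)                        # drops the 35 RFU peak
dynamic_calls = call_peaks(peaks, locus_threshold(baseline))  # retains it

print(static_calls, dynamic_calls)
```

With a quiet baseline, the locus-specific threshold falls well below 50 RFU, so a low-level peak that the static threshold would discard is retained. Peaks kept this way would still need to be screened for artifacts, which is the role the abstract assigns to the probabilistic model.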

Original language: English (US)
Pages (from-to): 26-37
Number of pages: 12
Journal: Forensic Science International: Genetics
State: Published - Jul 2018


Keywords

  • Analytical threshold
  • Artifact removal
  • Baseline
  • Capillary electrophoresis
  • Deep learning
  • Forensic DNA
  • Machine learning
  • Random forest
  • Support vector machine

ASJC Scopus subject areas

  • Pathology and Forensic Medicine
  • Genetics

