Evaluating acoustic representations and normalization for rhoticity classification in children with speech sound disorders

Nina R. Benway, Jonathan L. Preston, Asif Salekin, Elaine Hitchcock, Tara McAllister

Research output: Contribution to journal › Article › peer-review

Abstract

The effects of different acoustic representations and normalizations were compared for classifiers predicting perception of children's rhotic versus derhotic /ɹ/. Formant and Mel frequency cepstral coefficient (MFCC) representations for 350 speakers were z-standardized, either relative to values in the same utterance or age-and-sex data for typical /ɹ/. Statistical modeling indicated age-and-sex normalization significantly increased classifier performances. Clinically interpretable formants performed similarly to MFCCs and were endorsed for deep neural network engineering, achieving mean test-participant-specific F1-score = 0.81 after personalization and replication (σx = 0.10, med = 0.83, n = 48). Shapley additive explanations analysis indicated the third formant most influenced fully rhotic predictions.
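The two z-standardization schemes named in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the frames-by-features array layout, and every numeric value (including the "norm" means and standard deviations) are hypothetical and are not taken from the study.

```python
import numpy as np


def z_within_utterance(frames):
    """Z-standardize each feature against the frames of the same utterance."""
    frames = np.asarray(frames, dtype=float)
    mean = frames.mean(axis=0)
    std = frames.std(axis=0)
    # Guard against division by zero for features that are constant in the utterance.
    std = np.where(std == 0, 1.0, std)
    return (frames - mean) / std


def z_against_norms(frames, norm_mean, norm_std):
    """Z-standardize each feature against external reference statistics,
    e.g. values for typical /ɹ/ from speakers of the same age and sex."""
    frames = np.asarray(frames, dtype=float)
    norm_mean = np.asarray(norm_mean, dtype=float)
    norm_std = np.asarray(norm_std, dtype=float)
    return (frames - norm_mean) / norm_std


# Hypothetical F1-F3 values (Hz) for three frames of one utterance; not study data.
utterance = np.array([
    [450.0, 1500.0, 2400.0],
    [470.0, 1550.0, 2300.0],
    [490.0, 1600.0, 2200.0],
])

# Hypothetical age-and-sex reference statistics for typical /ɹ/; not study data.
norm_mean = np.array([500.0, 1500.0, 2000.0])
norm_std = np.array([50.0, 100.0, 200.0])

within = z_within_utterance(utterance)
against = z_against_norms(utterance, norm_mean, norm_std)
```

Within-utterance scores describe only how a frame differs from the rest of its own utterance, whereas scores against external norms keep the information about how far a production sits from typical /ɹ/ for comparable speakers.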

Original language: English (US)
Article number: 025201
Journal: JASA Express Letters
Volume: 4
Issue number: 2
State: Published - Feb 1 2024

ASJC Scopus subject areas

  • Acoustics and Ultrasonics
  • Music
  • Arts and Humanities (miscellaneous)
