Predictive value and discriminant capacity of cepstral- and spectral-based measures during continuous speech

Soren Lowell, Raymond H. Colton, Richard T. Kelley, Sarah A. Mizia

Research output: Contribution to journal › Article

28 Citations (Scopus)

Abstract

Objectives/Hypothesis: The purpose of this study was to determine the relative strength of various cepstral- and spectral-based measures for predicting dysphonia severity and differentiating voice quality types. Study Design: Prospective, quasi-experimental research design. Methods: Twenty-eight dysphonic speakers and 14 normal speakers were included in this study. Among the dysphonic speakers, 14 had a predominant voice quality of breathiness and 14 had a predominant voice quality of roughness. Cepstral and spectral analyses of the first and second sentences of the Rainbow passage were performed, along with perceptual ratings of overall dysphonia severity. Linear regression was performed to determine the predictive capacity of each variable for dysphonia severity, and discriminant analysis determined the combination of variables that optimally differentiated the three voice quality types. Results: A four-factor model that incorporated the cepstral- and spectral-based measures produced an R value of 0.899, explaining 81% of the variance in auditory-perceptual dysphonia severity. Cepstral peak prominence (CPP) showed the greatest predictive contribution to dysphonia severity in the regression model. The discriminant analysis produced two discriminant functions that included both CPP and its standard deviation (CPP SD) as significant contributors (P < 0.001), with an overall classification accuracy for the combined functions of 79%. Conclusions: Acoustic measures reflecting the distribution of harmonic energy and low- to high-frequency energy in continuous speech, along with the variability (standard deviations) of each, were highly predictive of dysphonia severity when combined in a multivariate linear model. Cepstral-based measures showed the highest capacity to discriminate voice quality types, with better classification accuracy for normal and dysphonic-breathy than for dysphonic-rough voices.
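
As a quick consistency check on the reported regression result, assuming the 81% figure refers to the squared multiple correlation coefficient (R^2) of the four-factor model, the two numbers in the abstract agree:

```latex
R^2 = (0.899)^2 \approx 0.808 \approx 81\%
```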

Original language: English (US)
Pages (from-to): 393-400
Number of pages: 8
Journal: Journal of Voice
Volume: 27
Issue number: 4
DOI: 10.1016/j.jvoice.2013.02.005
State: Published - Jul 2013

Fingerprint

Dysphonia
Voice Quality
Discriminant Analysis
Linear Models
Research Design
Acoustics
Prospective Studies

Keywords

  • Acoustic
  • Cepstral
  • Cepstral peak prominence
  • Cepstrum
  • Dysphonia
  • Low-high frequency
  • Spectral
  • Spectrum
  • Voice
  • Voice disorder

ASJC Scopus subject areas

  • Otorhinolaryngology
  • Speech and Hearing
  • LPN and LVN

Cite this

Predictive value and discriminant capacity of cepstral- and spectral-based measures during continuous speech. / Lowell, Soren; Colton, Raymond H.; Kelley, Richard T.; Mizia, Sarah A.

In: Journal of Voice, Vol. 27, No. 4, 07.2013, p. 393-400.

Lowell, Soren ; Colton, Raymond H. ; Kelley, Richard T. ; Mizia, Sarah A. / Predictive value and discriminant capacity of cepstral- and spectral-based measures during continuous speech. In: Journal of Voice. 2013 ; Vol. 27, No. 4. pp. 393-400.
@article{ff28cae8afc54befb2c13e76f99a5b9f,
title = "Predictive value and discriminant capacity of cepstral- and spectral-based measures during continuous speech",
abstract = "Objectives/Hypothesis: The purpose of this study was to determine the relative strength of various cepstral- and spectral-based measures for predicting dysphonia severity and differentiating voice quality types. Study Design: Prospective, quasi-experimental research design. Methods: Twenty-eight dysphonic speakers and 14 normal speakers were included in this study. Among the dysphonic speakers, 14 had a predominant voice quality of breathiness and 14 had a predominant voice quality of roughness. Cepstral and spectral analyses of the first and second sentences of the Rainbow passage were performed, along with perceptual ratings of overall dysphonia severity. Linear regression was performed to determine the predictive capacity of each variable for dysphonia severity, and discriminant analysis determined the combination of variables that optimally differentiated the three voice quality types. Results: A four-factor model that incorporated the cepstral- and spectral-based measures produced an R value of 0.899, explaining 81{\%} of the variance in auditory-perceptual dysphonia severity. Cepstral peak prominence (CPP) showed the greatest predictive contribution to dysphonia severity in the regression model. The discriminant analysis produced two discriminant functions that included both CPP and its standard deviation (CPP SD) as significant contributors (P < 0.001), with an overall classification accuracy for the combined functions of 79{\%}. Conclusions: Acoustic measures reflecting the distribution of harmonic energy and low- to high-frequency energy in continuous speech, along with the variability (standard deviations) of each, were highly predictive of dysphonia severity when combined in a multivariate linear model. Cepstral-based measures showed the highest capacity to discriminate voice quality types, with better classification accuracy for normal and dysphonic-breathy than for dysphonic-rough voices.",
keywords = "Acoustic, Cepstral, Cepstral peak prominence, Cepstrum, Dysphonia, Low-high frequency, Spectral, Spectrum, Voice, Voice disorder",
author = "Lowell, Soren and Colton, {Raymond H.} and Kelley, {Richard T.} and Mizia, {Sarah A.}",
year = "2013",
month = "7",
doi = "10.1016/j.jvoice.2013.02.005",
language = "English (US)",
volume = "27",
pages = "393--400",
journal = "Journal of Voice",
issn = "0892-1997",
publisher = "Mosby Inc.",
number = "4",

}

TY - JOUR

T1 - Predictive value and discriminant capacity of cepstral- and spectral-based measures during continuous speech

AU - Lowell, Soren

AU - Colton, Raymond H.

AU - Kelley, Richard T.

AU - Mizia, Sarah A.

PY - 2013/7

Y1 - 2013/7

AB - Objectives/Hypothesis: The purpose of this study was to determine the relative strength of various cepstral- and spectral-based measures for predicting dysphonia severity and differentiating voice quality types. Study Design: Prospective, quasi-experimental research design. Methods: Twenty-eight dysphonic speakers and 14 normal speakers were included in this study. Among the dysphonic speakers, 14 had a predominant voice quality of breathiness and 14 had a predominant voice quality of roughness. Cepstral and spectral analyses of the first and second sentences of the Rainbow passage were performed, along with perceptual ratings of overall dysphonia severity. Linear regression was performed to determine the predictive capacity of each variable for dysphonia severity, and discriminant analysis determined the combination of variables that optimally differentiated the three voice quality types. Results: A four-factor model that incorporated the cepstral- and spectral-based measures produced an R value of 0.899, explaining 81% of the variance in auditory-perceptual dysphonia severity. Cepstral peak prominence (CPP) showed the greatest predictive contribution to dysphonia severity in the regression model. The discriminant analysis produced two discriminant functions that included both CPP and its standard deviation (CPP SD) as significant contributors (P < 0.001), with an overall classification accuracy for the combined functions of 79%. Conclusions: Acoustic measures reflecting the distribution of harmonic energy and low- to high-frequency energy in continuous speech, along with the variability (standard deviations) of each, were highly predictive of dysphonia severity when combined in a multivariate linear model. Cepstral-based measures showed the highest capacity to discriminate voice quality types, with better classification accuracy for normal and dysphonic-breathy than for dysphonic-rough voices.

KW - Acoustic

KW - Cepstral

KW - Cepstral peak prominence

KW - Cepstrum

KW - Dysphonia

KW - Low-high frequency

KW - Spectral

KW - Spectrum

KW - Voice

KW - Voice disorder

UR - http://www.scopus.com/inward/record.url?scp=84879800016&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84879800016&partnerID=8YFLogxK

U2 - 10.1016/j.jvoice.2013.02.005

DO - 10.1016/j.jvoice.2013.02.005

M3 - Article

C2 - 23684735

AN - SCOPUS:84879800016

VL - 27

SP - 393

EP - 400

JO - Journal of Voice

JF - Journal of Voice

SN - 0892-1997

IS - 4

ER -