Spectral- and cepstral-based measures during continuous speech: Capacity to distinguish dysphonia and consistency within a speaker

Soren Lowell, Raymond H. Colton, Richard T. Kelley, Youngmee C. Hahn

Research output: Contribution to journal › Article

66 Citations (Scopus)

Abstract

Spectral- and cepstral-based acoustic measures are preferable to time-based measures for accurately representing dysphonic voices during continuous speech. Although these measures show promising relationships to perceptual voice quality ratings, less is known regarding their ability to differentiate normal from dysphonic voice during continuous speech and the consistency of these measures across multiple utterances by the same speaker. The purpose of this study was to determine whether spectral moments of the long-term average spectrum (LTAS) (spectral mean, standard deviation, skewness, and kurtosis) and cepstral peak prominence measures were significantly different for speakers with and without voice disorders when assessed during continuous speech. The consistency of these measures within a speaker across utterances was also addressed. Continuous speech samples from 27 subjects without voice disorders and 27 subjects with mixed voice disorders were acoustically analyzed. In addition, voice samples were perceptually rated for overall severity. Acoustic analyses were performed on three continuous speech stimuli from a reading passage: two full sentences and one constituent phrase. Significant between-group differences were found for both cepstral measures and three LTAS measures (P < 0.001): spectral mean, skewness, and kurtosis. These five measures also showed moderate to strong correlations to overall voice severity. Furthermore, high degrees of within-speaker consistency (correlation coefficients ≥0.89) across utterances with varying length and phonemic content were evidenced for both subject groups.
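The four LTAS spectral moments named in the abstract (spectral mean, standard deviation, skewness, and kurtosis) can be illustrated with a minimal sketch. The abstract does not state how the authors computed them, so the code below assumes the common convention of treating the normalized power spectrum as a probability distribution over frequency. The function name `spectral_moments`, its parameters, and the toy spectrum are hypothetical and are not taken from the article.

```python
import math


def spectral_moments(freqs_hz, power):
    """Return (mean, sd, skewness, kurtosis) of a spectrum.

    Assumption: the power values are normalized to sum to 1 and treated
    as weights over frequency. Kurtosis is the plain fourth standardized
    moment (no "minus 3" adjustment); other tools may differ.
    """
    if len(freqs_hz) != len(power):
        raise ValueError("freqs_hz and power must have the same length")
    total = sum(power)
    if total <= 0:
        raise ValueError("spectrum must contain positive energy")

    weights = [p / total for p in power]

    # First moment: energy-weighted mean frequency.
    mean = sum(f * w for f, w in zip(freqs_hz, weights))

    # Second central moment and its square root (spectral SD).
    variance = sum((f - mean) ** 2 * w for f, w in zip(freqs_hz, weights))
    sd = math.sqrt(variance)
    if sd == 0:
        raise ValueError("spectrum has zero spread; skewness is undefined")

    # Third and fourth standardized moments.
    skewness = sum((f - mean) ** 3 * w for f, w in zip(freqs_hz, weights)) / sd ** 3
    kurtosis = sum((f - mean) ** 4 * w for f, w in zip(freqs_hz, weights)) / sd ** 4
    return mean, sd, skewness, kurtosis


# Toy, made-up spectrum that is symmetric around 200 Hz.
toy_freqs = [100.0, 200.0, 300.0]
toy_power = [1.0, 2.0, 1.0]
mean, sd, skewness, kurtosis = spectral_moments(toy_freqs, toy_power)
print(mean, sd, skewness, kurtosis)
```

For a real recording, the frequency and power lists would come from a long-term average spectrum of the utterance. A symmetric spectrum such as the toy one has zero skewness, whereas energy concentrated at low frequencies with a tail toward high frequencies gives positive skewness.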

Original language: English (US)
Journal: Journal of Voice
Volume: 25
Issue number: 5
DOIs: https://doi.org/10.1016/j.jvoice.2010.06.007
State: Published - Sep 2011

Fingerprint

Dysphonia
Voice Disorders
Acoustics
Voice Quality
Aptitude
Reading

Keywords

  • Acoustic
  • Cepstral
  • Dysphonia
  • Long-term average spectrum
  • Spectral
  • Spectral moments
  • Voice

ASJC Scopus subject areas

  • Otorhinolaryngology
  • Speech and Hearing
  • LPN and LVN

Cite this

Spectral- and cepstral-based measures during continuous speech: Capacity to distinguish dysphonia and consistency within a speaker. / Lowell, Soren; Colton, Raymond H.; Kelley, Richard T.; Hahn, Youngmee C.

In: Journal of Voice, Vol. 25, No. 5, 09.2011.

@article{9a6219aea3f9406bbd0e5a4983162e3c,
title = "Spectral- and cepstral-based measures during continuous speech: Capacity to distinguish dysphonia and consistency within a speaker",
abstract = "Spectral- and cepstral-based acoustic measures are preferable to time-based measures for accurately representing dysphonic voices during continuous speech. Although these measures show promising relationships to perceptual voice quality ratings, less is known regarding their ability to differentiate normal from dysphonic voice during continuous speech and the consistency of these measures across multiple utterances by the same speaker. The purpose of this study was to determine whether spectral moments of the long-term average spectrum (LTAS) (spectral mean, standard deviation, skewness, and kurtosis) and cepstral peak prominence measures were significantly different for speakers with and without voice disorders when assessed during continuous speech. The consistency of these measures within a speaker across utterances was also addressed. Continuous speech samples from 27 subjects without voice disorders and 27 subjects with mixed voice disorders were acoustically analyzed. In addition, voice samples were perceptually rated for overall severity. Acoustic analyses were performed on three continuous speech stimuli from a reading passage: two full sentences and one constituent phrase. Significant between-group differences were found for both cepstral measures and three LTAS measures (P < 0.001): spectral mean, skewness, and kurtosis. These five measures also showed moderate to strong correlations to overall voice severity. Furthermore, high degrees of within-speaker consistency (correlation coefficients ≥0.89) across utterances with varying length and phonemic content were evidenced for both subject groups.",
keywords = "Acoustic, Cepstral, Dysphonia, Long-term average spectrum, Spectral, Spectral moments, Voice",
author = "Lowell, Soren and Colton, {Raymond H.} and Kelley, {Richard T.} and Hahn, {Youngmee C.}",
year = "2011",
month = "9",
doi = "10.1016/j.jvoice.2010.06.007",
language = "English (US)",
volume = "25",
journal = "Journal of Voice",
issn = "0892-1997",
publisher = "Mosby Inc.",
number = "5",
}

TY - JOUR

T1 - Spectral- and cepstral-based measures during continuous speech

T2 - Capacity to distinguish dysphonia and consistency within a speaker

AU - Lowell, Soren

AU - Colton, Raymond H.

AU - Kelley, Richard T.

AU - Hahn, Youngmee C.

PY - 2011/9

Y1 - 2011/9

AB - Spectral- and cepstral-based acoustic measures are preferable to time-based measures for accurately representing dysphonic voices during continuous speech. Although these measures show promising relationships to perceptual voice quality ratings, less is known regarding their ability to differentiate normal from dysphonic voice during continuous speech and the consistency of these measures across multiple utterances by the same speaker. The purpose of this study was to determine whether spectral moments of the long-term average spectrum (LTAS) (spectral mean, standard deviation, skewness, and kurtosis) and cepstral peak prominence measures were significantly different for speakers with and without voice disorders when assessed during continuous speech. The consistency of these measures within a speaker across utterances was also addressed. Continuous speech samples from 27 subjects without voice disorders and 27 subjects with mixed voice disorders were acoustically analyzed. In addition, voice samples were perceptually rated for overall severity. Acoustic analyses were performed on three continuous speech stimuli from a reading passage: two full sentences and one constituent phrase. Significant between-group differences were found for both cepstral measures and three LTAS measures (P < 0.001): spectral mean, skewness, and kurtosis. These five measures also showed moderate to strong correlations to overall voice severity. Furthermore, high degrees of within-speaker consistency (correlation coefficients ≥0.89) across utterances with varying length and phonemic content were evidenced for both subject groups.

KW - Acoustic

KW - Cepstral

KW - Dysphonia

KW - Long-term average spectrum

KW - Spectral

KW - Spectral moments

KW - Voice

UR - http://www.scopus.com/inward/record.url?scp=80052330480&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=80052330480&partnerID=8YFLogxK

U2 - 10.1016/j.jvoice.2010.06.007

DO - 10.1016/j.jvoice.2010.06.007

M3 - Article

C2 - 20971612

AN - SCOPUS:80052330480

VL - 25

JO - Journal of Voice

JF - Journal of Voice

SN - 0892-1997

IS - 5

ER -