Formant measurement in children's speech based on spectral filtering

Cited by: 22
Authors
Story, Brad H. [1]
Bunton, Kate [1]
Affiliations
[1] Univ Arizona, Dept Speech Language & Hearing Sci, Speech Acoust Lab, POB 210071, Tucson, AZ 85721 USA
Funding
U.S. National Science Foundation
Keywords
Formant; Vocal tract; Speech analysis; Children's speech; Speech modeling; LINEAR PREDICTION; VOCAL-TRACT; FREQUENCY; VOWELS; SIMULATION; MODEL; COORDINATION; HARMONICS; CEPSTRUM; AIRWAY;
DOI
10.1016/j.specom.2015.11.001
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Children's speech presents a challenging problem for formant frequency measurement. In part, this is because high fundamental frequencies, typical of children's speech production, generate widely spaced harmonic components that may undersample the spectral shape of the vocal tract transfer function. In addition, there is often a weakening of upper harmonic energy and a noise component due to glottal turbulence. The purpose of this study was to develop a formant measurement technique based on cepstral analysis that does not require modification of the cepstrum itself or transformation back to the spectral domain. Instead, a narrow-band spectrum is low-pass filtered with a cutoff point (i.e., cutoff "quefrency" in the terminology of cepstral analysis) to preserve only the spectral envelope. To test the method, speech representative of a 2- to 3-year-old child was simulated with an airway modulation model of speech production. The model, which includes physiologically scaled vocal folds and vocal tract, generates sound output analogous to a microphone signal. The vocal tract resonance frequencies can be calculated independently of the output signal and thus provide test cases for assessing the accuracy of the formant tracking algorithm. When applied to the simulated child-like speech, the spectral filtering approach was shown to provide a clear spectrographic representation of formant change over the time course of the signal and to facilitate tracking of formant frequencies for further analysis. (C) 2015 Elsevier B.V. All rights reserved.
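The core idea of the abstract, low-pass filtering a narrow-band spectrum along the frequency axis so that only the envelope survives, can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the zero-phase Butterworth filter, the 1.5 ms cutoff quefrency, the helper names `resonator` and `spectral_envelope`, and the test signal (an impulse train at 400 Hz passed through two resonators at 1000 Hz and 3000 Hz, which stands in for, but is far simpler than, the airway modulation model).

```python
import numpy as np
from scipy.signal import butter, lfilter, sosfiltfilt


def resonator(x, fc, bw, fs):
    """Two-pole digital resonator with unity gain at 0 Hz (illustrative formant filter)."""
    r = np.exp(-np.pi * bw / fs)
    theta = 2.0 * np.pi * fc / fs
    a = [1.0, -2.0 * r * np.cos(theta), r * r]
    b = [sum(a)]  # scales the response so that the gain at 0 Hz is 1
    return lfilter(b, a, x)


def spectral_envelope(frame, fs, cutoff_quefrency, nfft=4096, order=4):
    """Low-pass filter a log-magnitude spectrum along the frequency axis."""
    windowed = frame * np.hanning(len(frame))
    log_spec = 20.0 * np.log10(np.abs(np.fft.rfft(windowed, nfft)) + 1e-12)
    # The spectrum is sampled every fs/nfft Hz, so treated as a sequence its
    # "Nyquist quefrency" is nfft / (2 * fs) seconds.
    wn = cutoff_quefrency / (nfft / (2.0 * fs))
    sos = butter(order, wn, btype="low", output="sos")
    # Forward-backward filtering keeps peaks aligned in frequency; the even
    # (mirror) padding matches the symmetry of a magnitude spectrum at 0 Hz.
    envelope = sosfiltfilt(sos, log_spec, padtype="even", padlen=512)
    freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)
    return freqs, log_spec, envelope


# Assumed test signal: high-f0 impulse train through two resonances.
fs = 16000
f0 = 400.0
source = np.zeros(int(0.2 * fs))
source[:: int(round(fs / f0))] = 1.0
signal = resonator(resonator(source, 1000.0, 100.0, fs), 3000.0, 150.0, fs)

# Analyse a 64 ms frame from the steady-state part (narrow-band analysis).
frame = signal[1600 : 1600 + 1024]
# Cutoff quefrency chosen below the pitch period 1/f0 = 2.5 ms, so the
# harmonic ripple is removed while the slower envelope variation is kept.
freqs, log_spec, env = spectral_envelope(frame, fs, cutoff_quefrency=1.5e-3)

# Crude first-formant estimate: largest envelope value in an assumed F1 band.
band = (freqs > 300.0) & (freqs < 2000.0)
f1_est = freqs[band][np.argmax(env[band])]
print(f"Estimated F1: {f1_est:.0f} Hz")
```

Because the harmonics are 400 Hz apart, the estimate can only be expected to land near the true resonance rather than exactly on it. The cutoff quefrency is the key tuning parameter in this sketch: set too high, harmonic ripple leaks into the envelope; set too low, neighbouring formants merge.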
Pages: 93-111
Page count: 19