Formant measurement in children's speech based on spectral filtering

Cited by: 22
Authors
Story, Brad H. [1]
Bunton, Kate [1]
Affiliations
[1] Univ Arizona, Dept Speech Language & Hearing Sci, Speech Acoust Lab, POB 210071, Tucson, AZ 85721 USA
Funding
National Science Foundation (USA)
Keywords
Formant; Vocal tract; Speech analysis; Children's speech; Speech modeling; LINEAR PREDICTION; VOCAL-TRACT; FREQUENCY; VOWELS; SIMULATION; MODEL; COORDINATION; HARMONICS; CEPSTRUM; AIRWAY;
DOI
10.1016/j.specom.2015.11.001
Chinese Library Classification
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
Children's speech presents a challenging problem for formant frequency measurement. In part, this is because high fundamental frequencies, typical of children's speech production, generate widely spaced harmonic components that may undersample the spectral shape of the vocal tract transfer function. In addition, there is often a weakening of upper harmonic energy and a noise component due to glottal turbulence. The purpose of this study was to develop a formant measurement technique based on cepstral analysis that does not require modification of the cepstrum itself or transformation back to the spectral domain. Instead, a narrow-band spectrum is low-pass filtered with a cutoff point (i.e., a cutoff "quefrency" in the terminology of cepstral analysis) to preserve only the spectral envelope. To test the method, speech representative of a 2- to 3-year-old child was simulated with an airway modulation model of speech production. The model, which includes physiologically scaled vocal folds and vocal tract, generates sound output analogous to a microphone signal. The vocal tract resonance frequencies can be calculated independently of the output signal and thus provide test cases for assessing the accuracy of the formant tracking algorithm. When applied to the simulated child-like speech, the spectral filtering approach was shown to provide a clear spectrographic representation of formant change over the time course of the signal and to facilitate tracking of formant frequencies for further analysis. © 2015 Elsevier B.V. All rights reserved.
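The core idea of the abstract, low-pass filtering a narrow-band spectrum along the frequency axis so that only the envelope survives, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the windowed-sinc kernel, the tap count, the cutoff of 0.6 times the pitch period, the helper name `lowpass_spectrum`, and the toy spectrum (a smooth two-peak envelope in dB plus a cosine ripple standing in for harmonics at 400 Hz).

```python
import numpy as np


def lowpass_spectrum(spec_db, bin_hz, cutoff_quefrency_s, n_taps=401):
    """Low-pass filter a log spectrum along the frequency axis (zero phase).

    Hypothetical helper: the filter design is an assumption, not the
    design used in the paper.
    """
    if n_taps % 2 == 0:
        raise ValueError("n_taps must be odd so the kernel is symmetric")
    # A ripple of quefrency q seconds completes q * bin_hz cycles per bin.
    fc = cutoff_quefrency_s * bin_hz
    if not 0.0 < fc < 0.5:
        raise ValueError("cutoff quefrency out of range for this bin spacing")
    k = np.arange(n_taps) - (n_taps - 1) // 2
    # Hamming-windowed ideal low-pass kernel, normalised to unit gain at
    # zero quefrency so the overall spectral level is preserved.
    kernel = 2.0 * fc * np.sinc(2.0 * fc * k) * np.hamming(n_taps)
    kernel /= kernel.sum()
    # Reflect-pad both ends so the output has the same length as the input.
    pad = (n_taps - 1) // 2
    padded = np.pad(spec_db, pad, mode="reflect")
    return np.convolve(padded, kernel, mode="valid")


# Toy narrow-band log spectrum (assumed values, not data from the paper).
bin_hz = 10.0
freqs_hz = np.arange(0.0, 8000.0 + bin_hz, bin_hz)
f0_hz = 400.0  # high fundamental, so harmonics are widely spaced
envelope_db = (
    20.0 * np.exp(-0.5 * ((freqs_hz - 1000.0) / 350.0) ** 2)
    + 15.0 * np.exp(-0.5 * ((freqs_hz - 3000.0) / 350.0) ** 2)
)
# Harmonic structure appears as a ripple with period f0 along frequency,
# i.e. at quefrency 1 / f0.
spec_db = envelope_db + 6.0 * np.cos(2.0 * np.pi * freqs_hz / f0_hz)

# Cutoff below the pitch-period quefrency removes the harmonic ripple.
smooth_db = lowpass_spectrum(spec_db, bin_hz, cutoff_quefrency_s=0.6 / f0_hz)

# Simple peak picking on the filtered envelope (local maxima near the top).
inner = smooth_db[1:-1]
is_peak = (
    (inner > smooth_db[:-2])
    & (inner >= smooth_db[2:])
    & (inner > smooth_db.max() - 10.0)
)
formant_estimates_hz = freqs_hz[1:-1][is_peak]
print(formant_estimates_hz)
```

Because the kernel is symmetric, the filtering introduces no shift along the frequency axis, which matters when the peaks are read off as formant estimates. In practice the cutoff quefrency would need to track the speaker's fundamental frequency, and a real narrow-band spectrum would replace the toy one.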
Pages: 93-111
Page count: 19