ROBUST SPEAKING RATE ESTIMATION USING BROAD PHONETIC CLASS RECOGNITION

Cited by: 21
Authors
Yuan, Jiahong [1]
Liberman, Mark [1]
Affiliations
[1] Univ Penn, Philadelphia, PA 19104 USA
Keywords
Speaking rate estimation; syllable detection; robustness; broad phonetic class; speech
DOI
10.1109/ICASSP.2010.5495686
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Robust speaking rate estimation can be useful in automatic speech recognition and speaker identification, and accurate, automatic measures of speaking rate are also relevant for research in linguistics, psychology, and social sciences. In this study we built a broad phonetic class recognizer for speaking rate estimation. We tested the recognizer on a variety of data sets, including laboratory speech, telephone conversations, foreign accented speech, and speech in different languages, and we found that the recognizer's estimates are robust under these sources of variation. We also found that the acoustic models of the broad phonetic classes are more robust than those of the monophones for syllable detection.
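As a rough illustration of how a broad phonetic class recognizer's output could be turned into a speaking rate estimate, the sketch below counts vowel-class segments as syllable nuclei and divides by the utterance duration. The abstract does not specify the class inventory, the segment format, or how pauses are treated, so the labels "V", "C", and "SIL", the (label, start, end) tuples, and the include_pauses switch are all assumptions made for this example, not the authors' implementation.

```python
from typing import List, Tuple

# Hypothetical segment format: (broad class label, start time in s, end time in s).
Segment = Tuple[str, float, float]


def speaking_rate(
    segments: List[Segment],
    vowel_label: str = "V",      # assumed label for the vowel class
    silence_label: str = "SIL",  # assumed label for silence/pause
    include_pauses: bool = True,
) -> float:
    """Return syllables per second, treating each vowel segment as one syllable."""
    if not segments:
        raise ValueError("no segments given")

    # One vowel-class segment is counted as one syllable nucleus.
    n_syllables = sum(1 for label, _, _ in segments if label == vowel_label)

    if include_pauses:
        # Span from the first segment's start to the last segment's end.
        duration = segments[-1][2] - segments[0][1]
    else:
        # Sum only the non-silent segments.
        duration = sum(
            end - start
            for label, start, end in segments
            if label != silence_label
        )

    if duration <= 0:
        raise ValueError("total duration must be positive")
    return n_syllables / duration


# Made-up recognizer output for a 2-second stretch of speech.
example = [
    ("SIL", 0.0, 0.5), ("C", 0.5, 0.6), ("V", 0.6, 0.8), ("C", 0.8, 0.9),
    ("V", 0.9, 1.1), ("SIL", 1.1, 1.5), ("V", 1.5, 1.7), ("C", 1.7, 2.0),
]
rate_with_pauses = speaking_rate(example)
rate_without_pauses = speaking_rate(example, include_pauses=False)
```

Whether pauses are included changes the measure (speaking rate versus articulation rate in common usage), which is why the sketch exposes it as a parameter rather than fixing one choice.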
Pages: 4222-4225
Page count: 4
Related papers
50 in total
  • [1] A Comparison of Broad Phonetic and Acoustic Units for Noise Robust Segment-Based Phonetic Recognition
    Sainath, Tara N.
    Zue, Victor
    [J]. INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008: 2378-2381
  • [2] Automatic Syllable Segmentation using Broad Phonetic Class Information
    Ludusan, Bogdan
    Dupoux, Emmanuel
    [J]. SLTU-2016 5TH WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGIES FOR UNDER-RESOURCED LANGUAGES, 2016, 81: 101-106
  • [3] Broad phonetic class recognition in a Hidden Markov Model framework using extended Baum-Welch transformations
    Sainath, Tara N.
    Kanevsky, Dimitri
    Ramabhadran, Bhuvana
    [J]. 2007 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, VOLS 1 AND 2, 2007: 306-311
  • [4] Using broad phonetic group-experts for improved speech recognition
    Scanlon, Patricia
    Ellis, Daniel P. W.
    Reilly, Richard B.
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2007, 15 (03): 803-812
  • [5] Using Broad Phonetic Classes to Guide Search in Automatic Speech Recognition
    Ziegler, Stefan
    Ludusan, Bogdan
    Gravier, Guillaume
    [J]. 13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012: 1022-1025
  • [6] SOME EFFECTS OF SPEAKING RATE ON PHONETIC PERCEPTION
    MILLER, JL
    [J]. PHONETICA, 1981, 38 (1-3): 159-180
  • [7] EFFECT OF SPEAKING RATE ON THE PERCEPTUAL STRUCTURE OF A PHONETIC CATEGORY
    MILLER, JL
    VOLAITIS, LE
    [J]. PERCEPTION & PSYCHOPHYSICS, 1989, 46 (06): 505-512
  • [8] Broad phonetic class definition driven by phone confusions
    Lopes, Carla
    Perdigao, Fernando
    [J]. EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, 2012: 1-12
  • [9] Broad phonetic class definition driven by phone confusions
    Carla Lopes
    Fernando Perdigão
    [J]. EURASIP Journal on Advances in Signal Processing, 2012
  • [10] Internal structure of phonetic categories: Effects of speaking rate
    Miller, JL
    O'Rourke, TB
    Volaitis, LE
    [J]. PHONETICA, 1997, 54 (3-4): 121-137