ROBUST SPEAKING RATE ESTIMATION USING BROAD PHONETIC CLASS RECOGNITION

Cited by: 21
Authors
Yuan, Jiahong [1 ]
Liberman, Mark [1 ]
Affiliations
[1] Univ Penn, Philadelphia, PA 19104 USA
Keywords
Speaking rate estimation; syllable detection; robustness; broad phonetic class; SPEECH;
DOI
10.1109/ICASSP.2010.5495686
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Robust speaking rate estimation can be useful in automatic speech recognition and speaker identification, and accurate, automatic measures of speaking rate are also relevant for research in linguistics, psychology, and social sciences. In this study we built a broad phonetic class recognizer for speaking rate estimation. We tested the recognizer on a variety of data sets, including laboratory speech, telephone conversations, foreign accented speech, and speech in different languages, and we found that the recognizer's estimates are robust under these sources of variation. We also found that the acoustic models of the broad phonetic classes are more robust than those of the monophones for syllable detection.
Pages: 4222-4225
Page count: 4
Related papers
50 in total
  • [31] Introduction of the speaking rate in the model of speech recognition
    Yousfi, A
    Meziane, A
    [J]. INTERNATIONAL CONFERENCE ON PARALLEL COMPUTING IN ELECTRICAL ENGINEERING - PARELEC 2000, PROCEEDINGS, 2000 : 64 - 66
  • [32] Pre-recognition measures of speaking rate
    Samudravijaya, K
    Singh, SK
    Rao, PVS
    [J]. SPEECH COMMUNICATION, 1998, 24 (01) : 73 - 84
  • [33] Information theoretic optimal vocal tract region selection from real time magnetic resonance images for broad phonetic class recognition
    Prasad, Abhay
    Ghosh, Prasanta Kumar
    [J]. COMPUTER SPEECH AND LANGUAGE, 2016, 39 : 108 - 128
  • [34] Probabilistic Class Histogram Equalization Based on Posterior Mean Estimation for Robust Speech Recognition
    Suh, Youngjoo
    Kim, Hoirin
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2015, 22 (12) : 2421 - 2424
  • [35] Perceptual speech processing and phonetic feature mapping for robust vowel recognition
    Bu, LK
    Chiueh, TD
    [J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2000, 8 (02) : 105 - 114
  • [36] Dialect Recognition using Adapted Phonetic Models
    Shen, Wade
    Chen, Nancy
    Reynolds, Douglas
    [J]. INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008 : 763 - 766
  • [37] Island-Driven Search Using Broad Phonetic Classes
    Sainath, Tara N.
    [J]. 2009 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION & UNDERSTANDING (ASRU 2009), 2009 : 287 - 292
  • [38] ROBUST PARTIAL FACE RECOGNITION USING INSTANCE-TO-CLASS DISTANCE
    Hu, Junlin
    Lu, Jiwen
    Tan, Yap-Peng
    [J]. 2013 IEEE INTERNATIONAL CONFERENCE ON VISUAL COMMUNICATIONS AND IMAGE PROCESSING (IEEE VCIP 2013), 2013
  • [39] Convex Weighting Criteria for Speaking Rate Estimation
    Jiao, Yishan
    Berisha, Visar
    Tu, Ming
    Liss, Julie
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2015, 23 (09) : 1421 - 1430
  • [40] PHONETIC PROTOTYPES - INFLUENCE OF PLACE OF ARTICULATION AND SPEAKING RATE ON THE INTERNAL STRUCTURE OF VOICING CATEGORIES
    VOLAITIS, LE
    MILLER, JL
    [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1992, 92 (02) : 723 - 735