Recognizing GSM digital speech

Cited by: 9
Authors
Gallardo-Antolín, A [1]
Peláez-Moreno, C [1]
Díaz-de-María, F [1]
Affiliations
[1] Univ Carlos III Madrid, Signal Theory & Commun Dept, Leganes 28911, Madrid, Spain
Keywords
coding distortion; Global System for Mobile (GSM) networks; speech coding; speech recognition; tandeming; transmission errors; wireless networks
DOI
10.1109/TSA.2005.853210
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
The Global System for Mobile (GSM) environment poses three main problems for automatic speech recognition (ASR) systems: noisy scenarios, source coding distortion, and transmission errors. The first has already received much attention; source coding distortion and transmission errors, however, must still be explicitly addressed. In this paper, we propose an alternative front-end for speech recognition over GSM networks, specifically designed to be effective against source coding distortion and transmission errors. We suggest extracting the recognition feature vectors directly from the encoded speech (i.e., the bitstream) instead of decoding it and then extracting the feature vectors from the decoded signal. This approach offers two significant advantages. First, the recognition system is affected only by the quantization distortion of the spectral envelope, so it avoids the other sources of distortion introduced by the encoding-decoding process. Second, when transmission errors occur, our front-end becomes more effective because it is not affected by errors in the bits allocated to the excitation signal. We have considered the half-rate and full-rate standard codecs and compared the proposed front-end with the conventional approach in two ASR tasks: speaker-independent isolated digit recognition and speaker-independent continuous speech recognition. In general, our approach outperforms the conventional procedure for a variety of simulated channel conditions. Furthermore, the performance gap widens as the network conditions worsen.
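As a rough illustration of what "extracting feature vectors from the bitstream" can mean, the sketch below turns spectral-envelope parameters into cepstral features without synthesizing a waveform. It is not the authors' front-end: the abstract does not specify the feature type, so the use of LPC cepstra is an assumption. The sketch also assumes the envelope parameters have already been unpacked and dequantized into reflection coefficients (that step is codec-specific and not shown), and that the all-pole model is written as 1/A(z) with A(z) = 1 + sum_k a_k z^-k. The function names `reflection_to_lpc` and `lpc_to_cepstrum` are hypothetical.

```python
from typing import List


def reflection_to_lpc(refl: List[float]) -> List[float]:
    """Step-up recursion: reflection coefficients k_1..k_p -> a_1..a_p.

    Assumes A(z) = 1 + sum_k a_k z^-k; other sign conventions exist.
    """
    a: List[float] = []
    for m, k in enumerate(refl, start=1):
        # a_i(m) = a_i(m-1) + k_m * a_{m-i}(m-1) for i < m, and a_m(m) = k_m
        a = [a[i] + k * a[m - 2 - i] for i in range(m - 1)] + [k]
    return a


def lpc_to_cepstrum(a: List[float], n_ceps: int) -> List[float]:
    """Cepstral coefficients c_1..c_n_ceps of the all-pole model 1/A(z)."""
    p = len(a)
    c: List[float] = []
    for n in range(1, n_ceps + 1):
        # Direct term exists only while n does not exceed the LPC order
        acc = -a[n - 1] if n <= p else 0.0
        # Recursive term: -(1/n) * sum_k k * c_k * a_{n-k}
        for k in range(max(1, n - p), n):
            acc -= (k / n) * c[k - 1] * a[n - k - 1]
        c.append(acc)
    return c


if __name__ == "__main__":
    # Made-up reflection coefficients for one frame (illustrative only)
    frame_refl = [0.5, -0.2, 0.1]
    features = lpc_to_cepstrum(reflection_to_lpc(frame_refl), n_ceps=12)
    print(len(features))
```

Because only the envelope parameters enter the computation, bit errors in the excitation fields of a frame would not change these features, which is the property the abstract highlights.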
Pages: 1186-1205
Page count: 20