Recognition of coded speech transmitted over wireless channels

Cited by: 0
Authors
Gomez, Angel M. [1 ]
Peinado, Antonio M. [1 ]
Sanchez, Victoria [1 ]
Rubio, Antonio J. [1 ]
Affiliation
[1] Univ Granada, Dept Teoria Senal Telemat & Comunicac, Fac Ciencias, E-18071 Granada, Spain
Keywords
speech recognition; remote speech recognition; cellular radio; speech codecs; transmission errors; decoding; decoded speech signal; error compensation; transcoding
DOI
10.1109/TWC.2006.1687779
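Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code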
0808 ; 0809 ;
Abstract
Network-based speech recognition (NSR) and distributed speech recognition (DSR) have been proposed as solutions for bringing speech recognition technologies to mobile environments. NSR is the most straightforward solution, since it does not require any modification of the mobile phone; however, DSR offers higher robustness against codec compression and transmission channel degradation. This paper explores an alternative approach to remote speech recognition that combines the advantages of NSR and DSR. In this scheme, a standard speech codec is used for speech transmission, but recognition is performed from the received codec parameters. In particular, we focus on the effect of transmission channel errors, which can cause a more severe reduction in recognition performance than codec distortion. First, we show that an NSR solution can approach DSR performance through a reconstruction technique combined with an adapted noise reduction technique originally proposed for acoustic noise. Then, these results are improved by working with recognition features extracted directly from the codec bitstream by means of parameter transcoding. The modifications required in current networks in order to access the bitstream are described. Upgrading the network with the tandem free operation (TFO) protocol is an attractive solution. This upgrade not only offers an overall improvement in end-to-end speech quality, but would also allow a recognition performance similar to that obtained by DSR, and even higher in poor channel conditions, when parameter transcoding is applied along with the proposed mitigation techniques.
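The abstract does not say how the parameter transcoding is carried out. As a rough illustration only, the sketch below uses the textbook LPC-to-cepstrum recursion, which derives cepstral coefficients from the linear-prediction coefficients of an all-pole model without synthesizing a waveform. The function name `lpc_to_cepstrum`, its arguments, and the assumption that the codec parameters have already been converted to predictor coefficients are all hypothetical and are not taken from the paper.

```python
import math


def lpc_to_cepstrum(a, n_ceps, gain=1.0):
    """Cepstrum of the all-pole model H(z) = gain / (1 - sum_k a[k-1] * z^-k).

    Hypothetical helper for illustration; not the paper's transcoding method.
    a      -- predictor coefficients a_1..a_p (assumed already decoded)
    n_ceps -- number of cepstral coefficients to compute after c_0
    Returns the list [c_0, c_1, ..., c_n_ceps].
    """
    p = len(a)
    c = [math.log(gain)]  # c_0 is the log of the model gain
    for n in range(1, n_ceps + 1):
        # Direct term a_n exists only while n does not exceed the model order.
        acc = a[n - 1] if n <= p else 0.0
        # Recursive term: sum over k of (k / n) * c_k * a_(n-k).
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c.append(acc)
    return c


# Single-pole check: for H(z) = 1 / (1 - 0.5 z^-1), c_n equals 0.5**n / n.
cepstrum = lpc_to_cepstrum([0.5], 4)
```

In a real system the predictor coefficients would first have to be recovered from whatever spectral representation the codec transmits, and the resulting cepstrum would typically be processed further before recognition; both steps are outside this sketch.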
Pages: 2555-2562
Number of pages: 8
Related papers
50 in total
  • [41] Joint mode selection and unequal error protection for bitplane coded video transmission over wireless channels
    Cai, JF
    Wu, JH
    Ngan, KN
    He, ZH
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2005, 16 (4-5) : 412 - 431
  • [42] A cross-layer approach for energy efficient transmission of progressively coded images over wireless channels
    Costa, C
    Granelli, F
    Katsaggelos, AK
    2005 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), VOLS 1-5, 2005, : 1037 - 1040
  • [43] Compensation of speech coding distortion for wireless speech recognition
    Kim, HK
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2004, E87D (06): : 1596 - 1600
  • [44] Improving speech detection robustness for wireless speech recognition
    Karray, L
    Mauuary, L
    1997 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, PROCEEDINGS, 1997, : 428 - 435
  • [45] Predicting Automatic Speech Recognition Performance over Communication Channels from Instrumental Speech Quality and Intelligibility Scores
    Gallardo, Laura Fernandez
    Moeller, Sebastian
    Beerends, John
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 2939 - 2943
  • [46] A robust scheme for distributed speech recognition over loss-prone packet channels
    Gomez, Angel M.
    Peinado, Antonio M.
    Sanchez, Victoria
    Carmona, Jose L.
    SPEECH COMMUNICATION, 2009, 51 (04) : 390 - 400
  • [47] Source and channel coding for remote speech recognition over error-prone channels
    Bernard, A
    Alwan, A
    2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURALNETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING - VOL IV: SIGNAL PROCESSING FOR COMMUNICATIONS; VOL V: SIGNAL PROCESSING EDUCATION SENSOR ARRAY & MULTICHANNEL SIGNAL PROCESSING AUDIO & ELECTROACOUSTICS; VOL VI: SIGNAL PROCESSING THEORY & METHODS STUDENT FORUM, 2001, : 2613 - 2616
  • [48] Speech recognition: The wireless interface revolution
    Clark, D
    COMPUTER, 2001, 34 (03) : 16 - 18
  • [49] A codec for speech recognition in a wireless system
    Reichl, W
    Weerackody, V
    Potamianos, A
    IEEE/AFCEA EUROCOMM 2000, CONFERENCE RECORD: INFORMATION SYSTEMS FOR ENHANCED PUBLIC SAFETY AND SECURITY, 2000, : 34 - 37
  • [50] CHANNEL-OPTIMIZED ERROR MITIGATION FOR DISTRIBUTED SPEECH RECOGNITION OVER WIRELESS NETWORKS
    Lee, Cheng-Lung
    Chang, Wen-Whei
    JOURNAL OF THE CHINESE INSTITUTE OF ENGINEERS, 2009, 32 (01) : 45 - 51