Noise Adaptive Stream Fusion Based on Feature Component Rejection for Robust Multi-Stream Speech Recognition

Authors
Zhang, Jun [1]
Feng, Yizhi [1]
Ning, Gengxin [1]
Ji, Fei [1]
Affiliation
[1] South China Univ Technol, Sch Elect & Informat Engn, Guangzhou, Guangdong, Peoples R China
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Weighting the stream outputs according to their reliability levels is one of the most common stream fusion methods in multi-stream automatic speech recognition (MS ASR). However, when an MS ASR system works in noisy environments, distortion levels differ not only among the data streams but also among the feature components inside a stream. In this paper, we first propose a feature component rejection approach that provides a function similar to that of missing data techniques while being much easier to apply to different features. We then develop a new stream fusion method that makes use of both inter-stream and intra-stream reliability information by incorporating the proposed feature component rejection approach into the conventional MS HMM. In experiments on the TIDigits connected word recognition task, the proposed stream fusion method shows good noise adaptive ability and, for additive noises, achieves recognition accuracy similar to that of the missing data based stream fusion method.
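The two ideas in the abstract, weighting streams by reliability and rejecting unreliable feature components inside a stream, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a single diagonal-covariance Gaussian per state and the usual multi-stream HMM form in which the state log-likelihood is a weighted sum of per-stream log-likelihoods. The rejection rule used here (drop the `n_reject` lowest-scoring components) is a hypothetical stand-in, since the abstract does not state the paper's criterion, and all function names are illustrative.

```python
import math


def component_log_likelihoods(x, mean, var):
    """Per-component log-density of x under a diagonal-covariance Gaussian."""
    return [
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    ]


def stream_log_likelihood(x, mean, var, n_reject=0):
    """Sum the component scores after dropping the n_reject lowest ones.

    ASSUMPTION: "lowest score" is a placeholder for the paper's actual
    rejection criterion, which the abstract does not specify.
    """
    scores = sorted(component_log_likelihoods(x, mean, var))
    return sum(scores[n_reject:])


def fused_log_likelihood(streams, weights, n_reject):
    """Weighted sum of stream log-likelihoods (conventional MS HMM fusion)."""
    return sum(
        w * stream_log_likelihood(x, mean, var, k)
        for (x, mean, var), w, k in zip(streams, weights, n_reject)
    )


# Toy data: each stream is (observation, state mean, state variance).
clean = ([0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [1.0, 1.0, 1.0])
noisy = ([0.0, 0.0, 10.0], [0.0, 0.0, 0.0], [1.0, 1.0, 1.0])  # last component distorted

# Inter-stream reliability: weights. Intra-stream reliability: rejection counts.
score = fused_log_likelihood([clean, noisy], weights=[0.6, 0.4], n_reject=[0, 1])
```

In this toy case the distorted third component of the second stream is rejected, so that stream's score depends only on its two undistorted components, while the stream weights still control how much each stream contributes overall.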
Pages: 279-283
Page count: 5