Noise Adaptive Stream Fusion Based on Feature Component Rejection for Robust Multi-Stream Speech Recognition

Authors
Zhang, Jun [1]
Feng, Yizhi [1]
Ning, Gengxin [1]
Ji, Fei [1]
Affiliations
[1] South China Univ Technol, Sch Elect & Informat Engn, Guangzhou, Guangdong, Peoples R China
Keywords
DOI
Not available
Chinese Library Classification number
TP301 [Theory and methods];
Discipline classification code
081202;
Abstract
Weighting stream outputs according to their reliability is one of the most common stream fusion methods in multi-stream automatic speech recognition (MS ASR). However, when an MS ASR system operates in noisy environments, distortion levels differ not only among the data streams but also among the feature components within a stream. In this paper, we first propose a feature component rejection approach that provides functionality similar to missing data techniques while being much easier to apply to different features. We then develop a new stream fusion method that exploits both inter-stream and intra-stream reliability information by incorporating the proposed feature component rejection approach into the conventional multi-stream HMM (MS HMM). In experiments on the TIDIGITS connected word recognition task, the proposed stream fusion method shows good noise adaptivity and achieves recognition accuracy similar to that of the missing-data-based stream fusion method for additive noise.
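The general idea of combining inter-stream weighting with intra-stream component rejection can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the rejection criterion or the weight estimation, so the sketch assumes that each stream's log-likelihood is a sum over independent feature components (as with diagonal-covariance Gaussians), that rejection simply drops a component from that sum, and that streams are fused by a weighted sum of log-likelihoods. The names `stream_log_like` and `fuse_streams`, the masks, and all numbers are hypothetical.

```python
def stream_log_like(component_log_likes, keep_mask):
    """Log-likelihood of one stream with rejected components left out.

    Assumption: components are independent, so the stream log-likelihood
    is the sum of per-component log-likelihoods. A False entry in
    keep_mask marks a component judged unreliable (criterion not modeled).
    """
    if len(component_log_likes) != len(keep_mask):
        raise ValueError("one mask entry is required per component")
    return sum(ll for ll, keep in zip(component_log_likes, keep_mask) if keep)


def fuse_streams(stream_log_likes, stream_weights):
    """Weighted sum of stream log-likelihoods.

    This corresponds to raising each stream's likelihood to the power of
    its weight and multiplying, the usual form of stream weighting.
    """
    if len(stream_log_likes) != len(stream_weights):
        raise ValueError("one weight is required per stream")
    return sum(w * ll for w, ll in zip(stream_weights, stream_log_likes))


# Hypothetical example: the third component of stream A is heavily
# distorted and is rejected; stream B keeps all of its components.
stream_a = stream_log_like([-1.0, -2.0, -50.0], [True, True, False])
stream_b = stream_log_like([-1.5, -2.5], [True, True])
fused = fuse_streams([stream_a, stream_b], [0.5, 0.5])
```

In this toy case the distorted component no longer dominates stream A's score, and the stream weights then balance the two streams as in conventional weighting.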
Pages: 279-283
Number of pages: 5