Noise Adaptive Stream Fusion Based on Feature Component Rejection for Robust Multi-Stream Speech Recognition

Cited by: 0
Authors
Zhang, Jun [1]
Feng, Yizhi [1]
Ning, Gengxin [1]
Ji, Fei [1]
Affiliations
[1] South China Univ Technol, Sch Elect & Informat Engn, Guangzhou, Guangdong, Peoples R China
Keywords
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Weighting the stream outputs according to their reliability levels is one of the most common stream fusion methods in multi-stream automatic speech recognition (MS ASR). However, when an MS ASR system operates in noisy environments, distortion levels differ not only among the data streams but also among the feature components within a stream. In this paper, we first propose a feature component rejection approach that provides functionality similar to that of missing data techniques while being much easier to apply to different features. We then develop a new stream fusion method that exploits both inter-stream and intra-stream reliability information by incorporating the proposed feature component rejection approach into the conventional MS HMM. In experiments on the TIDigits connected word recognition task, the proposed stream fusion method shows good noise adaptive ability and achieves recognition accuracy similar to that of the missing data based stream fusion method for additive noise.
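The abstract names two ideas: weighting each stream's output by its reliability, and rejecting unreliable feature components inside a stream. The following is a minimal sketch of how the two can combine, not the authors' implementation. It assumes diagonal-covariance Gaussian state densities, the usual multi-stream HMM rule in which the fused log-likelihood is a weighted sum of per-stream log-likelihoods, and a caller-supplied boolean mask (`keep`) marking which components are retained. The abstract does not say how the paper decides which components to reject, so the mask, the function names, and the parameter names are all hypothetical.

```python
import numpy as np


def diag_gauss_loglik(x, mean, var, keep=None):
    """Log-likelihood of x under a diagonal Gaussian.

    keep is a hypothetical boolean mask; components marked False are
    rejected, i.e. left out of the sum (for a diagonal Gaussian this is
    the same as marginalizing them out).
    """
    x, mean, var = (np.asarray(a, dtype=float) for a in (x, mean, var))
    per_component = -0.5 * (np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)
    if keep is not None:
        per_component = per_component[np.asarray(keep, dtype=bool)]
    return float(per_component.sum())


def fused_loglik(streams, weights):
    """Weighted sum of per-stream log-likelihoods (inter-stream fusion).

    Each entry of streams is a dict with keys x, mean, var and,
    optionally, keep (the intra-stream rejection mask).
    """
    return sum(w * diag_gauss_loglik(**s) for s, w in zip(streams, weights))


if __name__ == "__main__":
    clean = dict(x=[0.1, -0.2], mean=[0.0, 0.0], var=[1.0, 1.0])
    # Second stream: first component assumed heavily distorted, so it is rejected.
    noisy = dict(x=[5.0, 0.3], mean=[0.0, 0.0], var=[1.0, 1.0],
                 keep=[False, True])
    print(fused_loglik([clean, noisy], weights=[0.7, 0.3]))
```

In this sketch the stream weights carry the inter-stream reliability and the masks carry the intra-stream reliability; in a full recognizer both would be applied per HMM state and per frame.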
Pages: 279-283
Page count: 5