Noise Adaptive Stream Fusion Based on Feature Component Rejection for Robust Multi-Stream Speech Recognition

Citations: 0
Authors
Zhang, Jun [1]
Feng, Yizhi [1]
Ning, Gengxin [1]
Ji, Fei [1]
Affiliations
[1] South China Univ Technol, Sch Elect & Informat Engn, Guangzhou, Guangdong, Peoples R China
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory and methods];
Discipline classification code
081202;
Abstract
Weighting stream outputs according to their reliability is one of the most common stream fusion methods in multi-stream automatic speech recognition (MS ASR). However, when an MS ASR system operates in a noisy environment, distortion levels differ not only among the data streams but also among the feature components within each stream. In this paper, we first propose a feature component rejection approach that provides a function similar to that of missing data techniques while being much easier to apply to different features. We then develop a new stream fusion method that exploits both inter-stream and intra-stream reliability information by incorporating the proposed feature component rejection approach into the conventional MS HMM. In experiments on the TI digits connected word recognition task, the proposed stream fusion method shows good noise adaptivity and achieves recognition accuracy similar to that of the missing data based stream fusion method for additive noise.
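The two ideas named in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes diagonal-covariance Gaussian state models, combines streams with the standard weighted sum of per-stream log-likelihoods used in multi-stream HMMs, and "rejects" a feature component by leaving it out of the likelihood sum. The function names, the `keep` mask, and the stream weights are all hypothetical; the abstract does not say how the paper decides which components to reject or how it sets the weights.

```python
import math


def diag_gauss_loglik(x, mean, var, keep=None):
    """Log-likelihood of x under a diagonal Gaussian.

    keep is a hypothetical per-component mask; components marked False
    are rejected, i.e. simply left out of the sum.
    """
    if keep is None:
        keep = [True] * len(x)
    return sum(
        -0.5 * (math.log(2 * math.pi * var[i]) + (x[i] - mean[i]) ** 2 / var[i])
        for i in range(len(x))
        if keep[i]
    )


def fused_loglik(streams, weights):
    """Weighted sum of per-stream log-likelihoods (inter-stream weighting).

    streams is a list of dicts with keys x, mean, var and optionally keep;
    weights holds one assumed reliability weight per stream.
    """
    return sum(w * diag_gauss_loglik(**s) for s, w in zip(streams, weights))


# Example: two streams; the second component of stream 0 is treated as
# corrupted and rejected, and stream 1 is given a lower weight.
streams = [
    {"x": [0.1, 5.0], "mean": [0.0, 0.0], "var": [1.0, 1.0], "keep": [True, False]},
    {"x": [0.3], "mean": [0.0], "var": [1.0]},
]
score = fused_loglik(streams, weights=[0.7, 0.3])
```

For a diagonal Gaussian, leaving a component out of the sum is the same as marginalizing over it, which is why this kind of rejection can play a role similar to missing data techniques without requiring a feature-specific noise model.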
Pages: 279-283
Number of pages: 5