A stream-weight optimization method for multi-stream HMMs based on likelihood value normalization

Authors
Tamura, S [1]
Iwano, K [1]
Furui, S [1]
Affiliations
[1] Tokyo Inst Technol, Dept Comp Sci, Meguro Ku, Tokyo 1528552, Japan
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In the field of audio-visual speech recognition, multi-stream HMMs are widely used, so how to automatically and properly determine stream weight factors using a small data set has become an important research issue. This paper proposes a new stream-weight optimization method based on an output likelihood normalization criterion. In this method, the stream weights are adjusted to equalize the mean values of log likelihood for all HMMs. [...] based on likelihood-ratio maximization, which achieved significant improvement by using a large optimization data set. The new method is evaluated using Japanese connected digit speech recorded in real-world environments. Using 10 seconds of speech data for stream-weight optimization, a 10% absolute accuracy improvement is achieved compared to the result before optimization. By additionally applying MLLR (maximum likelihood linear regression) adaptation, a 23% improvement is obtained over the audio-only scheme.
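The idea of adjusting stream weights so that mean log likelihoods are equalized across HMMs can be illustrated with a minimal Python sketch. This is not the authors' algorithm, whose details the abstract does not give. It assumes two streams (audio and visual) whose weights sum to 1, a simple grid search, and "equalize" read as "minimize the variance of the per-model mean log likelihood". The function names `combine_streams` and `pick_audio_weight` and the array layout are hypothetical.

```python
import numpy as np


def combine_streams(audio_ll, visual_ll, audio_weight):
    """Weighted sum of per-stream log likelihoods.

    Assumption: the two stream weights sum to 1, so the visual
    weight is (1 - audio_weight).
    """
    return audio_weight * audio_ll + (1.0 - audio_weight) * visual_ll


def pick_audio_weight(audio_ll, visual_ll, grid=None):
    """Pick the audio weight that best equalizes per-model means.

    audio_ll, visual_ll: arrays of shape (n_models, n_frames) holding
    frame-level log likelihoods for each HMM (hypothetical layout).
    Returns the grid value for which the variance of the per-model
    mean combined log likelihood is smallest.
    """
    if grid is None:
        grid = np.linspace(0.0, 1.0, 21)
    spreads = [
        np.var(combine_streams(audio_ll, visual_ll, w).mean(axis=1))
        for w in grid
    ]
    return float(grid[int(np.argmin(spreads))])
```

A grid search is used here only because it is easy to follow. With a small optimization set, such as the 10 seconds of speech mentioned above, the per-model means would be estimated from few frames, so any real procedure would need to handle that noise.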
Pages: 469-472
Number of pages: 4