EXTRACTING DEEP BOTTLENECK FEATURES FOR VISUAL SPEECH RECOGNITION

Cited by: 0
Authors
Sui, Chao [1 ]
Togneri, Roberto [2 ]
Bennamoun, Mohammed [1 ]
Affiliations
[1] Univ Western Australia, Sch Comp Sci & Software Engn, Nedlands, WA, Australia
[2] Univ Western Australia, Sch Elect Elect & Comp Engn, Nedlands, WA, Australia
Keywords
Visual speech recognition; stacked denoising auto-encoder; deep bottleneck feature;
DOI
Not available
Chinese Library Classification
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
Motivated by recent progress in the use of deep learning techniques for acoustic speech recognition, we present in this paper a visual deep bottleneck feature (DBNF) learning scheme using a stacked auto-encoder combined with other techniques. Experimental results show that our proposed deep feature learning scheme yields approximately 24% relative improvement in visual speech accuracy. To the best of our knowledge, this is the first study to use deep bottleneck features for visual speech recognition. Our work is the first to show that the deep bottleneck visual feature can achieve a significant accuracy improvement in visual speech recognition.
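The abstract does not give the network details, so the following is only a minimal sketch of the general idea behind bottleneck features from a stacked (denoising) auto-encoder, not the authors' implementation. The layer sizes, the 32x32 frame size, the masking-noise rate, and all function names are hypothetical, and the weights are random rather than trained.

```python
import numpy as np

def mask_noise(x, drop_prob, rng):
    """Denoising corruption: zero out a random fraction of input entries."""
    keep = rng.random(x.shape) >= drop_prob
    return x * keep

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_encoder(layer_sizes, rng):
    """Random (untrained) weights per encoder layer, for illustration only."""
    return [(rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

def bottleneck_features(x, encoder):
    """Run inputs through the stacked encoder layers; the activations of
    the last, narrow layer serve as the bottleneck features."""
    h = x
    for w, b in encoder:
        h = sigmoid(h @ w + b)
    return h

rng = np.random.default_rng(0)
frames = rng.random((8, 1024))                      # 8 hypothetical flattened 32x32 mouth frames
encoder = init_encoder([1024, 512, 256, 40], rng)   # hypothetical sizes; 40-unit bottleneck
noisy = mask_noise(frames, drop_prob=0.2, rng=rng)  # corruption would be applied during training
features = bottleneck_features(frames, encoder)
print(features.shape)  # (8, 40)
```

In a real system each layer would first be trained to reconstruct its clean input from the corrupted one, and the resulting low-dimensional features would then be passed to a recognizer.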
Pages: 1518-1522
Page count: 5