EXTRACTING DEEP BOTTLENECK FEATURES USING STACKED AUTO-ENCODERS

Cited: 0
Authors
Gehring, Jonas [1 ]
Miao, Yajie [2 ]
Metze, Florian [2 ]
Waibel, Alex [1 ,2 ]
Affiliations
[1] Karlsruhe Inst Technol, Interact Syst Lab, D-76021 Karlsruhe, Germany
[2] Carnegie Mellon Univ, Language Technol Inst, Pittsburgh, PA 15213 USA
Keywords
Bottleneck features; Deep learning; Auto-encoders; Networks
DOI
None available
CLC number
O42 [Acoustics]
Discipline codes
070206; 082403
Abstract
In this work, a novel training scheme for generating bottleneck features from deep neural networks is proposed. A stack of denoising auto-encoders is first trained in a layer-wise, unsupervised manner. Afterwards, the bottleneck layer and an additional layer are added and the whole network is fine-tuned to predict target phoneme states. We perform experiments on a Cantonese conversational telephone speech corpus and find that increasing the number of auto-encoders in the network produces more useful features, but requires pre-training, especially when little training data is available. Using additional unlabeled data for pre-training alone yields further gains. Evaluations on larger datasets and on different system setups demonstrate the general applicability of our approach. In terms of word error rate, relative improvements of 9.2% (Cantonese, ML training), 9.3% (Tagalog, BMMI-SAT training), 12% (Tagalog, confusion network combinations with MFCCs), and 8.7% (Switchboard) are achieved.
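The abstract outlines a three-stage recipe: greedy layer-wise pre-training of denoising auto-encoders, insertion of a bottleneck layer plus one additional hidden layer, and supervised fine-tuning against phoneme-state targets. The PyTorch sketch below illustrates that scheme; the layer sizes (360-dimensional input, 1024-unit hidden layers, 42-unit bottleneck, 1000 phoneme states), the 20% masking corruption, and the optimizer settings are illustrative assumptions, not the configuration reported in the paper.

```python
# Sketch of the three-stage scheme: (1) layer-wise pre-training of
# denoising auto-encoders, (2) adding a bottleneck layer and one more
# hidden layer, (3) fine-tuning on phoneme-state targets.
# All sizes and hyper-parameters below are assumed for illustration.
import torch
import torch.nn as nn


def pretrain_dae(encoder, batches, corruption=0.2, epochs=5, lr=1e-3):
    """Train one denoising auto-encoder layer on (possibly encoded) inputs."""
    decoder = nn.Linear(encoder.out_features, encoder.in_features)
    opt = torch.optim.Adam(
        list(encoder.parameters()) + list(decoder.parameters()), lr=lr)
    mse = nn.MSELoss()
    for _ in range(epochs):
        for x in batches:
            noisy = x * (torch.rand_like(x) > corruption).float()  # masking noise
            recon = decoder(torch.sigmoid(encoder(noisy)))
            loss = mse(recon, x)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return encoder


input_dim, hidden, bottleneck, n_states = 360, 1024, 42, 1000  # assumed sizes
batches = [torch.randn(32, input_dim) for _ in range(10)]      # stand-in acoustic features

# (1) Greedy layer-wise pre-training of the auto-encoder stack.
dims = [input_dim, hidden, hidden, hidden]
stack = []
for d_in, d_out in zip(dims[:-1], dims[1:]):
    enc = pretrain_dae(nn.Linear(d_in, d_out), batches)
    stack.append(enc)
    with torch.no_grad():  # encode the data once for the next layer
        batches = [torch.sigmoid(enc(x)) for x in batches]

# (2) Append the bottleneck layer, one additional hidden layer, and the output layer.
layers = []
for enc in stack:
    layers += [enc, nn.Sigmoid()]
layers += [
    nn.Linear(hidden, bottleneck), nn.Sigmoid(),  # bottleneck activations are read here
    nn.Linear(bottleneck, hidden), nn.Sigmoid(),
    nn.Linear(hidden, n_states),                  # phoneme-state classifier
]
network = nn.Sequential(*layers)

# (3) Fine-tune the whole network on labeled frames (random stand-ins here).
opt = torch.optim.Adam(network.parameters(), lr=1e-4)
ce = nn.CrossEntropyLoss()
x = torch.randn(32, input_dim)
y = torch.randint(0, n_states, (32,))
loss = ce(network(x), y)
opt.zero_grad()
loss.backward()
opt.step()
```

After fine-tuning, bottleneck features for a recognizer would be read from the activations of the 42-unit layer rather than from the classifier output.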
Pages: 3377-3381
Page count: 5