Improved Bottleneck Features Using Pretrained Deep Neural Networks

Authors
Yu, Dong
Seltzer, Michael L.
Keywords
bottleneck features; pretraining; deep neural network; deep belief network; LVCSR
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Bottleneck features have been shown to be effective in improving the accuracy of automatic speech recognition (ASR) systems. Conventionally, bottleneck features are extracted from a multi-layer perceptron (MLP) trained to predict context-independent monophone states. The MLP typically has three hidden layers and is trained using the backpropagation algorithm. In this paper, we propose two improvements to the training of bottleneck features, motivated by recent advances in the use of deep neural networks (DNNs) for speech recognition. First, we show how unsupervised pretraining of a DNN enhances the network's discriminative power and improves the bottleneck features it generates. Second, we show that a neural network trained to predict context-dependent senone targets produces better bottleneck features than one trained to predict monophone states. Bottleneck features trained using the proposed methods produced a 16% relative reduction in sentence error rate over conventional bottleneck features on a large vocabulary business search task.
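
As a rough illustration of what "extracting bottleneck features" means, the sketch below forwards one acoustic frame through a small network with three hidden layers and returns the activations of the narrow middle layer. It is not the authors' implementation: the layer sizes, the sigmoid nonlinearity, the random stand-in weights, and the names `make_layer`, `sigmoid_layer`, and `bottleneck_features` are all assumptions made for the example. Pretraining and supervised training are omitted.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Return random weights and zero biases (a stand-in for trained parameters)."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def sigmoid_layer(x, layer):
    """Apply an affine transform followed by a sigmoid (assumed nonlinearity)."""
    weights, biases = layer
    return [
        1.0 / (1.0 + math.exp(-(sum(w * v for w, v in zip(row, x)) + b)))
        for row, b in zip(weights, biases)
    ]


def bottleneck_features(frame, layers, bottleneck_index):
    """Forward a frame up to and including the bottleneck layer and return its activations."""
    h = frame
    for layer in layers[: bottleneck_index + 1]:
        h = sigmoid_layer(h, layer)
    return h


rng = random.Random(0)
# Hypothetical sizes: 39-dim input, then hidden layers of 64, 8 (bottleneck), and 64 units.
sizes = [39, 64, 8, 64]
layers = [make_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

frame = [rng.gauss(0.0, 1.0) for _ in range(sizes[0])]
features = bottleneck_features(frame, layers, bottleneck_index=1)
print(len(features))  # dimensionality of the bottleneck feature vector
```

In a full system, the layers after the bottleneck and the output layer over monophone or senone targets are needed only during training; at feature-extraction time the network is cut at the bottleneck, as `bottleneck_features` does here.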
Pages: 244-247
Page count: 4