Gating Recurrent Enhanced Memory Neural Networks on Language Identification

Cited by: 4
Authors
Geng, Wang [1 ]
Zhao, Yuanyan [1 ]
Wang, Wenfu [1 ]
Cai, Xinyuan [1 ]
Xu, Bo [1 ]
机构
[1] Chinese Academy of Sciences, Institute of Automation, Interactive Digital Media Technology Research Center, Beijing 100190, People's Republic of China
关键词
language identification; gating recurrent neural networks; learnable enhanced memory block; SortaGrad-like training approach; hard sample score acquisition; speaker adaptation; features
DOI
10.21437/Interspeech.2016-684
Chinese Library Classification: O42 [Acoustics]
Subject Classification Codes: 070206; 082403
Abstract
This paper proposes a novel memory neural network structure, the gating recurrent enhanced memory network (GREMN), to model long-range dependencies in temporal sequences for the language identification (LID) task at the acoustic frame level. The proposed GREMN is a stacked gating recurrent neural network (RNN) equipped with a learnable enhanced memory block near the classifier; it aims to capture the long-span history and a certain amount of future context of the sequential input. In addition, two optimization strategies are proposed: a coherent SortaGrad-like training mechanism and a hard sample score acquisition approach. These optimization policies markedly boost the memory-network-based LID system, especially on training material with large disparities. The experimental results confirm that the proposed GREMN possesses a strong ability for sequence modeling and generalization: it obtains about a 5% relative equal error rate (EER) reduction compared with similarly sized gating RNNs and a 38.5% performance improvement over a conventional i-vector based LID system.
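The abstract describes the architecture only at a high level. The following is a minimal sketch in Python (PyTorch) of what a GREMN-style frame-level LID model could look like, assuming the "learnable enhanced memory block" is a trainable weighting (here a 1-D convolution) over a window of past and future hidden states placed just before the classifier. The class name, layer sizes, and context widths are illustrative assumptions, not the paper's actual configuration.

    import torch
    import torch.nn as nn

    class GREMNSketch(nn.Module):
        """Hypothetical GREMN-style model: stacked gating RNN plus a
        learnable context-pooling memory block near the classifier."""
        def __init__(self, feat_dim=40, hidden=512, layers=3,
                     num_langs=10, left_ctx=20, right_ctx=5):
            super().__init__()
            # Stacked gating RNN over the acoustic frame sequence.
            self.rnn = nn.GRU(feat_dim, hidden, num_layers=layers,
                              batch_first=True)
            # Enhanced memory block (assumption): a 1-D convolution over
            # time, i.e. a trainable weighting of long-span history plus a
            # few future frames around each position.
            self.memory = nn.Conv1d(hidden, hidden,
                                    kernel_size=left_ctx + right_ctx + 1)
            self.left_ctx, self.right_ctx = left_ctx, right_ctx
            self.classifier = nn.Linear(hidden, num_langs)

        def forward(self, x):                   # x: (batch, time, feat_dim)
            h, _ = self.rnn(x)                  # (batch, time, hidden)
            h = h.transpose(1, 2)               # (batch, hidden, time)
            # Pad asymmetrically so every output frame sees left_ctx past
            # frames and right_ctx future frames.
            h = nn.functional.pad(h, (self.left_ctx, self.right_ctx))
            m = self.memory(h).transpose(1, 2)  # back to (batch, time, hidden)
            return self.classifier(m)           # frame-level language scores

The SortaGrad-like mechanism is likewise only named in the abstract; the sketch below follows the original SortaGrad recipe (first epoch sorted short-to-long, random shuffling afterwards), which may differ from the paper's variant.

    import random

    def epoch_order(utterances, epoch):
        # Epoch 0: a length curriculum, shortest utterances first, which
        # stabilizes early training on material with large length
        # disparities; later epochs: plain random shuffling.
        if epoch == 0:
            return sorted(utterances, key=len)
        return random.sample(utterances, len(utterances))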
Pages: 3280-3284
Number of pages: 5