An End-to-end Approach to Language Identification in Short Utterances using Convolutional Neural Networks

Authors
Lozano-Diez, Alicia [1]
Zazo-Candil, Ruben [1]
Gonzalez-Dominguez, Javier [1]
Toledano, Doroteo T. [1]
Gonzalez-Rodriguez, Joaquin [1]
Affiliations
[1] Univ Autonoma Madrid, ATVS Biometr Recognit Grp, Madrid, Spain
DOI
Not available
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In this work, we propose an end-to-end approach to the language identification (LID) problem based on Convolutional Deep Neural Networks (CDNNs). The use of CDNNs is mainly motivated by the ability they have shown in modeling speech signals and by their relatively low cost, in terms of number of free parameters, compared with other deep architectures. We evaluate different configurations on a subset of 8 languages from the NIST Language Recognition Evaluation 2009 Voice of America (VOA) dataset, for the task of short test durations (segments of up to 3 seconds of speech). The proposed CDNN-based systems achieve performance comparable to our baseline i-vector system while drastically reducing the number of parameters to tune (at least 100 times fewer parameters). We then combine these CDNN-based systems and the i-vector baseline with a simple fusion at score level. This combination outperforms our best standalone system (up to 11% relative improvement in terms of EER).
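The abstract does not say how the score-level fusion or the EER were computed, so the following is only a minimal sketch under stated assumptions: fusion is taken to be a weighted average of per-trial scores from two systems, and the EER is approximated by sweeping thresholds over the observed scores. The function names, the default weight, and the numbers in the usage note are hypothetical and are not taken from the paper.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping thresholds over all observed scores."""
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), 1.0
    for t in thresholds:
        # Miss rate: fraction of target trials scored below the threshold.
        p_miss = sum(s < t for s in target_scores) / len(target_scores)
        # False-alarm rate: fraction of non-target trials at or above it.
        p_fa = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(p_miss - p_fa)
        if gap < best_gap:
            # Keep the operating point where the two error rates are closest.
            best_gap, eer = gap, (p_miss + p_fa) / 2
    return eer


def fuse_scores(scores_a, scores_b, weight_a=0.5):
    """Assumed fusion rule: weighted average of two systems' scores per trial."""
    return [weight_a * a + (1 - weight_a) * b for a, b in zip(scores_a, scores_b)]


def relative_improvement(baseline_eer, new_eer):
    """Relative EER reduction; 0.11 corresponds to an 11% relative improvement."""
    return (baseline_eer - new_eer) / baseline_eer
```

As a purely illustrative reading of the last function, a hypothetical baseline EER of 10.0% falling to 8.9% after fusion would be an 11% relative improvement. Averaging scores this way assumes both systems produce scores on a comparable scale, so in practice the scores would likely need normalization or calibration first.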
Pages: 403-407
Page count: 5