An End-to-end Approach to Language Identification in Short Utterances using Convolutional Neural Networks

Cited by: 0

Authors:

Lozano-Diez, Alicia [1]

Zazo-Candil, Ruben [1]

Gonzalez-Dominguez, Javier [1]

Toledano, Doroteo T. [1]

Gonzalez-Rodriguez, Joaquin [1]

Affiliations:

[1] Univ Autonoma Madrid, ATVS Biometr Recognit Grp, Madrid, Spain

Source:

16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5 | 2015

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Subject classification codes:

070206; 082403

Abstract:

In this work, we propose an end-to-end approach to the language identification (LID) problem based on Convolutional Deep Neural Networks (CDNNs). The use of CDNNs is motivated mainly by their demonstrated ability to model speech signals and by their relatively low cost, in terms of number of free parameters, compared with other deep architectures. We evaluate different configurations on a subset of 8 languages from the NIST Language Recognition Evaluation 2009 Voice of America (VOA) dataset, for the task of short test durations (segments of up to 3 seconds of speech). The proposed CDNN-based systems achieve performance comparable to that of our baseline i-vector system while drastically reducing the number of parameters to tune (at least 100 times fewer parameters). We then combine these CDNN-based systems and the i-vector baseline with a simple score-level fusion. This combination outperforms our best standalone system (up to 11% relative improvement in equal error rate, EER).

Pages: 403-407

Page count: 5

