Exploiting foreign resources for DNN-based ASR

Cited by: 9
Authors
Motlicek, Petr [1]
Imseng, David [1]
Potard, Blaise [1]
Garner, Philip N. [1]
Himawan, Ivan [1]
Affiliations
[1] Idiap Res Inst, Martigny, Switzerland
Keywords
Automatic speech recognition; Deep learning for speech; Acoustic model adaptation; Semi-supervised training; SPEECH; ALGORITHM; FEATURES;
DOI
10.1186/s13636-015-0058-5
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206; 082403;
Abstract
Manual transcription of audio databases for the development of automatic speech recognition (ASR) systems is a costly and time-consuming process. In the context of deriving acoustic models adapted to a specific application, or in low-resource scenarios, it is therefore essential to explore alternatives capable of improving speech recognition results. In this paper, we investigate the relevance of foreign data characteristics, in particular domain and language, when using this data as an auxiliary data source for training ASR acoustic models based on deep neural networks (DNNs). The acoustic models are evaluated on a challenging bilingual database within the scope of the MediaParl project. Experimental results suggest that in-language (but out-of-domain) data is more beneficial than in-domain (but out-of-language) data when employed in either supervised or semi-supervised training of DNNs. The best performing ASR system, an HMM/GMM acoustic model that exploits a DNN as a discriminatively trained feature extractor, outperforms the best performing HMM/DNN hybrid by about 5% relative (in terms of WER). The accumulated relative gain with respect to the MFCC-HMM/GMM baseline is about 30% WER.
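The abstract reports its gains as relative WER reductions. As a minimal sketch of how such a figure is usually computed, the helper below divides the absolute WER difference by the baseline WER. The function name and the example WER values are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def relative_wer_reduction(baseline_wer: float, system_wer: float) -> float:
    """Return the relative WER reduction over the baseline, in percent.

    Both arguments are word error rates expressed in percent.
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - system_wer) / baseline_wer


# Hypothetical WER values (assumptions for illustration, not results from the paper).
hypothetical_baseline_wer = 20.0
hypothetical_system_wer = 14.0
print(relative_wer_reduction(hypothetical_baseline_wer, hypothetical_system_wer))
```

Note that a relative reduction is larger than the corresponding absolute difference in WER points whenever the baseline WER is below 100%, which is why the two should not be confused when comparing systems.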
Pages: 1 / 10
Number of pages: 10
Related papers
50 in total
  • [41] DNN-Based Speech Synthesis for Arabic: Modelling and Evaluation
    Houidhek, Amal
    Colotte, Vincent
    Mnasri, Zied
    Jouvet, Denis
    STATISTICAL LANGUAGE AND SPEECH PROCESSING, SLSP 2018, 2018, 11171 : 9 - 20
  • [42] DNN-based Models for Speaker Age and Gender Classification
    Qawaqneh, Zakariya
    Abu Mallouh, Arafat
    Barkana, Buket D.
    PROCEEDINGS OF THE 10TH INTERNATIONAL JOINT CONFERENCE ON BIOMEDICAL ENGINEERING SYSTEMS AND TECHNOLOGIES, VOL 4: BIOSIGNALS, 2017, : 106 - 111
  • [43] Robust DNN-Based Recovery of Wideband Spectrum Signals
    Zhang, Xingjian
    Ma, Yuan
    Liu, Yaohui
    Wu, Shaohua
    Jiao, Jian
    Gao, Yue
    Zhang, Qinyu
    IEEE WIRELESS COMMUNICATIONS LETTERS, 2023, 12 (10) : 1712 - 1715
  • [44] Study of DNN-Based Ragweed Detection from Drones
    Lechner, Martin
    Steindl, Lukas
    Jantsch, Axel
    EMBEDDED COMPUTER SYSTEMS: ARCHITECTURES, MODELING, AND SIMULATION, SAMOS 2022, 2022, 13511 : 187 - 199
  • [45] AMAS: A DNN-Based Automatic Manhours Approval System
    Jia, Yunfei
    Lu, Hao
    Hu, Debin
    2022 INTERNATIONAL CONFERENCE ON MECHANICAL, AUTOMATION AND ELECTRICAL ENGINEERING, CMAEE, 2022, : 20 - 24
  • [46] DNN-Based Duration Modeling for Synthesizing Short Sentences
    Nagy, Peter
    Nemeth, Geza
    Speech and Computer, 2016, 9811 : 254 - 261
  • [47] DNN-Based Cepstral Excitation Manipulation for Speech Enhancement
    Elshamy, Samy
    Fingscheidt, Tim
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2019, 27 (11) : 1803 - 1814
  • [48] DVQShare: An Analytics System for DNN-based Video Queries
    Fu, Hao
    Tang, Shanjiang
    Yu, Ce
    Li, Yusen
    Sun, Jizhou
    Liu, Yanjie
    21ST IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND INTERNET COMPUTING (CCGRID 2021), 2021, : 166 - 175
  • [49] A DNN-Based Learning Framework for Continuous Movements Segmentation
    Xiang, Tian-yu
    Zhou, Xiao-Hu
    Xie, Xiao-Liang
    Liu, Shi-Qi
    Feng, Zhen-Qiu
    Gui, Mei-Jiang
    Li, Hao
    Hou, Zeng-Guang
    NEURAL INFORMATION PROCESSING, ICONIP 2023, PT III, 2024, 14449 : 399 - 410
  • [50] Suitability of DNN-based vessel segmentation for SIRT planning
    Kock, Farina
    Thielke, Felix
    Abolmaali, Nasreddin
    Meine, Hans
    Schenk, Andrea
    INTERNATIONAL JOURNAL OF COMPUTER ASSISTED RADIOLOGY AND SURGERY, 2023, 19 (2) : 233 - 240