Exploiting foreign resources for DNN-based ASR

Cited by: 9
Authors
Motlicek, Petr [1]
Imseng, David [1]
Potard, Blaise [1]
Garner, Philip N. [1]
Himawan, Ivan [1]
Affiliations
[1] Idiap Res Inst, Martigny, Switzerland
Source
EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING | 2015
Keywords
Automatic speech recognition; Deep learning for speech; Acoustic model adaptation; Semi-supervised training; SPEECH; ALGORITHM; FEATURES;
DOI
10.1186/s13636-015-0058-5
Chinese Library Classification
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
Manual transcription of audio databases for the development of automatic speech recognition (ASR) systems is a costly and time-consuming process. In the context of deriving acoustic models adapted to a specific application, or in low-resource scenarios, it is therefore essential to explore alternatives capable of improving speech recognition results. In this paper, we investigate the relevance of foreign data characteristics, in particular domain and language, when such data is used as an auxiliary source for training ASR acoustic models based on deep neural networks (DNNs). The acoustic models are evaluated on a challenging bilingual database within the scope of the MediaParl project. Experimental results suggest that in-language (but out-of-domain) data is more beneficial than in-domain (but out-of-language) data when employed in either supervised or semi-supervised training of DNNs. The best performing ASR system, an HMM/GMM acoustic model that exploits a DNN as a discriminatively trained feature extractor, outperforms the best performing HMM/DNN hybrid by about 5% relative in terms of word error rate (WER). The accumulated relative WER gain with respect to the MFCC-HMM/GMM baseline is about 30%.
Pages: 1 / 10
Number of pages: 10
Related papers
50 in total
  • [21] DNN-based Intelligent Beamforming on a Programmable Metasurface
    Li S.
    Fu S.
    Xu F.
    Journal of Radars, 2021, 10 (02) : 259 - 266
  • [22] DNN-based Direction Finding by Time Modulation
    Kim, Donghyun
    Kim, Sung Hoe
    Cha, Seung Gook
    Yoon, Young Joong
    Jang, Byung-Jun
    2020 IEEE INTERNATIONAL SYMPOSIUM ON ANTENNAS AND PROPAGATION AND NORTH AMERICAN RADIO SCIENCE MEETING, 2020, : 439 - 440
  • [23] Attacking DNN-based Intrusion Detection Models
    Zhang, Xingwei
    Zheng, Xiaolong
    Wu, Desheng Dash
    IFAC PAPERSONLINE, 2020, 53 (05) : 415 - 419
  • [24] Integration of DNN based Speech Enhancement and ASR
    Astudillo, Ramon F.
    Correia, Joana
    Trancoso, Isabel
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 3576 - 3580
  • [25] A DNN-based emotional speech synthesis by speaker adaptation
    Yang, Hongwu
    Zhang, Weizhao
    Zhi, Pengpeng
    2018 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2018, : 633 - 637
  • [26] Unsupervised Domain Adaptation for DNN-based Automated Harvesting
    Shkanaev, Aleksandr Yu
    Sholomov, Dmitry L.
    Nikolaev, Dmitry P.
    TWELFTH INTERNATIONAL CONFERENCE ON MACHINE VISION (ICMV 2019), 2020, 11433
  • [27] SPEAKER AND LANGUAGE FACTORIZATION IN DNN-BASED TTS SYNTHESIS
    Fan, Yuchen
    Qian, Yao
    Soong, Frank K.
    He, Lei
    2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016, : 5540 - 5544
  • [28] Analyzing Decision Polygons of DNN-based Classification Methods
    Kim, Jongyoung
    Woo, Seongyoun
    Lee, Wonjun
    Kim, Donghwan
    Lee, Chulhee
    ICINCO: PROCEEDINGS OF THE 17TH INTERNATIONAL CONFERENCE ON INFORMATICS IN CONTROL, AUTOMATION AND ROBOTICS, 2020, : 346 - 351
  • [29] DNN-based Approach to Detect and Classify Pathological Voice
    Chuang, Zong-Ying
    Yu, Xiao-Tong
    Chen, Ji-Ying
    Hsu, Yi-Te
    Xu, Zhe-Zhuang
    Wang, Chi-Te
    Lin, Feng-Chuan
    Fang, Shih-Hau
    2018 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2018, : 5238 - 5241
  • [30] DNN-based Indoor Fingerprinting Localization with WiFi FTM
    Eberechukwu, Paulson
    Park, Hyunwoo
    Laoudias, Christos
    Horsmanheimo, Seppo
    Kim, Sunwoo
    2022 23RD IEEE INTERNATIONAL CONFERENCE ON MOBILE DATA MANAGEMENT (MDM 2022), 2022, : 367 - 371