Combining in-domain and out-of-domain speech data for automatic recognition of disordered speech

Authors
Christensen, H. [1]
Aniol, M. B. [2]
Bell, P. [2]
Green, P. [1]
Hain, T. [1]
King, S. [2]
Swietojanski, P. [2]
Affiliations
[1] Univ Sheffield, Dept Comp Sci, Sheffield S1 4DP, S Yorkshire, England
[2] Univ Edinburgh, Ctr Speech Technol Res, Edinburgh EH8 9AB, Midlothian, Scotland
Keywords
Speech recognition; Tandem features; Deep belief neural network; Disordered speech
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Recently there has been increasing interest in ways of using out-of-domain (OOD) data to improve automatic speech recognition performance in domains where only limited data is available. This paper focuses on one such domain, namely that of disordered speech, for which only very small databases exist, but where normal speech can be considered OOD. Standard approaches for handling small data domains use adaptation from OOD models into the target domain, but here we investigate an alternative approach with its focus on the feature extraction stage: OOD data is used to train feature-generating deep belief neural networks. Using AMI meeting and TED talk datasets, we investigate various tandem-based speaker-independent systems as well as maximum a posteriori adapted speaker-dependent systems. Results on the UAspeech isolated word task of disordered speech are very promising, with our overall best system (using a combination of AMI and TED data) giving a correctness of 62.5%; an increase of 15% on previously best published results based on conventional model adaptation. We show that the relative benefit of using OOD data varies considerably from speaker to speaker and is only loosely correlated with the severity of a speaker's impairments.
Pages: 3609-3612
Page count: 4