Combining in-domain and out-of-domain speech data for automatic recognition of disordered speech

Cited by: 0
Authors
Christensen, H. [1]
Aniol, M. B. [2]
Bell, P. [2]
Green, P. [1]
Hain, T. [1]
King, S. [2]
Swietojanski, P. [2]
Affiliations
[1] Univ Sheffield, Dept Comp Sci, Sheffield S1 4DP, S Yorkshire, England
[2] Univ Edinburgh, Ctr Speech Technol Res, Edinburgh EH8 9AB, Midlothian, Scotland
Keywords
Speech recognition; Tandem features; Deep belief neural network; Disordered speech
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recently there has been increasing interest in using out-of-domain (OOD) data to improve automatic speech recognition performance in domains where only limited data is available. This paper focuses on one such domain, disordered speech, for which only very small databases exist but for which normal speech can be considered OOD. Standard approaches to small-data domains adapt OOD models to the target domain; here we investigate an alternative approach that focuses on the feature extraction stage: OOD data is used to train feature-generating deep belief neural networks. Using the AMI meeting and TED talk datasets, we investigate various tandem-based speaker-independent systems as well as maximum a posteriori (MAP) adapted speaker-dependent systems. Results on the UAspeech isolated-word task of disordered speech are very promising, with our overall best system (using a combination of AMI and TED data) giving a correctness of 62.5%, an increase of 15% on the previously best published results based on conventional model adaptation. We show that the relative benefit of using OOD data varies considerably from speaker to speaker and is only loosely correlated with the severity of a speaker's impairments.
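The abstract mentions maximum a posteriori (MAP) adapted speaker-dependent systems but does not give the adaptation details. As an illustration only, the sketch below shows the standard textbook MAP update for a single Gaussian mean, which interpolates between a prior (speaker-independent) mean and the speaker's own data. The function name `map_adapt_mean`, the prior weight `tau`, and its default value are assumptions made for this example and are not taken from the paper.

```python
def map_adapt_mean(prior_mean, frames, posteriors, tau=10.0):
    """Textbook MAP update of one Gaussian mean (illustrative sketch).

    prior_mean: mean vector of the speaker-independent Gaussian.
    frames:     adaptation feature vectors from the target speaker.
    posteriors: occupation probability of this Gaussian for each frame.
    tau:        prior weight (hypothetical default); must be > 0.
                Larger values keep the result closer to prior_mean.
    """
    dim = len(prior_mean)
    occupancy = sum(posteriors)  # soft count of frames assigned to this Gaussian
    weighted_sum = [0.0] * dim   # posterior-weighted sum of the frames
    for frame, gamma in zip(frames, posteriors):
        for d in range(dim):
            weighted_sum[d] += gamma * frame[d]
    # Interpolate between the prior mean and the data statistics.
    return [
        (tau * prior_mean[d] + weighted_sum[d]) / (tau + occupancy)
        for d in range(dim)
    ]
```

With little adaptation data the occupancy is small and the adapted mean stays near the prior; with more data it moves toward the speaker's own average, which is why this kind of update is commonly used when only a small amount of target-speaker data exists.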
Pages: 3609-3612
Page count: 4