Deep transfer learning for automatic speech recognition: Towards better generalization

Cited by: 20
Authors
Kheddar, Hamza [1]
Himeur, Yassine [2, 3]
Al-Maadeed, Somaya [3]
Amira, Abbes [4, 5]
Bensaali, Faycal [6]
Affiliations
[1] Univ Medea, Elect Engn Dept, LSEA Lab, Medea, Algeria
[2] Univ Dubai, Coll Engn & Informat Technol, Dubai, U Arab Emirates
[3] Qatar Univ, Comp Sci & Engn, Doha, Qatar
[4] Univ Sharjah, Dept Comp Sci, Sharjah, U Arab Emirates
[5] De Montfort Univ, Inst Artificial Intelligence, Leicester, England
[6] Qatar Univ, Dept Elect Engn, Doha, Qatar
Keywords
Automatic speech recognition; Deep transfer learning; Fine-tuning; Domain adaptation; Models fusion; Large language model; DOMAIN ADAPTATION; NEURAL-NETWORK; MODELS; CLASSIFICATION; LIGHTWEIGHT; MULTITASK; ATTACKS; SEARCH; JOINT; TEXT
DOI
10.1016/j.knosys.2023.110851
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Automatic speech recognition (ASR) has recently become an important challenge when using deep learning (DL). It requires large-scale training datasets and high computational and storage resources. Moreover, DL techniques, and machine learning (ML) approaches in general, assume that training and testing data come from the same domain, with the same input feature space and data distribution characteristics. This assumption, however, does not hold in some real-world artificial intelligence (AI) applications. Furthermore, there are situations where gathering real data is challenging or expensive, or where the events of interest occur rarely, so the data requirements of DL models cannot be met. Deep transfer learning (DTL) has been introduced to overcome these issues; it helps develop high-performing models using real datasets that are small or slightly different from, but related to, the training data. This paper presents a comprehensive survey of DTL-based ASR frameworks to shed light on the latest developments and help academics and professionals understand current challenges. Specifically, after presenting the DTL background, a well-designed taxonomy is adopted to present the state of the art. A critical analysis is then conducted to identify the limitations and advantages of each framework. Moving on, a comparative study is introduced to highlight the current challenges before deriving opportunities for future research. (c) 2023 Elsevier B.V. All rights reserved.
Pages: 29
Related papers
50 in total
  • [1] Video surveillance using deep transfer learning and deep domain adaptation: Towards better generalization
    Himeur, Yassine
    Al-Maadeed, Somaya
    Kheddar, Hamza
    Al-Maadeed, Noor
    Abualsaud, Khalid
    Mohamed, Amr
    Khattab, Tamer
    [J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 119
  • [2] Transfer Learning for Automatic Speech Recognition Systems
    Asefisaray, Behnam
    Haznedaroglu, Ali
    Erden, Mustafa
    Arslan, Levent M.
    [J]. 2018 26TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2018
  • [3] Transfer Learning in Automatic Speech Recognition for Serbian
    Popovic, Branislav
    Pakoci, Edvin
    Pekar, Darko
    [J]. 2019 27TH TELECOMMUNICATIONS FORUM (TELFOR 2019), 2019, : 309 - 312
  • [4] DISTRIBUTED DEEP LEARNING STRATEGIES FOR AUTOMATIC SPEECH RECOGNITION
    Zhang, Wei
    Cui, Xiaodong
    Finkler, Ulrich
    Kingsbury, Brian
    Saon, George
    Kung, David
    Picheny, Michael
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 5706 - 5710
  • [5] Multilingual Transfer Learning for Children Automatic Speech Recognition
    Rolland, Thomas
    Abad, Alberto
    Cucchiarini, Catia
    Strik, Helmer
    [J]. LREC 2022: THIRTEEN INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 7314 - 7320
  • [6] TOWARDS STRUCTURED DEEP NEURAL NETWORK FOR AUTOMATIC SPEECH RECOGNITION
    Liao, Yi-Hsiu
    Lee, Hung-yi
    Lee, Lin-shan
    [J]. 2015 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING (ASRU), 2015, : 137 - 144
  • [7] Towards better generalization in quadrotor landing using deep reinforcement learning
    Jiawei Wang
    Teng Wang
    Zichen He
    Wenzhe Cai
    Changyin Sun
    [J]. Applied Intelligence, 2023, 53 : 6195 - 6213
  • [8] Towards better generalization in quadrotor landing using deep reinforcement learning
    Wang, Jiawei
    Wang, Teng
    He, Zichen
    Cai, Wenzhe
    Sun, Changyin
    [J]. APPLIED INTELLIGENCE, 2023, 53 (06) : 6195 - 6213
  • [9] Improving Deep Learning based Automatic Speech Recognition for Gujarati
    Raval, Deepang
    Pathak, Vyom
    Patel, Muktan
    Bhatt, Brijesh
    [J]. ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2022, 21 (03)
  • [10] Study of Deep Learning and CMU Sphinx in Automatic Speech Recognition
    Dhankar, Abhishek
    [J]. 2017 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI), 2017, : 2296 - 2301