Enhancing Automatic Speech Recognition With Personalized Models: Improving Accuracy Through Individualized Fine-Tuning

Cited by: 0
Authors
Brydinskyi, Vitalii [1]
Sabodashko, Dmytro [1]
Khoma, Yuriy [1]
Podpora, Michal [2]
Konovalov, Alexander [3]
Khoma, Volodymyr [4]
Affiliations
[1] Lviv Polytech Natl Univ, Inst Comp Technol Automat & Metrol, UA-79013 Lvov, Ukraine
[2] Opole Univ Technol, Dept Comp Sci, PL-45758 Opole, Poland
[3] Vidby AG, CH-6343 Risch Rotkreuz, Switzerland
[4] Opole Univ Technol, Dept Control Engn, PL-45758 Opole, Poland
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Automatic speech recognition; Transformers; Natural language processing; speech processing; sound recognition
DOI
10.1109/ACCESS.2024.3443811
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Automatic speech recognition (ASR) systems have become increasingly popular in recent years due to their ability to convert spoken language into text. Despite their widespread use, existing speaker-independent ASR systems frequently struggle with variations in speaking styles, accents, and vocal characteristics, which can lead to recognition errors. This study examines the feasibility of personalized ASR systems that adapt to the unique voice attributes of individual speakers, thereby enhancing recognition accuracy. It provides an overview of our methodology, focusing on the design, development, and evaluation of both speaker-independent and personalized ASR systems. The dataset comprised diverse speakers selected from three extensive datasets: TedLIUM-3, CommonVoice, and GoogleVoice, demonstrating the capability of our methodology to accommodate various accents and the challenges of both natural and synthetic voices. In terms of signal classification and interpretation, the personalized model outperformed the speaker-independent variant, improving recognition accuracy for individual speakers by up to ~3% for natural voices and ~10% for synthetic voices. Our findings demonstrate that personalized ASR systems can significantly improve the accuracy of speech recognition for individual speakers and highlight the importance of adapting ASR models to individual voices.
Pages: 116649-116656
Page count: 8