Entropy-Based Dynamic Rescoring with Language Model in E2E ASR Systems

Cited by: 1
Authors
Gong, Zhuo [1]
Saito, Daisuke [1]
Minematsu, Nobuaki [1]
Affiliation
[1] Univ Tokyo, Grad Sch Engn, Dept Elect Engn & Informat Syst, Tokyo 113-8656, Japan
Source
APPLIED SCIENCES-BASEL | 2022, Vol. 12, No. 19
Keywords
speech recognition; language model integration; shallow fusion; beam search; model confidence;
DOI
10.3390/app12199690
Chinese Library Classification (CLC)
O6 [Chemistry]
Discipline Code
0703
Abstract
Language models (LMs) have played a crucial role in automatic speech recognition (ASR), whether as an essential part of a conventional ASR system composed of an acoustic model and an LM, or as an integrated model that enhances the performance of novel end-to-end ASR systems. With the development of machine learning and deep learning, language modeling has made great progress in natural language processing applications. In recent years, efforts have been made to leverage the advantages of novel LMs in ASR. The most common way to apply such an integration is still shallow fusion, because it can be implemented easily with virtually zero overhead while yielding significant improvement. Our method further enhances the applicability of shallow fusion by removing the need for hyperparameter tuning while maintaining similar performance.
Pages: 12
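
The abstract above refers to shallow fusion, i.e., adding a weighted LM log-probability to the end-to-end model's score at each beam-search step, and to a dynamic alternative to a fixed interpolation weight. The Python sketch below illustrates that general idea with an entropy-based weight; the function names, the max_weight parameter, and the exact scaling rule are illustrative assumptions, not the paper's formulation.

import math

def entropy(probs):
    """Shannon entropy of a probability distribution over the vocabulary."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def dynamic_lm_weight(asr_probs, max_weight=0.3):
    """Scale the LM contribution by how uncertain the ASR model is.

    Hypothetical schedule: when the ASR posterior is sharp (low entropy),
    the LM gets little say; when it is flat (high entropy), the LM weight
    grows toward max_weight.
    """
    h = entropy(asr_probs)
    h_max = math.log(len(asr_probs))  # entropy of the uniform distribution
    return max_weight * (h / h_max if h_max > 0 else 0.0)

def fused_scores(asr_probs, lm_probs):
    """Combine ASR and LM token scores in log space (shallow fusion)."""
    lam = dynamic_lm_weight(asr_probs)
    return [
        math.log(pa) + lam * math.log(pl)
        for pa, pl in zip(asr_probs, lm_probs)
    ]

# Toy usage: a 4-token vocabulary at one decoding step.
asr_step = [0.70, 0.15, 0.10, 0.05]   # fairly confident ASR posterior
lm_step  = [0.25, 0.40, 0.20, 0.15]   # LM prediction for the same step
print(fused_scores(asr_step, lm_step))

Tying the LM weight to the normalized entropy of the ASR posterior is one simple way to let the LM contribute more only where the acoustic model is uncertain, which matches the intuition suggested by the title and the "model confidence" keyword.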