Personalization for BERT-based Discriminative Speech Recognition Rescoring

Cited by: 0
Authors
Kolehmainen, Jari [1]
Gu, Yile [1]
Gourav, Aditya [1]
Shivakumar, Prashanth Gurunath [1]
Gandhe, Ankur [1]
Rastrow, Ariya [1]
Bulyko, Ivan [1]
Affiliations
[1] Amazon, Bellevue, WA 98004 USA
Source
INTERSPEECH 2023
Keywords
speech recognition; rescoring; personalization; prompting; gazetteers
DOI
10.21437/Interspeech.2023-990
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
Recognition of personalized content remains a challenge in end-to-end speech recognition. We explore three novel approaches that use personalized content in a neural rescoring step to improve recognition: gazetteers, prompting, and a cross-attention based encoder-decoder model. We use internal de-identified en-US data from interactions with a virtual voice assistant supplemented with personalized named entities to compare these approaches. On a test set with personalized named entities, we show that each of these approaches improves word error rate by over 10%, against a neural rescoring baseline. We also show that on this test set, natural language prompts can improve word error rate by 7% without any training and with a marginal loss in generalization. Overall, gazetteers were found to perform the best with a 10% improvement in word error rate (WER), while also improving WER on a general test set by 1%.
Pages: 366-370
Page count: 5
Related papers
  • [1] RESCOREBERT: DISCRIMINATIVE SPEECH RECOGNITION RESCORING WITH BERT
    Xu, Liyan
    Gu, Yile
    Kolehmainen, Jari
    Khan, Haidar
    Gandhe, Ankur
    Rastrow, Ariya
    Stolcke, Andreas
    Bulyko, Ivan
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 6117 - 6121
  • [2] BERT-based Semantic Model for Rescoring N-best Speech Recognition List
    Fohr, Dominique
    Illina, Irina
    [J]. INTERSPEECH 2021, 2021, : 1867 - 1871
  • [3] Learning to rank with BERT-based confidence models in ASR rescoring
    Wu, Ting-Wei
    Chen, I-Fan
    Gandhe, Ankur
    [J]. INTERSPEECH 2022, 2022, : 1651 - 1655
  • [4] INNOVATIVE BERT-BASED RERANKING LANGUAGE MODELS FOR SPEECH RECOGNITION
    Chiu, Shih-Hsuan
    Chen, Berlin
    [J]. 2021 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP (SLT), 2021, : 266 - 271
  • [5] Scaling Laws for Discriminative Speech Recognition Rescoring Models
    Gu, Yile
    Shivakumar, Prashanth Gurunath
    Kolehmainen, Jari
    Gandhe, Ankur
    Rastrow, Ariya
    Bulyko, Ivan
    [J]. INTERSPEECH 2023, 2023, : 471 - 475
  • [6] BERT-based Ensemble Approaches for Hate Speech Detection
    Mnassri, Khouloud
    Rajapaksha, Praboda
    Farahbakhsh, Reza
    Crespi, Noel
    [J]. 2022 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM 2022), 2022, : 4649 - 4654
  • [7] Discriminative incorporation of explicitly trained tone models into lattice based rescoring for Mandarin speech recognition
    Huang, Hao
    Zhu, Jie
    [J]. 2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 1541 - 1544
  • [8] Micron-BERT: BERT-based Facial Micro-Expression Recognition
    Nguyen, Xuan-Bac
    Duong, Chi Nhan
    Li, Xin
    Gauch, Susan
    Seo, Han-Seok
    Luu, Khoa
    [J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023, : 1482 - 1492
  • [9] ATTRIBUTE BASED LATTICE RESCORING IN SPONTANEOUS SPEECH RECOGNITION
    Chen, I-Fan
    Siniscalchi, Sabato Marco
    Lee, Chin-Hui
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [10] ENEX-FP: A BERT-Based Address Recognition Model
    Li, Min
    Liu, Zeyu
    Li, Gang
    Zhou, Mingle
    Han, Delong
    [J]. ELECTRONICS, 2023, 12 (01)