GEOGRAPHIC LANGUAGE MODELS FOR AUTOMATIC SPEECH RECOGNITION

Cited by: 0
Authors
Xiao, Xiaoqiang [1]
Chen, Hong [1]
Zylak, Mark [1]
Sosa, Daniela [1]
Desu, Suma [1]
Krishnamoorthy, Mahesh [1]
Liu, Daben [1]
Paulik, Matthias [1]
Zhang, Yuchen [1]
Affiliations
[1] Apple Inc, Cupertino, CA 95014 USA
Keywords
speech recognition; language model; Geo-LM; class LM; Combined Statistical Area; MOBILE; VOICE
DOI
Not available
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In this paper, we propose improving automatic speech recognition (ASR) accuracy for local points of interest (POIs) by leveraging a geo-specific language model (Geo-LM). Geographic regions are defined according to U.S. Census Bureau Combined Statistical Areas. For each user, a class-based Geo-LM is constructed dynamically within a difference-LM-based weighted finite-state transducer (WFST) system, according to the user's associated geographic region. The benefits of this approach include improved accuracy for local POI name recognition, flexibility in training, and efficient LM construction at runtime. Our experiments show that, compared to a baseline nationwide general LM, the proposed Geo-LM achieves an average relative word error rate (WER) reduction of over 18% on local POI search tasks, with no degradation in general accuracy and a very limited increase in latency. In addition to accuracy improvement, we also discuss optimization of runtime efficiency.
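The class-based idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' WFST implementation: the toy probabilities, the region keys, the `$POI` slot name, and the functions `poi_log_prob` and `relative_wer_reduction` are all hypothetical and chosen only for illustration. The sketch assumes a general LM that predicts a POI class slot, with the names inside that slot scored by a table selected from the user's region.

```python
import math

# Hypothetical toy data; none of these names or numbers come from the paper.
# General LM: P(token | history), where "$POI" stands for the POI class slot.
GENERAL_LM = {
    ("directions", "to"): {"$POI": 0.6, "home": 0.4},
}

# Region-specific class members: P(name | $POI, region).
REGION_POI = {
    "san_francisco_bay": {"ferry building": 0.7, "coit tower": 0.3},
    "boston": {"faneuil hall": 0.8, "fenway park": 0.2},
}


def poi_log_prob(history, name, region):
    """Log P(name | history) = log P($POI | history) + log P(name | $POI, region)."""
    p_class = GENERAL_LM.get(history, {}).get("$POI", 0.0)
    p_name = REGION_POI.get(region, {}).get(name, 0.0)
    p = p_class * p_name
    # Names outside the selected region's table get zero probability here;
    # a real system would back off to a general model instead.
    return math.log(p) if p > 0 else float("-inf")


def relative_wer_reduction(baseline_wer, new_wer):
    """Relative WER reduction: (baseline - new) / baseline."""
    return (baseline_wer - new_wer) / baseline_wer


if __name__ == "__main__":
    history = ("directions", "to")
    print(poi_log_prob(history, "fenway park", "boston"))
    print(poi_log_prob(history, "fenway park", "san_francisco_bay"))
    print(relative_wer_reduction(10.0, 8.2))
```

Only the small per-region table changes between users in this sketch, which loosely mirrors the abstract's claim that the LM can be assembled efficiently at runtime. The last function shows the arithmetic behind a relative figure such as 18%: a hypothetical drop from 10.0% to 8.2% WER is a relative reduction of 0.18.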
Pages: 6124-6128
Page count: 5