Latent semantic language modeling for speech recognition

Citations: 0
Authors
Bellegarda, JR [1]
Affiliation
[1] Apple Comp Inc, Spoken Language Grp, Cupertino, CA 95014 USA
Keywords
statistical language modeling; multi-span integration; n-grams; latent semantic analysis; speech recognition;
DOI
Not available
Chinese Library Classification
O42 [Acoustics];
Discipline classification codes
070206; 082403;
Abstract
Statistical language models used in large vocabulary speech recognition must properly capture the various constraints, both local and global, present in the language. While n-gram modeling readily accounts for the former, it has been more difficult to handle the latter, and in particular long-term semantic dependencies, within a suitable data-driven formalism. This paper focuses on the use of latent semantic analysis (LSA) for this purpose. The LSA paradigm automatically uncovers meaningful associations in the language based on word-document co-occurrences in a given corpus. The resulting semantic knowledge is encapsulated in a (continuous) vector space of comparatively low dimension, into which all (discrete) words and documents considered are mapped. Comparison in this space is done through a simple similarity measure, so familiar clustering techniques can be applied. This leads to a powerful framework for both automatic semantic classification and semantic language modeling. In the latter case, the large-span nature of LSA models makes them particularly well suited to complement conventional n-grams. This synergy can be harnessed through an integrative formulation, in which latent semantic knowledge is exploited to judiciously adjust the usual n-gram probability. The paper concludes with a discussion of intrinsic trade-offs, such as the influence of training data selection on the resulting performance enhancement.
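As a rough illustration of the integrative idea described in the abstract, the sketch below rescales n-gram probabilities by a semantic similarity score and renormalizes. It is not the paper's formulation: the function names, the toy 2-dimensional vectors, the exponent `gamma`, and the mapping of cosine similarity to a positive weight are all assumptions made for this example. In a real LSA system the vectors would come from a decomposition of a word-document co-occurrence matrix.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rescale_ngram(ngram_probs, word_vecs, context_vec, gamma=1.0):
    """Adjust n-gram probabilities using a semantic weight, then renormalize.

    Hypothetical weighting: cosine in [-1, 1] is mapped to [0, 1] and raised
    to the power gamma. This choice is an assumption, not the paper's method.
    """
    scores = {}
    for word, prob in ngram_probs.items():
        sim = cosine(word_vecs[word], context_vec)
        weight = ((1.0 + sim) / 2.0) ** gamma
        scores[word] = prob * weight
    total = sum(scores.values())
    return {word: score / total for word, score in scores.items()}


# Toy data (invented for illustration): candidate next words after some history.
ngram_probs = {"stock": 0.2, "bank": 0.5, "river": 0.3}

# Made-up vectors in a 2-dimensional "semantic space";
# axis 0 is loosely "finance", axis 1 is loosely "nature".
word_vecs = {
    "stock": [1.0, 0.1],
    "bank": [0.7, 0.7],
    "river": [0.1, 1.0],
}

# Vector standing in for the document seen so far (a finance-like context).
context_vec = [1.0, 0.2]

adjusted = rescale_ngram(ngram_probs, word_vecs, context_vec)
for word in ngram_probs:
    print(f"{word}: n-gram={ngram_probs[word]:.3f} adjusted={adjusted[word]:.3f}")
```

With a finance-like context, the adjusted probability of "stock" rises and that of "river" falls relative to the n-gram values, which is the qualitative effect the abstract attributes to combining large-span semantic knowledge with local n-gram constraints.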
Pages: 73-103
Page count: 31