Learning to Create and Reuse Words in Open-Vocabulary Neural Language Modeling

Citations: 8
Authors
Kawakami, Kazuya [1]
Dyer, Chris [2]
Blunsom, Phil [1, 2]
Affiliations
[1] Univ Oxford, Dept Comp Sci, Oxford, England
[2] DeepMind, London, England
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
DOI
10.18653/v1/P17-1137
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Fixed-vocabulary language models fail to account for one of the most characteristic statistical facts of natural language: the frequent creation and reuse of new word types. Although character-level language models offer a partial solution in that they can create word types not attested in the training corpus, they do not capture the "bursty" distribution of such words. In this paper, we augment a hierarchical LSTM language model that generates sequences of word tokens character by character with a caching mechanism that learns to reuse previously generated words. To validate our model we construct a new open-vocabulary language modeling corpus (the Multilingual Wikipedia Corpus; MWC) from comparable Wikipedia articles in 7 typologically diverse languages and demonstrate the effectiveness of our model across this range of languages.
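As a rough illustration of the idea in the abstract, the sketch below mixes a cache of previously generated words with a character-level fallback. It is not the authors' model: the uniform character distribution, the constant stop probability, and the fixed mixing weight `lam` are all assumptions that stand in for what the hierarchical LSTM would learn, and the names `char_prob` and `mixture_prob` are hypothetical.

```python
from collections import Counter

def char_prob(word, alphabet_size=26, stop=0.2):
    """Toy character-level model (assumption): uniform characters and a
    constant probability of ending the word after each character."""
    return ((1.0 - stop) / alphabet_size) ** len(word) * stop

def mixture_prob(word, cache, lam=0.5):
    """Mix a count-based cache with the character-level fallback.
    `lam` is a fixed weight here; a learned model would predict it."""
    total = sum(cache.values())
    if total == 0:
        return char_prob(word)  # nothing to reuse yet
    p_cache = cache[word] / total  # Counter returns 0 for unseen words
    return lam * p_cache + (1.0 - lam) * char_prob(word)

# Words generated so far in the document (illustrative tokens only).
cache = Counter()
for token in ["Noriega", "said", "Noriega"]:
    cache[token] += 1

# A recently used rare word becomes far more likely than under spelling alone.
print(mixture_prob("Noriega", cache) > char_prob("Noriega"))
```

The cache term is what captures "bursty" reuse: once a rare word has appeared, its probability no longer depends only on spelling it out character by character.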
Pages: 1492-1502
Page count: 11