Spell Once, Summon Anywhere: A Two-Level Open-Vocabulary Language Model

Authors
Mielke, Sebastian J. [1]
Eisner, Jason [1]
Affiliations
[1] Johns Hopkins Univ, Dept Comp Sci, Baltimore, MD 21218 USA
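Funding
National Science Foundation (USA)
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract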
We show how the spellings of known words can help us deal with unknown words in open-vocabulary NLP tasks. The method we propose can be used to extend any closed-vocabulary generative model, but in this paper we specifically consider the case of neural language modeling. Our Bayesian generative story combines a standard RNN language model (generating the word tokens in each sentence) with an RNN-based spelling model (generating the letters in each word type). These two RNNs respectively capture sentence structure and word structure, and are kept separate as in linguistics. By invoking the second RNN to generate spellings for novel words in context, we obtain an open-vocabulary language model. For known words, embeddings are naturally inferred by combining evidence from type spelling and token context. Comparing to baselines (including a novel strong baseline), we beat previous work and establish state-of-the-art results on multiple datasets.
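The two-level idea in the abstract can be illustrated with a toy sketch. This is not the authors' model: the paper uses two RNNs and Bayesian inference over embeddings, whereas the sketch below swaps in a smoothed unigram word model and a smoothed character-bigram speller purely to show how a novel word can be scored as "unknown-word slot, then spelling". The class names `CharBigramSpeller` and `TwoLevelLM`, the `<unk>` symbol, and the add-one smoothing are all assumptions made for illustration.

```python
import math
from collections import Counter

BOS, EOS, UNK = "<s>", "</s>", "<unk>"


class CharBigramSpeller:
    """Toy stand-in for the spelling model: an add-one-smoothed
    character bigram model trained on word TYPES (each spelling once)."""

    def __init__(self, word_types):
        # Symbols that can be generated: every seen letter plus end-of-word.
        self.alphabet = {c for w in word_types for c in w} | {EOS}
        self.bigrams = Counter()
        self.contexts = Counter()
        for w in word_types:
            chars = [BOS] + list(w) + [EOS]
            for prev, cur in zip(chars, chars[1:]):
                self.bigrams[(prev, cur)] += 1
                self.contexts[prev] += 1

    def log_prob(self, word):
        # Assumption: novel words only use letters seen in known spellings.
        if any(c not in self.alphabet for c in word):
            raise ValueError("letter outside the toy alphabet")
        chars = [BOS] + list(word) + [EOS]
        total = 0.0
        for prev, cur in zip(chars, chars[1:]):
            num = self.bigrams[(prev, cur)] + 1
            den = self.contexts[prev] + len(self.alphabet)
            total += math.log(num / den)
        return total


class TwoLevelLM:
    """Toy stand-in for the word-level model: an add-one-smoothed unigram
    model over the known vocabulary plus one <unk> outcome."""

    def __init__(self, train_tokens, vocab):
        self.vocab = set(vocab)
        mapped = [t if t in self.vocab else UNK for t in train_tokens]
        self.counts = Counter(mapped)
        self.total = len(mapped)
        self.speller = CharBigramSpeller(sorted(self.vocab))

    def log_prob(self, token):
        denom = self.total + len(self.vocab) + 1  # +1 for the <unk> outcome
        if token in self.vocab:
            return math.log((self.counts[token] + 1) / denom)
        # Novel word: pay for the <unk> slot, then for its spelling.
        # This toy ignores mass the speller puts on known spellings.
        unk_lp = math.log((self.counts[UNK] + 1) / denom)
        return unk_lp + self.speller.log_prob(token)


lm = TwoLevelLM("the cat sat on the mat".split(),
                vocab={"the", "cat", "sat", "mat"})
known_lp = lm.log_prob("the")   # scored by the word-level model alone
novel_lp = lm.log_prob("hat")   # <unk> slot plus spelling cost
```

In this sketch a known word is scored by the word-level model only, while a novel word such as "hat" is charged the probability of the unknown-word slot times the probability of its spelling, so every string over the seen alphabet receives nonzero probability.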
Pages: 6843-6850
Page count: 8