Spell Once, Summon Anywhere: A Two-Level Open-Vocabulary Language Model

Cited by: 0

Authors:

Mielke, Sebastian J. [1]

Eisner, Jason [1]

Affiliations:

[1] Johns Hopkins Univ, Dept Comp Sci, Baltimore, MD 21218 USA

Source:

THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE | 2019

Funding:

U.S. National Science Foundation

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

We show how the spellings of known words can help us deal with unknown words in open-vocabulary NLP tasks. The method we propose can be used to extend any closed-vocabulary generative model, but in this paper we specifically consider the case of neural language modeling. Our Bayesian generative story combines a standard RNN language model (generating the word tokens in each sentence) with an RNN-based spelling model (generating the letters in each word type). These two RNNs respectively capture sentence structure and word structure, and are kept separate as in linguistics. By invoking the second RNN to generate spellings for novel words in context, we obtain an open-vocabulary language model. For known words, embeddings are naturally inferred by combining evidence from type spelling and token context. Comparing to baselines (including a novel strong baseline), we beat previous work and establish state-of-the-art results on multiple datasets.
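
The two-level idea in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: a fixed unigram table (`TOKEN_PROBS`) stands in for the sentence-level RNN, a uniform character model (`spelling_logprob`) stands in for the spelling RNN, and the probabilities are made up. The sketch only shows how a novel word can receive non-zero probability by paying for an unknown-word slot and then for its spelling; it does not model how embeddings of known words are inferred from spelling and context.

```python
import math
from typing import Sequence

# Hypothetical toy parameters (not from the paper).
# Token level: a closed vocabulary plus an <UNK> slot for novel words.
TOKEN_PROBS = {"the": 0.5, "cat": 0.3, "<UNK>": 0.2}
# Character level: lowercase letters plus an implicit end-of-word symbol.
ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def spelling_logprob(word: str) -> float:
    """Log-probability of emitting each letter of `word`, then end-of-word,
    under a uniform distribution over 26 letters + 1 end symbol."""
    if any(ch not in ALPHABET for ch in word):
        raise ValueError("toy spelling model only covers lowercase a-z")
    per_symbol = 1.0 / (len(ALPHABET) + 1)
    return (len(word) + 1) * math.log(per_symbol)


def token_logprob(word: str) -> float:
    """Known words are scored by the token model alone; novel words are
    scored as <UNK> at the token level plus their spelling."""
    if word != "<UNK>" and word in TOKEN_PROBS:
        return math.log(TOKEN_PROBS[word])
    return math.log(TOKEN_PROBS["<UNK>"]) + spelling_logprob(word)


def sentence_logprob(words: Sequence[str]) -> float:
    """Sum of per-token log-probabilities (tokens are independent here,
    unlike in an RNN language model)."""
    return sum(token_logprob(w) for w in words)
```

In this toy setting, `sentence_logprob(["the", "dog"])` is finite even though "dog" is outside the vocabulary, because the spelling term supplies the missing probability mass.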

Pages: 6843-6850

Page count: 8

50 records in total

[1] A Hybrid Language Model for Open-Vocabulary Thai LVCSR
Thangthai, Kwanchiva
Chotimongkol, Ananlada
Wutiwiwatchai, Chai
[J]. 14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013: 2206-2210
[2] LLMFormer: Large Language Model for Open-Vocabulary Semantic Segmentation
Shi, Hengcan
Dao, Son Duy
Cai, Jianfei
[J]. INTERNATIONAL JOURNAL OF COMPUTER VISION, 2024
[3] Can Identifier Splitting Improve Open-Vocabulary Language Model of Code
Shi, Jieke
Yang, Zhou
He, Junda
Xu, Bowen
Lo, David
[J]. 2022 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE ANALYSIS, EVOLUTION AND REENGINEERING (SANER 2022), 2022: 1134-1138
[4] Learning to Prompt for Open-Vocabulary Object Detection with Vision-Language Model
Du, Yu
Wei, Fangyun
Zhang, Zihe
Shi, Miaojing
Gao, Yue
Li, Guoqi
[J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022: 14064-14073
[5] From Characters to Words: Hierarchical Pre-trained Language Model for Open-vocabulary Language Understanding
Sun, Li
Luisier, Florian
Batmanghelich, Kayhan
Florencio, Dinei
Zhang, Cha
[J]. PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 1, 2023: 3605-3620
[6] A language model using variable length tokens for open-vocabulary Hangul text recognition
Ryu, SH
Kim, JH
[J]. PATTERN RECOGNITION, 2004, 37 (07): 1549-1552
[7] Latent human traits in the language of social media: An open-vocabulary approach
Kulkarni, Vivek
Kern, Margaret L.
Stillwell, David
Kosinski, Michal
Matz, Sandra
Ungar, Lyle
Skiena, Steven
Schwartz, H. Andrew
[J]. PLOS ONE, 2018, 13 (11)
[8] Learning to Create and Reuse Words in Open-Vocabulary Neural Language Modeling
Kawakami, Kazuya
Dyer, Chris
Blunsom, Phil
[J]. PROCEEDINGS OF THE 55TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2017), VOL 1, 2017: 1492-1502
[9] Personality, Gender, and Age in the Language of Social Media: The Open-Vocabulary Approach
Schwartz, H. Andrew
Eichstaedt, Johannes C.
Kern, Margaret L.
Dziurzynski, Lukasz
Ramones, Stephanie M.
Agrawal, Megha
Shah, Achal
Kosinski, Michal
Stillwell, David
Seligman, Martin E. P.
Ungar, Lyle H.
[J]. PLOS ONE, 2013, 8 (09)
[10] Open-Vocabulary Multi-label Image Classification with Pretrained Vision-Language Model
Dao, Son D.
Huynh, Dat
Zhao, He
Phung, Dinh
Cai, Jianfei
[J]. 2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023: 2135-2140