Dissecting Contextual Word Embeddings: Architecture and Representation

Cited: 0
Authors
Peters, Matthew E. [1 ]
Neumann, Mark [1 ]
Zettlemoyer, Luke [2 ]
Yih, Wen-tau [1 ]
Affiliations
[1] Allen Inst Artificial Intelligence, Seattle, WA 98103 USA
[2] Univ Washington, Paul G Allen Comp Sci & Engn, Seattle, WA 98195 USA
Keywords
(none listed)
DOI
Not available
CLC number
TP18 [Artificial Intelligence Theory]
Discipline codes
081104 ; 0812 ; 0835 ; 1405
Abstract
Contextual word representations derived from pre-trained bidirectional language models (biLMs) have recently been shown to provide significant improvements to the state of the art for a wide range of NLP tasks. However, many questions remain as to how and why these models are so effective. In this paper, we present a detailed empirical study of how the choice of neural architecture (e.g. LSTM, CNN, or self-attention) influences both end-task accuracy and qualitative properties of the learned representations. We show there is a tradeoff between speed and accuracy, but all architectures learn high-quality contextual representations that outperform word embeddings on four challenging NLP tasks. Additionally, all architectures learn representations that vary with network depth: from exclusively morphological information at the word embedding layer, through local syntax in the lower contextual layers, to longer-range semantics such as coreference at the upper layers. Together, these results suggest that unsupervised biLMs, independent of architecture, are learning much more about the structure of language than previously appreciated.
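As a loose illustration of the layer-wise probing the abstract describes, the sketch below extracts per-layer vectors for an ambiguous word from a pretrained bidirectional Transformer and compares them across two contexts. This is not the paper's code or models (the authors trained their own LSTM, CNN, and self-attention biLMs); it assumes the HuggingFace transformers library and bert-base-uncased as a stand-in biLM. If upper layers encode longer-range semantics, cross-context similarity for an ambiguous word like "bank" should tend to fall with depth.

    # Illustrative sketch only: BERT stands in for the paper's biLMs.
    import torch
    from transformers import AutoTokenizer, AutoModel

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased",
                                      output_hidden_states=True)
    model.eval()

    def layer_vectors(sentence, word):
        # Tokenize and locate the first subword matching `word`.
        enc = tokenizer(sentence, return_tensors="pt")
        position = enc.input_ids[0].tolist().index(
            tokenizer.convert_tokens_to_ids(word))
        with torch.no_grad():
            # hidden_states: embedding layer + one tensor per layer
            hidden = model(**enc).hidden_states
        return [h[0, position] for h in hidden]

    vecs_a = layer_vectors("The bank raised interest rates.", "bank")
    vecs_b = layer_vectors("We sat on the grassy bank of the river.", "bank")

    # Compare the two senses of "bank" at every depth.
    for layer, (va, vb) in enumerate(zip(vecs_a, vecs_b)):
        sim = torch.nn.functional.cosine_similarity(va, vb, dim=0).item()
        print(f"layer {layer:2d}  cross-context cosine similarity: {sim:.3f}")

At layer 0 the two vectors are identical (context-independent word embeddings), so the interesting signal is how similarity changes as depth increases.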
Pages: 1499-1509
Number of pages: 11
Related Papers
50 results in total
  • [1] Personalized Query Expansion with Contextual Word Embeddings
    Bassani, Elias
    Tonellotto, Nicola
    Pasi, Gabriella
    ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2024, 42 (02)
  • [2] Fusing contextual word embeddings for concreteness estimation
    Incitti, Francesca
    Snidaro, Lauro
    2021 IEEE 24TH INTERNATIONAL CONFERENCE ON INFORMATION FUSION (FUSION), 2021: 508-515
  • [3] Contextual word embeddings for tabular data search and integration
    Pilaluisa, José
    Tomás, David
    Navarro-Colorado, Borja
    Mazón, Jose-Norberto
    NEURAL COMPUTING & APPLICATIONS, 2023, 35 (13): 9319-9333
  • [4] LEARNED IN SPEECH RECOGNITION: CONTEXTUAL ACOUSTIC WORD EMBEDDINGS
    Palaskar, Shruti
    Raunak, Vikas
    Metze, Florian
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019: 6530-6534
  • [5] Non-Contextual vs Contextual Word Embeddings in Multiword Expressions Detection
    Piasecki, Maciej
    Kanclerz, Kamil
    COMPUTATIONAL COLLECTIVE INTELLIGENCE, ICCCI 2022, 2022, 13501: 193-206
  • [6] Contextual and Non-Contextual Word Embeddings: an in-depth Linguistic Investigation
    Miaschi, Alessio
    Dell'Orletta, Felice
    5TH WORKSHOP ON REPRESENTATION LEARNING FOR NLP (REPL4NLP-2020), 2020: 110-119
  • [7] Dissecting word embeddings and language models in natural language processing
    Verma, Vivek Kumar
    Pandey, Mrigank
    Jain, Tarun
    Tiwari, Pradeep Kumar
    JOURNAL OF DISCRETE MATHEMATICAL SCIENCES & CRYPTOGRAPHY, 2021, 24 (05): 1509-1515
  • [8] Aggregating Neural Word Embeddings for Document Representation
    Zhang, Ruqing
    Guo, Jiafeng
    Lan, Yanyan
    Xu, Jun
    Cheng, Xueqi
    ADVANCES IN INFORMATION RETRIEVAL (ECIR 2018), 2018, 10772: 303-315
  • [9] Explaining Word Embeddings via Disentangled Representation
    Liao, Keng-Te
    Lee, Cheng-Syuan
    Huang, Zhong-Yu
    Lin, Shou-de
    1ST CONFERENCE OF THE ASIA-PACIFIC CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 10TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (AACL-IJCNLP 2020), 2020: 720-725