Dissecting Contextual Word Embeddings: Architecture and Representation

Cited by: 0

Authors:

Peters, Matthew E. [1]

Neumann, Mark [1]

Zettlemoyer, Luke [2]

Yih, Wen-tau [1]

Affiliations:

[1] Allen Inst Artificial Intelligence, Seattle, WA 98103 USA

[2] Univ Washington, Paul G Allen Comp Sci & Engn, Seattle, WA 98195 USA

Source:

2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018) | 2018

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Contextual word representations derived from pre-trained bidirectional language models (biLMs) have recently been shown to provide significant improvements to the state of the art for a wide range of NLP tasks. However, many questions remain as to how and why these models are so effective. In this paper, we present a detailed empirical study of how the choice of neural architecture (e.g., LSTM, CNN, or self-attention) influences both end-task accuracy and qualitative properties of the representations that are learned. We show there is a tradeoff between speed and accuracy, but all architectures learn high-quality contextual representations that outperform word embeddings for four challenging NLP tasks. Additionally, all architectures learn representations that vary with network depth, from exclusively morphology-based at the word embedding layer, through local-syntax-based in the lower contextual layers, to longer-range semantics such as coreference at the upper layers. Together, these results suggest that unsupervised biLMs, independent of architecture, are learning much more about the structure of language than previously appreciated.


Pages: 1499 - 1509

Page count: 11

50 items in total

[21] Contextual Word Embeddings Clustering Through Multiway Analysis: A Comparative Study
Ait-Saada, Mira
Nadif, Mohamed
ADVANCES IN INTELLIGENT DATA ANALYSIS XXI, IDA 2023, 2023, 13876 : 1 - 14
[22] Contextual Generation of Word Embeddings for Out of Vocabulary Words in Downstream Tasks
Garneau, Nicolas
Leboeuf, Jean-Samuel
Lamontagne, Luc
ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, 11489 : 563 - 569
[23] Circles are like Ellipses, or Ellipses are like Circles? Measuring the Degree of Asymmetry of Static and Contextual Word Embeddings and the Implications to Representation Learning
Zhang, Wei
Campbell, Murray
Yu, Yang
Kumaravel, Sadhana
THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 14472 - 14480
[24] Improving document representation using KPCA and clustered word embeddings
Gupta, Aakansha
Katarya, Rahul
2021 5TH INTERNATIONAL CONFERENCE ON ELECTRICAL, ELECTRONICS, COMMUNICATION, COMPUTER TECHNOLOGIES AND OPTIMIZATION TECHNIQUES (ICEECCOT), 2021, : 514 - 517
[25] Eliciting Implicit Evocations Using Word Embeddings and Knowledge Representation
Harispe, Sebastien
Medjkoune, Massissilia
Montmain, Jacky
SCALABLE UNCERTAINTY MANAGEMENT (SUM 2017), 2017, 10564 : 78 - 92
[26] Representation of Semantic Word Embeddings Based on SLDA and Word2vec Model
TANG Huanling
ZHU Hui
WEI Hongmin
ZHENG Han
MAO Xueli
LU Mingyu
GUO Jin
Chinese Journal of Electronics, 2023, 32 (03) : 647 - 654
[27] Representation of Semantic Word Embeddings Based on SLDA and Word2vec Model
Tang Huanling
Zhu Hui
Wei Hongmin
Zheng Han
Mao Xueli
Lu Mingyu
Guo Jin
CHINESE JOURNAL OF ELECTRONICS, 2023, 32 (03) : 647 - 654
[28] Distilling Contextual Embeddings Into A Static Word Embedding For Improving Hacker Forum Analytics
Ampel, Benjamin
Chen, Hsinchun
2021 IEEE INTERNATIONAL CONFERENCE ON INTELLIGENCE AND SECURITY INFORMATICS (ISI), 2021, : 106 - 108
[29] Large Scale Intent Detection in Turkish Short Sentences with Contextual Word Embeddings
Dundar, Enes Burak
Kilic, Osman Fatih
Cekic, Tolga
Manav, Yusufcan
Deniz, Onur
PROCEEDINGS OF THE 12TH INTERNATIONAL JOINT CONFERENCE ON KNOWLEDGE DISCOVERY, KNOWLEDGE ENGINEERING AND KNOWLEDGE MANAGEMENT (KDIR), VOL 1, 2020, : 187 - 192
[30] Exploiting Position and Contextual Word Embeddings for Keyphrase Extraction from Scientific Papers
Patel, Krutarth
Caragea, Cornelia
16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021), 2021, : 1585 - 1591
