Dissecting Contextual Word Embeddings: Architecture and Representation

Cited by: 0
Authors
Peters, Matthew E. [1 ]
Neumann, Mark [1 ]
Zettlemoyer, Luke [2 ]
Yih, Wen-tau [1 ]
Affiliations
[1] Allen Inst Artificial Intelligence, Seattle, WA 98103 USA
[2] Univ Washington, Paul G Allen Comp Sci & Engn, Seattle, WA 98195 USA
DOI: not available
Chinese Library Classification: TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
Contextual word representations derived from pre-trained bidirectional language models (biLMs) have recently been shown to provide significant improvements to the state of the art for a wide range of NLP tasks. However, many questions remain as to how and why these models are so effective. In this paper, we present a detailed empirical study of how the choice of neural architecture (e.g., LSTM, CNN, or self-attention) influences both end-task accuracy and the qualitative properties of the learned representations. We show that there is a tradeoff between speed and accuracy, but that all architectures learn high-quality contextual representations that outperform word embeddings on four challenging NLP tasks. Additionally, all architectures learn representations that vary with network depth, from exclusively morphological at the word embedding layer, through local syntax in the lower contextual layers, to longer-range semantics such as coreference at the upper layers. Together, these results suggest that unsupervised biLMs, independent of architecture, are learning much more about the structure of language than previously appreciated.
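The layer-wise findings summarized above come from probing: the biLM is frozen and a lightweight classifier is trained on the representations from each layer in turn. Below is a minimal sketch of that idea in Python. It is not the authors' code (the paper probes ELMo-style biLMs); it assumes the Hugging Face transformers and scikit-learn packages, and the model name "bert-base-cased" and the part-of-speech probe task are illustrative choices.

    # Minimal layer-wise probing sketch (illustrative; not the paper's
    # ELMo-style setup). Assumes transformers and scikit-learn are
    # installed; "bert-base-cased" is an arbitrary model choice.
    import torch
    from transformers import AutoTokenizer, AutoModel
    from sklearn.linear_model import LogisticRegression

    tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
    model = AutoModel.from_pretrained("bert-base-cased",
                                      output_hidden_states=True)
    model.eval()

    def layer_representations(sentence):
        """Return one (seq_len, hidden_dim) tensor per layer, with the
        input embedding layer at index 0 and deeper layers after it."""
        enc = tokenizer(sentence, return_tensors="pt")
        with torch.no_grad():
            out = model(**enc)
        return [h[0] for h in out.hidden_states]

    def probe_accuracy(X_train, y_train, X_test, y_test):
        """Fit a linear probe on token vectors from a single layer and
        report held-out accuracy; higher accuracy means the probed
        property (e.g. POS tags) is more linearly decodable there."""
        clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
        return clf.score(X_test, y_test)

Running probe_accuracy on each layer's vectors would let one check the pattern the abstract describes: local syntactic properties tend to be most decodable in the lower layers, and longer-range semantic properties such as coreference in the upper layers.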
Pages: 1499-1509 (11 pages)