Dissecting Contextual Word Embeddings: Architecture and Representation

Authors
Peters, Matthew E. [1]
Neumann, Mark [1]
Zettlemoyer, Luke [2]
Yih, Wen-tau [1]
Affiliations
[1] Allen Inst Artificial Intelligence, Seattle, WA 98103 USA
[2] Univ Washington, Paul G Allen Comp Sci & Engn, Seattle, WA 98195 USA
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Contextual word representations derived from pre-trained bidirectional language models (biLMs) have recently been shown to provide significant improvements to the state of the art for a wide range of NLP tasks. However, many questions remain as to how and why these models are so effective. In this paper, we present a detailed empirical study of how the choice of neural architecture (e.g., LSTM, CNN, or self-attention) influences both end-task accuracy and qualitative properties of the representations that are learned. We show there is a tradeoff between speed and accuracy, but all architectures learn high-quality contextual representations that outperform word embeddings for four challenging NLP tasks. Additionally, all architectures learn representations that vary with network depth, from exclusively morphology-based at the word embedding layer, through local-syntax-based in the lower contextual layers, to longer-range semantics such as coreference at the upper layers. Together, these results suggest that unsupervised biLMs, independent of architecture, are learning much more about the structure of language than previously appreciated.
Pages: 1499-1509
Page count: 11
Related papers
50 in total
  • [21] Contextual Word Embeddings Clustering Through Multiway Analysis: A Comparative Study
    Ait-Saada, Mira
    Nadif, Mohamed
    ADVANCES IN INTELLIGENT DATA ANALYSIS XXI, IDA 2023, 2023, 13876 : 1 - 14
  • [22] Contextual Generation of Word Embeddings for Out of Vocabulary Words in Downstream Tasks
    Garneau, Nicolas
    Leboeuf, Jean-Samuel
    Lamontagne, Luc
    ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, 11489 : 563 - 569
  • [23] Circles are like Ellipses, or Ellipses are like Circles? Measuring the Degree of Asymmetry of Static and Contextual Word Embeddings and the Implications to Representation Learning
    Zhang, Wei
    Campbell, Murray
    Yu, Yang
    Kumaravel, Sadhana
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 14472 - 14480
  • [24] Improving document representation using KPCA and clustered word embeddings
    Gupta, Aakansha
    Katarya, Rahul
    2021 5TH INTERNATIONAL CONFERENCE ON ELECTRICAL, ELECTRONICS, COMMUNICATION, COMPUTER TECHNOLOGIES AND OPTIMIZATION TECHNIQUES (ICEECCOT), 2021, : 514 - 517
  • [25] Eliciting Implicit Evocations Using Word Embeddings and Knowledge Representation
    Harispe, Sebastien
    Medjkoune, Massissilia
    Montmain, Jacky
    SCALABLE UNCERTAINTY MANAGEMENT (SUM 2017), 2017, 10564 : 78 - 92
  • [26] Representation of Semantic Word Embeddings Based on SLDA and Word2vec Model
    TANG Huanling
    ZHU Hui
    WEI Hongmin
    ZHENG Han
    MAO Xueli
    LU Mingyu
    GUO Jin
    Chinese Journal of Electronics, 2023, 32 (03) : 647 - 654
  • [27] Representation of Semantic Word Embeddings Based on SLDA and Word2vec Model
    Tang Huanling
    Zhu Hui
    Wei Hongmin
    Zheng Han
    Mao Xueli
    Lu Mingyu
    Guo Jin
    CHINESE JOURNAL OF ELECTRONICS, 2023, 32 (03) : 647 - 654
  • [28] Distilling Contextual Embeddings Into A Static Word Embedding For Improving Hacker Forum Analytics
    Ampel, Benjamin
    Chen, Hsinchun
    2021 IEEE INTERNATIONAL CONFERENCE ON INTELLIGENCE AND SECURITY INFORMATICS (ISI), 2021, : 106 - 108
  • [29] Large Scale Intent Detection in Turkish Short Sentences with Contextual Word Embeddings
    Dundar, Enes Burak
    Kilic, Osman Fatih
    Cekic, Tolga
    Manav, Yusufcan
    Deniz, Onur
    PROCEEDINGS OF THE 12TH INTERNATIONAL JOINT CONFERENCE ON KNOWLEDGE DISCOVERY, KNOWLEDGE ENGINEERING AND KNOWLEDGE MANAGEMENT (KDIR), VOL 1, 2020, : 187 - 192
  • [30] Exploiting Position and Contextual Word Embeddings for Keyphrase Extraction from Scientific Papers
    Patel, Krutarth
    Caragea, Cornelia
    16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021), 2021, : 1585 - 1591