Dissecting Contextual Word Embeddings: Architecture and Representation

Cited: 0
Authors
Peters, Matthew E. [1 ]
Neumann, Mark [1 ]
Zettlemoyer, Luke [2 ]
Yih, Wen-tau [1 ]
Affiliations
[1] Allen Inst Artificial Intelligence, Seattle, WA 98103 USA
[2] Univ Washington, Paul G Allen Comp Sci & Engn, Seattle, WA 98195 USA
Keywords
(none listed)
DOI
Not available
CLC number
TP18 [Artificial Intelligence Theory]
Discipline codes
081104 ; 0812 ; 0835 ; 1405
Abstract
Contextual word representations derived from pre-trained bidirectional language models (biLMs) have recently been shown to provide significant improvements to the state of the art for a wide range of NLP tasks. However, many questions remain as to how and why these models are so effective. In this paper, we present a detailed empirical study of how the choice of neural architecture (e.g. LSTM, CNN, or self-attention) influences both end-task accuracy and qualitative properties of the learned representations. We show there is a tradeoff between speed and accuracy, but all architectures learn high-quality contextual representations that outperform word embeddings on four challenging NLP tasks. Additionally, all architectures learn representations that vary with network depth: from exclusively morphological information at the word embedding layer, through local syntax in the lower contextual layers, to longer-range semantics such as coreference at the upper layers. Together, these results suggest that unsupervised biLMs, independent of architecture, are learning much more about the structure of language than previously appreciated.
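As a loose illustration of the layer-wise probing the abstract describes, the sketch below extracts per-layer vectors for an ambiguous word from a pretrained bidirectional Transformer and compares them across two contexts. This is not the paper's code or models (the authors trained their own LSTM, CNN, and self-attention biLMs); it assumes the HuggingFace transformers library and bert-base-uncased as a stand-in biLM. If upper layers encode longer-range semantics, cross-context similarity for an ambiguous word like "bank" should tend to fall with depth.

    # Illustrative sketch only: BERT stands in for the paper's biLMs.
    import torch
    from transformers import AutoTokenizer, AutoModel

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased",
                                      output_hidden_states=True)
    model.eval()

    def layer_vectors(sentence, word):
        # Tokenize and locate the first subword matching `word`.
        enc = tokenizer(sentence, return_tensors="pt")
        position = enc.input_ids[0].tolist().index(
            tokenizer.convert_tokens_to_ids(word))
        with torch.no_grad():
            # hidden_states: embedding layer + one tensor per layer
            hidden = model(**enc).hidden_states
        return [h[0, position] for h in hidden]

    vecs_a = layer_vectors("The bank raised interest rates.", "bank")
    vecs_b = layer_vectors("We sat on the grassy bank of the river.", "bank")

    # Compare the two senses of "bank" at every depth.
    for layer, (va, vb) in enumerate(zip(vecs_a, vecs_b)):
        sim = torch.nn.functional.cosine_similarity(va, vb, dim=0).item()
        print(f"layer {layer:2d}  cross-context cosine similarity: {sim:.3f}")

At layer 0 the two vectors are identical (context-independent word embeddings), so the interesting signal is how similarity changes as depth increases.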
Pages: 1499-1509
Number of pages: 11
Related Papers
50 results in total
  • [1] Personalized Query Expansion with Contextual Word Embeddings
    Bassani, Elias
    Tonellotto, Nicola
    Pasi, Gabriella
    ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2024, 42 (02)
  • [2] Fusing contextual word embeddings for concreteness estimation
    Incitti, Francesca
    Snidaro, Lauro
    2021 IEEE 24TH INTERNATIONAL CONFERENCE ON INFORMATION FUSION (FUSION), 2021: 508-515
  • [3] Contextual word embeddings for tabular data search and integration
    Pilaluisa, José
    Tomás, David
    Navarro-Colorado, Borja
    Mazón, Jose-Norberto
    NEURAL COMPUTING & APPLICATIONS, 2023, 35 (13): 9319-9333
  • [4] LEARNED IN SPEECH RECOGNITION: CONTEXTUAL ACOUSTIC WORD EMBEDDINGS
    Palaskar, Shruti
    Raunak, Vikas
    Metze, Florian
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019: 6530-6534
  • [5] Non-Contextual vs Contextual Word Embeddings in Multiword Expressions Detection
    Piasecki, Maciej
    Kanclerz, Kamil
    COMPUTATIONAL COLLECTIVE INTELLIGENCE, ICCCI 2022, 2022, 13501: 193-206
  • [6] Contextual and Non-Contextual Word Embeddings: an in-depth Linguistic Investigation
    Miaschi, Alessio
    Dell'Orletta, Felice
    5TH WORKSHOP ON REPRESENTATION LEARNING FOR NLP (REPL4NLP-2020), 2020: 110-119
  • [7] Dissecting word embeddings and language models in natural language processing
    Verma, Vivek Kumar
    Pandey, Mrigank
    Jain, Tarun
    Tiwari, Pradeep Kumar
    JOURNAL OF DISCRETE MATHEMATICAL SCIENCES & CRYPTOGRAPHY, 2021, 24 (05): 1509-1515
  • [8] Aggregating Neural Word Embeddings for Document Representation
    Zhang, Ruqing
    Guo, Jiafeng
    Lan, Yanyan
    Xu, Jun
    Cheng, Xueqi
    ADVANCES IN INFORMATION RETRIEVAL (ECIR 2018), 2018, 10772: 303-315
  • [9] Explaining Word Embeddings via Disentangled Representation
    Liao, Keng-Te
    Lee, Cheng-Syuan
    Huang, Zhong-Yu
    Lin, Shou-de
    1ST CONFERENCE OF THE ASIA-PACIFIC CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 10TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (AACL-IJCNLP 2020), 2020: 720-725