Transformer based contextualization of pre-trained word embeddings for irony detection in Twitter

Cited by: 39
Authors
Angel Gonzalez, Jose [1]
Hurtado, Lluis-F [1]
Pla, Ferran [1]
Affiliations
[1] Univ Politecn Valencia, VRAIN Valencian Res Inst Artificial Intelligence, Cami Vera Sn, Valencia 46022, Spain
Keywords
Irony detection; Twitter; Deep learning; Transformer encoders
DOI
10.1016/j.ipm.2020.102262
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Human communication in natural language, especially on social media, is influenced by figurative language such as irony. Recently, several workshops have explored the task of irony detection on Twitter using computational approaches. This paper describes a model for irony detection based on the contextualization of pre-trained Twitter word embeddings by means of the Transformer architecture. This approach is based on the same powerful architecture as BERT but, unlike BERT, it allows us to use in-domain embeddings. We performed an extensive evaluation on two corpora, one for English and another for Spanish. Our system ranked first on the Spanish corpus and, to our knowledge, achieved the second-best result on the English corpus. These results support the correctness and adequacy of our proposal. We also studied and interpreted how the multi-head self-attention mechanisms specialize in detecting irony by considering the polarity and relevance of individual words and even the relationships among words. This analysis is a first step towards understanding how the multi-head self-attention mechanisms of the Transformer architecture address the irony detection problem.
Pages: 15