Transformer based contextualization of pre-trained word embeddings for irony detection in Twitter

Cited by: 39

Authors:

Angel Gonzalez, Jose [1]

Hurtado, Lluis-F [1]

Pla, Ferran [1]

Affiliations:

[1] Univ Politecn Valencia, VRAIN Valencian Res Inst Artificial Intelligence, Cami Vera Sn, Valencia 46022, Spain

Source:

INFORMATION PROCESSING & MANAGEMENT | 2020 / Vol. 57 / Issue 04

Keywords:

Irony detection; Twitter; Deep learning; Transformer encoders

DOI:

10.1016/j.ipm.2020.102262

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Human communication using natural language, especially in social media, is influenced by the use of figurative language such as irony. Recently, several workshops have explored the task of irony detection in Twitter using computational approaches. This paper describes a model for irony detection based on the contextualization of pre-trained Twitter word embeddings by means of the Transformer architecture. This approach is based on the same powerful architecture as BERT but, unlike BERT, it allows us to use in-domain embeddings. We performed an extensive evaluation on two corpora, one for the English language and another for the Spanish language. Our system ranked first on the Spanish corpus and, to our knowledge, achieved the second-best result on the English corpus. These results support the correctness and adequacy of our proposal. We also studied and interpreted how the multi-head self-attention mechanisms specialize in detecting irony by considering the polarity and relevance of individual words and even the relationships among words. This analysis is a first step towards understanding how the multi-head self-attention mechanisms of the Transformer architecture address the irony detection problem.

Pages: 15
