On the Importance of Word Embedding in Automated Harmful Information Detection

Cited by: 1
Authors
Mohtaj, Salar [1, 2]
Moeller, Sebastian [1, 2]
Affiliations
[1] Tech Univ Berlin, Berlin, Germany
[2] German Res Ctr Artificial Intelligence DFKI, Lab Berlin, Germany
Keywords
Fake news detection; Hate speech detection; Word embedding; Contextual word embedding; HATE SPEECH; IDENTIFICATION
DOI
10.1007/978-3-031-16270-1_21
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Social media have grown rapidly in recent years. They have changed many aspects of human life, especially how people communicate and how they access information. However, along with these important benefits, social media have caused a number of significant challenges since their introduction. The spread of fake news and hate speech is among the most challenging of these issues and has attracted a lot of attention from researchers in recent years. Various models based on natural language processing have been developed to combat these phenomena and stop them at an early stage, before they spread widely. Given the difficulty of automated harmful information detection (i.e., fake news and hate speech detection), every single step of the detection process can have a noticeable impact on model performance. In this paper, we study the importance of word embedding for the overall performance of a deep neural network architecture in detecting fake news and hate speech on social media. We test various approaches for converting raw input text into vectors, from random weighting to state-of-the-art contextual word embedding models. In addition to comparing different word embedding approaches, we also analyze different strategies for obtaining the vectors from contextual word embedding models (i.e., taking the weights from the last layer versus averaging the weights of the last layers). Our results show that XLNet embedding outperforms the other embedding approaches on both harmful information identification tasks.
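The two strategies for obtaining vectors from a contextual model that the abstract contrasts can be illustrated with a small sketch. This is not the authors' implementation: the function names, the toy per-layer values, and the parameter `k` (how many final layers to average) are assumptions made for illustration, and the abstract does not state how many layers were averaged.

```python
from statistics import fmean
from typing import List

# Hypothetical layout: layers[l][t] is the vector of token t at layer l.
Layer = List[List[float]]


def last_layer(layers: List[Layer]) -> Layer:
    """Strategy 1: use the token vectors from the final layer only."""
    return [list(vec) for vec in layers[-1]]


def average_last_layers(layers: List[Layer], k: int) -> Layer:
    """Strategy 2: average each token's vectors over the last k layers."""
    if not 1 <= k <= len(layers):
        raise ValueError("k must be between 1 and the number of layers")
    selected = layers[-k:]
    num_tokens = len(selected[0])
    dim = len(selected[0][0])
    return [
        [fmean(layer[t][d] for layer in selected) for d in range(dim)]
        for t in range(num_tokens)
    ]


# Toy values (not real model output): 3 layers, 2 tokens, 2 dimensions.
toy_layers = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[2.0, 4.0], [3.0, 5.0]],
    [[4.0, 8.0], [5.0, 9.0]],
]

print(last_layer(toy_layers))
print(average_last_layers(toy_layers, k=2))
```

Either function returns one vector per token, so the downstream classifier sees the same input shape whichever strategy is chosen.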
Pages: 251 - 262
Number of pages: 12
Related papers
50 results in total
  • [21] A new word embedding approach to evaluate potential fixes for automated program repair
    Amorim, Leonardo Afonso
    Freitas, Mateus F.
    Dantas, Altino
    de Souza, Eduardo E.
    Camilo-Junior, Celso G.
    Martins, Wellington S.
    2018 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2018,
  • [22] WELMSD - word embedding and language model based sarcasm detection
    Kumar, Pradeep
    Sarin, Gaurav
    ONLINE INFORMATION REVIEW, 2022, 46 (07) : 1242 - 1256
  • [23] Systematic Homonym Detection and Replacement Based on Contextual Word Embedding
    Younghoon Lee
    Neural Processing Letters, 2021, 53 : 17 - 36
  • [24] Remote Sensing Image Detection and Segmentation Based on Word Embedding
    You H.-F.
    Tian S.-W.
    Yu L.
    Lü Y.-L.
    Tian, Sheng-Wei (tianshengwei@163.com), 1600, Chinese Institute of Electronics (48): 75 - 83
  • [25] Online Deceptive Product Review Detection Leveraging Word Embedding
    Li, Xiu
    Xie, Lulu
    Zhang, Fan
    Wang, Huimin
    2017 IEEE 15TH INTL CONF ON DEPENDABLE, AUTONOMIC AND SECURE COMPUTING, 15TH INTL CONF ON PERVASIVE INTELLIGENCE AND COMPUTING, 3RD INTL CONF ON BIG DATA INTELLIGENCE AND COMPUTING AND CYBER SCIENCE AND TECHNOLOGY CONGRESS(DASC/PICOM/DATACOM/CYBERSCI, 2017, : 867 - 870
  • [26] Word embedding and classification methods and their effects on fake news detection
    Hauschild, Jessica
    Eskridge, Kent
    MACHINE LEARNING WITH APPLICATIONS, 2024, 17
  • [27] Hybrid Phishing URL Detection Using Segmented Word Embedding
    Aung, Eint Sandi
    Yamana, Hayato
    INFORMATION INTEGRATION AND WEB INTELLIGENCE, IIWAS 2022, 2022, 13635 : 507 - 518
  • [28] Cyberbullying Detection using Deep Learning and Word Embedding Analysis
    On, Elif Pinar
    Yeniterzi, Reyyan
    2020 28TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2020,
  • [29] Multi-Modal Sarcasm Detection with Sentiment Word Embedding
    Fu, Hao
    Liu, Hao
    Wang, Hongling
    Xu, Linyan
    Lin, Jiali
    Jiang, Dazhi
    ELECTRONICS, 2024, 13 (05)
  • [30] Systematic Homonym Detection and Replacement Based on Contextual Word Embedding
    Lee, Younghoon
    NEURAL PROCESSING LETTERS, 2021, 53 (01) : 17 - 36