Ensemble Deep Learning for Multilabel Binary Classification of User-Generated Content

Cited by: 36
Authors
Haralabopoulos, Giannis [1]
Anagnostopoulos, Ioannis [2]
McAuley, Derek [1]
Affiliations
[1] Univ Nottingham, Sch Comp Sci, Nottingham NG8 1BB, England
[2] Univ Thessaly, Dept Comp Sci & Biomed Informat, Lamia 35131, Greece
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
Keywords
ensemble learning; sentiment analysis; multilabel classification; deep neural networks; pure emotion; Semeval 2018 Task 1; toxic comment classification; SENTIMENT ANALYSIS; DIFFERENTIAL EVOLUTION; NEURAL-NETWORKS
DOI
10.3390/a13040083
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Sentiment analysis usually refers to the analysis of human-generated content via a polarity filter. Affective computing deals with the exact emotions conveyed through information. Emotional information most frequently cannot be accurately described by a single emotion class. Multilabel classifiers can categorize human-generated content in multiple emotional classes. Ensemble learning can improve the statistical, computational and representation aspects of such classifiers. We present a baseline stacked ensemble and propose a weighted ensemble. Our proposed weighted ensemble can use multiple classifiers to improve classification results without hyperparameter tuning or data overfitting. We evaluate our ensemble models with two datasets. The first dataset is from Semeval2018-Task 1 and contains almost 7000 Tweets, labeled with 11 sentiment classes. The second dataset is the Toxic Comment Dataset with more than 150,000 comments, labeled with six different levels of abuse or harassment. Our results suggest that ensemble learning improves classification results by 1.5% to 5.4%.
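The abstract does not spell out how the weighted ensemble combines its classifiers, so the following is only a minimal sketch of one common form of weighted soft voting for multilabel binary classification, not the authors' method. The function name `weighted_ensemble`, the toy probabilities, the weights, and the 0.5 decision threshold are all assumptions made for illustration.

```python
from typing import List, Sequence


def weighted_ensemble(
    probs: Sequence[Sequence[Sequence[float]]],
    weights: Sequence[float],
    threshold: float = 0.5,
) -> List[List[int]]:
    """Combine per-model label probabilities into binary multilabel predictions.

    probs[m][i][k] is model m's probability that sample i carries label k.
    Weights are normalized to sum to 1 before averaging (an assumption).
    """
    if len(probs) != len(weights):
        raise ValueError("one weight per model is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    norm = [w / total for w in weights]

    n_samples = len(probs[0])
    n_labels = len(probs[0][0])
    predictions: List[List[int]] = []
    for i in range(n_samples):
        row = []
        for k in range(n_labels):
            # Weighted average of the models' probabilities for this label.
            score = sum(norm[m] * probs[m][i][k] for m in range(len(probs)))
            # Each label is decided independently (binary decision per label).
            row.append(1 if score >= threshold else 0)
        predictions.append(row)
    return predictions


# Hypothetical outputs of two classifiers: 2 samples, 3 labels each.
model_a = [[0.9, 0.2, 0.7], [0.1, 0.8, 0.4]]
model_b = [[0.7, 0.6, 0.2], [0.3, 0.9, 0.7]]

# Hypothetical weights, e.g. derived from validation performance.
preds = weighted_ensemble([model_a, model_b], weights=[3.0, 1.0])
print(preds)
```

Because every label is thresholded on its own, a sample can receive several labels or none, which is what distinguishes the multilabel setting from ordinary multiclass voting. How the weights are chosen is left open here.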
Pages: 14