Ensemble Deep Learning for Multilabel Binary Classification of User-Generated Content

Cited by: 36
Authors
Haralabopoulos, Giannis [1]
Anagnostopoulos, Ioannis [2]
McAuley, Derek [1]
Affiliations
[1] Univ Nottingham, Sch Comp Sci, Nottingham NG8 1BB, England
[2] Univ Thessaly, Dept Comp Sci & Biomed Informat, Lamia 35131, Greece
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
Keywords
ensemble learning; sentiment analysis; multilabel classification; deep neural networks; pure emotion; Semeval 2018 Task 1; toxic comment classification; SENTIMENT ANALYSIS; DIFFERENTIAL EVOLUTION; NEURAL-NETWORKS
DOI
10.3390/a13040083
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Sentiment analysis usually refers to the analysis of human-generated content via a polarity filter. Affective computing deals with the exact emotions conveyed through information. Emotional information most frequently cannot be accurately described by a single emotion class. Multilabel classifiers can categorize human-generated content into multiple emotional classes. Ensemble learning can improve the statistical, computational and representation aspects of such classifiers. We present a baseline stacked ensemble and propose a weighted ensemble. Our proposed weighted ensemble can use multiple classifiers to improve classification results without hyperparameter tuning or data overfitting. We evaluate our ensemble models with two datasets. The first dataset is from SemEval-2018 Task 1 and contains almost 7000 Tweets, labeled with 11 sentiment classes. The second dataset is the Toxic Comment Dataset with more than 150,000 comments, labeled with six different levels of abuse or harassment. Our results suggest that ensemble learning improves classification results by 1.5% to 5.4%.
Pages: 14