Synthetic minority oversampling in addressing imbalanced sarcasm detection in social media

Cited by: 17
Authors
Banerjee, Arghasree [1 ]
Bhattacharjee, Mayukh [1 ]
Ghosh, Kushankur [1 ]
Chatterjee, Sankhadeep [1 ]
Affiliation
[1] Univ Engn & Management, Dept Comp Sci & Engn, Kolkata, India
Keywords
Sarcasm detection; SMOTE; Social media; Imbalanced class; Social emotion; Affective computing; Decision tree; Sentiment analysis; Naive Bayes; Classification; Algorithm
DOI
10.1007/s11042-020-09138-4
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Recent developments in sarcasm detection have emerged as extremely successful tools in social media opinion mining. With the advent of machine learning tools, accurate detection has become possible. However, the social media data used to train machine learning models is often ill-suited because of highly imbalanced classes. In the absence of any thorough study on the effect of imbalanced classes in sarcasm detection for social media opinion mining, the current article proposes synthetic minority oversampling-based methods to mitigate the issue of imbalanced classes, which can severely affect classifier performance in social media sarcasm detection. In the current study, five different variants of the synthetic minority oversampling technique (SMOTE) have been used on two datasets of varying sizes. The trustworthiness of these methods is judged by training and testing six well-known classifiers and measuring their performance with metrics derived from the test-phase confusion matrix. The experimental results indicate that SMOTE and BorderlineSMOTE-1 are extremely successful in improving classifier performance. A thorough analysis has been performed to better understand the effect of imbalanced classes in social media sarcasm detection.
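
For readers unfamiliar with the technique named in the abstract, the following is a minimal sketch of the basic SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority neighbours. It is not the authors' code and does not cover BorderlineSMOTE-1 or the other variants they evaluate. The function name `smote`, its parameters, and the toy two-dimensional data are assumptions made for illustration; in a sarcasm detection setting the feature vectors would come from some text vectorisation step that is not shown here.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Illustrative sketch only: `minority` is a list of numeric feature
    vectors (lists of floats) that all belong to the minority class.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority neighbours of x (squared Euclidean distance),
        # excluding x itself.
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)),
        )[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from x to nb
        synthetic.append([a + gap * (b - a) for a, b in zip(x, nb)])
    return synthetic


# Toy data (hypothetical): 20 majority samples versus 4 minority samples.
majority = [[float(i), float(i % 3)] for i in range(20)]
minority = [[0.0, 10.0], [1.0, 11.0], [2.0, 10.5], [1.5, 12.0]]

# Oversample the minority class until both classes have the same size.
new_points = smote(minority, n_new=len(majority) - len(minority), k=3)
balanced_minority = minority + new_points
```

In a setup like the one the abstract describes, oversampling of this kind would normally be applied to the training split only, so that the test-phase confusion matrix is computed on real samples.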
Pages: 35995 - 36031
Page count: 37
Related papers
50 in total
  • [1] Synthetic minority oversampling in addressing imbalanced sarcasm detection in social media
    Arghasree Banerjee
    Mayukh Bhattacharjee
    Kushankur Ghosh
    Sankhadeep Chatterjee
    [J]. Multimedia Tools and Applications, 2020, 79 : 35995 - 36031
  • [2] Sarcasm Detection in Social Media Based on Imbalanced Classification
    Liu, Peng
    Chen, Wei
    Ou, Gaoyan
    Wang, Tengjiao
    Yang, Dongqing
    Lei, Kai
    [J]. WEB-AGE INFORMATION MANAGEMENT, WAIM 2014, 2014, 8485 : 459 - 471
  • [3] An improved and random synthetic minority oversampling technique for imbalanced data
    Wei, Guoliang
    Mu, Weimeng
    Song, Yan
    Dou, Jun
    [J]. KNOWLEDGE-BASED SYSTEMS, 2022, 248
  • [4] Imbalanced Classification Based on Minority Clustering Synthetic Minority Oversampling Technique With Wind Turbine Fault Detection Application
    Yi, Huaikuan
    Jiang, Qingchao
    Yan, Xuefeng
    Wang, Bei
    [J]. IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2021, 17 (09) : 5867 - 5875
  • [5] A Novel Synthetic Minority Oversampling Technique for Imbalanced Data Set Learning
    Barua, Sukarna
    Islam, Md. Monirul
    Murase, Kazuyuki
    [J]. NEURAL INFORMATION PROCESSING, PT II, 2011, 7063 : 735 - +
  • [6] A Synthetic Minority Based on Probabilistic Distribution (SyMProD) Oversampling for Imbalanced Datasets
    Kunakorntum, Intouch
    Hinthong, Woranich
    Phunchongharn, Phond
    [J]. IEEE ACCESS, 2020, 8 : 114692 - 114704
  • [7] Performance of Synthetic Minority Oversampling Technique on Imbalanced Breast Cancer Data
    Rani, K. Usha
    Ramadevi, G. Naga
    Lavanya, D.
    [J]. PROCEEDINGS OF THE 10TH INDIACOM - 2016 3RD INTERNATIONAL CONFERENCE ON COMPUTING FOR SUSTAINABLE GLOBAL DEVELOPMENT, 2016 : 1623 - 1627
  • [8] A minority oversampling approach for fault detection with heterogeneous imbalanced data
    Liu, Jie
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2021, 184
  • [9] Fuzzy-synthetic minority oversampling technique: Oversampling based on fuzzy set theory for Android malware detection in imbalanced datasets
    Xu, Yanping
    Wu, Chunhua
    Zheng, Kangfeng
    Niu, Xinxin
    Yang, Yixian
    [J]. INTERNATIONAL JOURNAL OF DISTRIBUTED SENSOR NETWORKS, 2017, 13 (04):
  • [10] Minority oversampling for imbalanced ordinal regression
    Zhu, Tuanfei
    Lin, Yaping
    Liu, Yonghe
    Zhang, Wei
    Zhang, Jianming
    [J]. KNOWLEDGE-BASED SYSTEMS, 2019, 166 : 140 - 155