Synthetic minority oversampling in addressing imbalanced sarcasm detection in social media

Authors
Arghasree Banerjee
Mayukh Bhattacharjee
Kushankur Ghosh
Sankhadeep Chatterjee
Affiliation
[1] University of Engineering & Management, Department of Computer Science & Engineering
Source
Multimedia Tools and Applications | 2020 / Vol. 79
Keywords
Sarcasm detection; SMOTE; Social media; Imbalanced class; Social emotion; Affective computing
DOI
Not available
Abstract
Recent developments in sarcasm detection have emerged as extremely successful tools in social media opinion mining. With the advent of machine learning tools, accurate detection has become possible. However, the social media data used to train machine learning models is often ill-suited because of highly imbalanced classes. In the absence of any thorough study on the effect of imbalanced classes in sarcasm detection for social media opinion mining, the current article proposes synthetic minority oversampling based methods to mitigate class imbalance, which can severely affect classifier performance in social media sarcasm detection. In the current study, five variants of the synthetic minority oversampling technique (SMOTE) are applied to two datasets of different sizes. Their trustworthiness is judged by training and testing six well-known classifiers and measuring their performance with metrics derived from the test-phase confusion matrix. The experimental results indicate that SMOTE and BorderlineSMOTE-1 are extremely successful in improving classifier performance. A thorough analysis is performed to better understand the effect of imbalanced classes in social media sarcasm detection.
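The core idea behind synthetic minority oversampling can be sketched in a few lines. The block below is a minimal illustration of the basic SMOTE interpolation step only, written with the Python standard library. It is not the authors' implementation, and it does not cover the Borderline variants mentioned in the abstract. The function names `smote` and `balance`, the default `k=5`, and the seeding scheme are all assumptions made for this example.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a randomly
    chosen minority sample and one of its k nearest minority neighbors."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbors than other samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbor = minority[rng.choice(others[:k])]
        # Place the new point at a random position on the segment base -> neighbor.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


def balance(X, y, minority_label, k=5, seed=0):
    """Oversample the minority class until it matches the size of the rest."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_other = len(y) - len(minority)
    new_points = smote(minority, n_other - len(minority), k=k, seed=seed)
    return X + new_points, y + [minority_label] * len(new_points)
```

In a sarcasm detection setting, `X` would hold numeric feature vectors extracted from posts (for example TF-IDF or embedding features, which is an assumption here) and `minority_label` would mark the rarer class. Oversampling should be applied to the training split only, so that the test-phase confusion matrix reflects the original class distribution.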
Pages: 35995-36031
Page count: 36