Use of Data Augmentation Techniques in Detection of Antisocial Behavior Using Deep Learning Methods

Cited by: 7
Authors
Maslej-Kresnakova, Viera [1]
Sarnovsky, Martin [1]
Jackova, Julia [1]
Affiliations
[1] Tech Univ Kosice, Fac Elect Engn & Informat, Dept Cybernet & Artificial Intelligence, Kosice 04001, Slovakia
Source
FUTURE INTERNET | 2022 / Vol. 14 / Issue 09
Keywords
data augmentation; EDA; deep learning; antisocial behavior; fake news detection; toxic comments;
DOI
10.3390/fi14090260
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
This paper examines data augmentation techniques for the detection of antisocial behavior. Data augmentation is frequently used to overcome a lack of data or imbalanced classes: it generates artificial samples that enlarge the training set or balance the target distribution. In antisocial behavior detection, we frequently face both issues: a lack of quality labeled data and class imbalance. As the majority of the data in this domain is textual, we must consider augmentation methods suitable for NLP tasks. Easy data augmentation (EDA) is a group of such methods that use simple text transformations to create new, artificial samples. Our main motivation is to explore the usability of EDA techniques on selected tasks from the antisocial behavior detection domain. We focus on the class imbalance problem and apply EDA techniques to two problems: fake news and toxic comments classification. In both cases, we train a convolutional neural network classifier and compare its performance on the original and EDA-extended datasets. EDA techniques prove to be very task-dependent, with certain limitations resulting from the data they are applied to. On the extended toxic comments dataset, the model's performance improved only marginally, gaining 0.01 in the F1 metric when only a subset of EDA methods was applied. In this case, EDA techniques were not well suited to texts written in more informal language. On the fake news dataset, by contrast, performance improved more substantially, with the F1 score rising by 0.1. The improvement was most significant for the minority class, where F1 improved from 0.67 to 0.86.
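The kind of simple text transformation the abstract describes, applied to rebalance a minority class, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`random_swap`, `random_deletion`, `augment`, `balance_with_augmentation`), the default parameters, and the binary-label setup are all hypothetical. Only two word-level transformations are shown (swapping and deleting words); synonym-based operations are omitted because they would need an external thesaurus.

```python
import random


def random_swap(words, n_swaps, rng):
    """Return a copy of `words` with `n_swaps` random pairs of positions exchanged."""
    out = list(words)
    if len(out) < 2:
        return out
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_deletion(words, p, rng):
    """Drop each word independently with probability `p`, keeping at least one word."""
    if len(words) <= 1:
        return list(words)
    kept = [w for w in words if rng.random() >= p]
    return kept if kept else [rng.choice(words)]


def augment(text, rng, p_delete=0.1, n_swaps=1):
    """Create one artificial variant of `text` using a randomly chosen transformation."""
    words = text.split()
    if rng.random() < 0.5:
        words = random_swap(words, n_swaps, rng)
    else:
        words = random_deletion(words, p_delete, rng)
    return " ".join(words)


def balance_with_augmentation(texts, labels, minority_label, seed=0):
    """Add augmented minority-class samples until both classes have the same size.

    Assumes a binary classification task (hypothetical setup for illustration).
    """
    rng = random.Random(seed)
    minority = [t for t, y in zip(texts, labels) if y == minority_label]
    if not minority:
        return list(texts), list(labels)
    n_needed = (len(texts) - len(minority)) - len(minority)
    new_texts, new_labels = list(texts), list(labels)
    for _ in range(max(0, n_needed)):
        new_texts.append(augment(rng.choice(minority), rng))
        new_labels.append(minority_label)
    return new_texts, new_labels
```

Augmentation of this kind would normally be applied to the training split only, so that evaluation scores such as per-class F1 are still measured on unmodified text.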
Pages: 15