Use of Data Augmentation Techniques in Detection of Antisocial Behavior Using Deep Learning Methods

Cited by: 7
Authors
Maslej-Kresnakova, Viera [1]
Sarnovsky, Martin [1]
Jackova, Julia [1]
Affiliations
[1] Tech Univ Kosice, Fac Elect Engn & Informat, Dept Cybernet & Artificial Intelligence, Kosice 04001, Slovakia
Source
FUTURE INTERNET | 2022 / Vol. 14 / Issue 09
Keywords
data augmentation; EDA; deep learning; antisocial behavior; fake news detection; toxic comments
DOI
10.3390/fi14090260
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The work presented in this paper focuses on data augmentation techniques applied to the detection of antisocial behavior. Data augmentation is a frequently used approach to overcome a lack of data or imbalanced classes. Such techniques generate artificial data samples that increase the volume of the training set or balance the target distribution. In the antisocial behavior detection domain, we frequently face both issues: a lack of quality labeled data and class imbalance. As the majority of the data in this domain is textual, we must consider augmentation methods suitable for NLP tasks. Easy data augmentation (EDA) represents a group of such methods that use simple text transformations to create new, artificial samples. Our main motivation is to explore the usability of EDA techniques on selected tasks from the antisocial behavior detection domain. We focus on the class imbalance problem and apply EDA techniques to two problems: fake news classification and toxic comments classification. In both cases, we train a convolutional neural network classifier and compare its performance on the original and EDA-extended datasets. EDA techniques prove to be very task-dependent, with certain limitations resulting from the data they are applied to. The model's performance on the extended toxic comments dataset improved only marginally, gaining 0.01 in the F1 metric, and only when a subset of the EDA methods was applied. In this case, EDA techniques were not well suited to texts written in more informal language. On the fake news dataset, by contrast, the improvement was more significant, with the F1 score rising by 0.1. The improvement was largest in the prediction of the minority class, where F1 rose from 0.67 to 0.86.
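As an illustration of the kind of simple text transformation the abstract describes, the following is a minimal sketch of two operations commonly associated with EDA, random swap and random deletion, used to oversample a minority class. It is not the authors' implementation: the function names, the 50/50 choice between operations, the deletion probability of 0.1, and the whitespace tokenization are all assumptions made for this example. EDA as usually described also includes synonym replacement and random insertion, which need a thesaurus and are left out here.

```python
import random


def random_swap(words, n_swaps, rng):
    """Return a copy of words with two random positions swapped, n_swaps times."""
    words = list(words)
    if len(words) < 2:
        return words
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return words


def random_deletion(words, p, rng):
    """Drop each word independently with probability p; never return an empty list."""
    if len(words) <= 1:
        return list(words)
    kept = [w for w in words if rng.random() >= p]
    return kept if kept else [rng.choice(words)]


def oversample_minority(texts, labels, minority_label, target_count, seed=0):
    """Append augmented minority-class texts until that class has target_count samples.

    Hypothetical helper: the balancing strategy is an assumption, not taken from the paper.
    """
    rng = random.Random(seed)
    minority = [t for t, y in zip(texts, labels) if y == minority_label]
    new_texts, new_labels = list(texts), list(labels)
    count = len(minority)
    while minority and count < target_count:
        words = rng.choice(minority).split()  # naive whitespace tokenization
        if rng.random() < 0.5:
            words = random_swap(words, 1, rng)
        else:
            words = random_deletion(words, 0.1, rng)
        new_texts.append(" ".join(words))
        new_labels.append(minority_label)
        count += 1
    return new_texts, new_labels
```

Calling `oversample_minority(texts, labels, minority_label=1, target_count=n)` would extend only the training split; the test split should stay untouched so that F1 is measured on original data. Because both operations act on surface word order and presence, they can change the meaning of short or informal texts, which is consistent with the task dependence the abstract reports.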
Pages: 15