DeepCKID: A Multi-Head Attention-Based Deep Neural Network Model Leveraging Classwise Knowledge to Handle Imbalanced Textual Data

Cited: 0
Authors
Sah, Amit Kumar [1 ]
Abulaish, Muhammad [1 ]
Affiliations
[1] South Asian Univ, Dept Comp Sci, New Delhi, India
Source
Machine Learning with Applications
Keywords
Class imbalance; Text classification; Transformers; Deep learning; Multi-Head Attention; Pre-trained Language Models
DOI
10.1016/j.mlwa.2024.100575
CLC Classification
TP18 [Artificial Intelligence Theory]
Subject Classification
081104; 0812; 0835; 1405
Abstract
This paper presents DeepCKID, a Multi-Head Attention (MHA)-based deep learning model that exploits statistical and semantic knowledge about documents across the different classes of a dataset to improve the detection of minority-class instances in imbalanced text classification. For each document, DeepCKID extracts (i) word-level statistical and semantic knowledge, namely the class correlation and class similarity of each word based on its association with the different classes in the dataset, and (ii) class-level knowledge from the document using n-grams and relation triplets corresponding to the classwise keywords present, identified via cosine similarity using Transformer-based Pre-trained Language Models (PLMs). DeepCKID encodes the word-level and class-level features with deep convolutional networks, which learn meaningful patterns from them. It first combines the semantically meaningful Sentence-BERT document embedding with the word-level feature matrix to form the final document representation, which it then fuses with the different classwise encoded representations to strengthen feature propagation. DeepCKID passes the encoded document representation and its classwise representations through an MHA layer to identify important features at different positions of the feature subspaces, yielding a latent dense vector that accentuates the document's association with a particular class. Finally, it passes the latent vector to a softmax layer to predict the class label. We evaluate DeepCKID on six publicly available Amazon reviews datasets using four Transformer-based PLMs, and compare it against three existing approaches and four ablation-like baselines. In most cases, DeepCKID outperforms all comparison approaches, including the baselines.
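The abstract describes the architecture only at a high level. The following PyTorch snippet is a minimal, illustrative reading of that fusion-and-attention pipeline, assuming the fused document representation acts as the attention query over the classwise representations; the model name DeepCKIDSketch, every layer size, the shared convolutional encoders, and all tensor shapes are hypothetical stand-ins, not the authors' reported configuration.

```python
# Illustrative sketch only: CNN-encoded word-level and classwise feature
# matrices, fusion with a precomputed Sentence-BERT document embedding,
# a Multi-Head Attention layer, and a softmax classifier, loosely following
# the pipeline described in the abstract. All dimensions are assumptions.
import torch
import torch.nn as nn


class DeepCKIDSketch(nn.Module):
    def __init__(self, sbert_dim=384, word_feat_dim=4,
                 num_classes=2, hidden_dim=256, num_heads=4):
        super().__init__()
        # Convolutional encoder for the word-level knowledge matrix
        # (e.g., class-correlation / class-similarity scores per word).
        self.word_encoder = nn.Sequential(
            nn.Conv1d(word_feat_dim, hidden_dim, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveMaxPool1d(1),
        )
        # A single convolutional encoder shared across the classwise
        # knowledge matrices (n-gram / relation-triplet features per class).
        self.class_encoder = nn.Sequential(
            nn.Conv1d(word_feat_dim, hidden_dim, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveMaxPool1d(1),
        )
        # Project the fused document vector (SBERT embedding + encoded
        # word-level features) into the shared hidden space.
        self.doc_proj = nn.Linear(sbert_dim + hidden_dim, hidden_dim)
        # MHA: the document representation attends over the classwise ones.
        self.mha = nn.MultiheadAttention(hidden_dim, num_heads,
                                         batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, sbert_emb, word_feats, class_feats):
        # sbert_emb:   (B, sbert_dim)                precomputed embedding
        # word_feats:  (B, word_feat_dim, seq_len)   per-word knowledge
        # class_feats: (B, C, word_feat_dim, seq_len) one matrix per class
        B, C = class_feats.shape[:2]
        w = self.word_encoder(word_feats).squeeze(-1)           # (B, H)
        doc = self.doc_proj(torch.cat([sbert_emb, w], dim=-1))  # (B, H)

        # Encode each classwise matrix, then fuse the document vector into
        # it, mirroring the "strengthen feature propagation" step.
        c = self.class_encoder(class_feats.flatten(0, 1))       # (B*C, H, 1)
        c = c.squeeze(-1).view(B, C, -1) + doc.unsqueeze(1)     # (B, C, H)

        # Document query attends over the classwise representations.
        attn_out, _ = self.mha(doc.unsqueeze(1), c, c)          # (B, 1, H)
        return self.classifier(attn_out.squeeze(1))             # logits


if __name__ == "__main__":
    model = DeepCKIDSketch()
    logits = model(torch.randn(8, 384),          # stand-in SBERT embeddings
                   torch.randn(8, 4, 128),       # stand-in word features
                   torch.randn(8, 2, 4, 128))    # stand-in classwise features
    probs = logits.softmax(dim=-1)               # class probabilities
    print(probs.shape)                           # torch.Size([8, 2])
```

The adaptive pooling makes the convolutional encoders length-agnostic, so the sketch accepts any sequence length; the random tensors merely stand in for the class-correlation, class-similarity, and classwise keyword features the paper actually computes.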
Pages: 18
Related Papers (50 total)
  • [1] Multi-Head Attention-Based Hybrid Deep Neural Network for Aeroengine Risk Assessment
    Li, Jian-Hang
    Gao, Xin-Yue
    Lu, Xiang
    Liu, Guo-Dong
    [J]. IEEE ACCESS, 2023, 11 : 113376 - 113389
  • [2] An Improved Model for Analyzing Textual Sentiment Based on a Deep Neural Network Using Multi-Head Attention Mechanism
    Sharaf Al-deen, Hashem Saleh
    Zeng, Zhiwen
    Al-sabri, Raeed
    Hekmat, Arash
    [J]. APPLIED SYSTEM INNOVATION, 2021, 4 (04)
  • [3] Data-driven fiber model based on the deep neural network with multi-head attention mechanism
    Zang, Yubin
    Yu, Zhenming
    Xu, Kun
    Chen, Minghua
    Yang, Sigang
    Chen, Hongwei
    [J]. OPTICS EXPRESS, 2022, 30 (26) : 46626 - 46648
  • [4] Multi-head attention-based model for reconstructing continuous missing time series data
    Wu, Huafeng
    Zhang, Yuxuan
    Liang, Linian
    Mei, Xiaojun
    Han, Dezhi
    Han, Bing
    Weng, Tien-Hsiung
    Li, Kuan-Ching
    [J]. JOURNAL OF SUPERCOMPUTING, 2023, 79 (18): 20684 - 20711
  • [5] A Reverse Positional Encoding Multi-Head Attention-Based Neural Machine Translation Model for Arabic Dialects
    Baniata, Laith H.
    Kang, Sangwoo
    Ampomah, Isaac K. E.
    [J]. MATHEMATICS, 2022, 10 (19)
  • [6] Enhancing Recommendation Capabilities Using Multi-Head Attention-Based Federated Knowledge Distillation
    Wu, Aming
    Kwon, Young-Woo
    [J]. IEEE ACCESS, 2023, 11 : 45850 - 45861
  • [7] Self Multi-Head Attention-based Convolutional Neural Networks for fake news detection
    Fang, Yong
    Gao, Jian
    Huang, Cheng
    Peng, Hua
    Wu, Runpu
    [J]. PLOS ONE, 2019, 14 (09)
  • [8] A Novel Knowledge Tracing Model Based on Collaborative Multi-Head Attention
    Zhang Wei
    Qu Kaiyuan
    Han Yahui
    Tan Longan
    [J]. 6TH INTERNATIONAL CONFERENCE ON INNOVATION IN ARTIFICIAL INTELLIGENCE, ICIAI2022, 2022, : 210 - 215
  • [9] Multiscaled Multi-Head Attention-Based Video Transformer Network for Hand Gesture Recognition
    Garg, Mallika
    Ghosh, Debashis
    Pradhan, Pyari Mohan
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2023, 30 : 80 - 84