Position-context additive transformer-based model for classifying text data on social media

Authors
M. M. Abd-Elaziz [1]
Nora El-Rashidy [2]
Ahmed Abou Elfetouh [1]
Hazem M. El-Bakry [1]
Affiliations
[1] Mansoura University, Information Systems Department, Faculty of Computers and Information Sciences
[2] Kaferelshikh University, Machine Learning and Information Retrieval Department, Faculty of Artificial Intelligence
Keywords
Social media; Transformer-based model; Word embedding; Bi-LSTM network; Additive attention
DOI
10.1038/s41598-025-90738-1
Abstract
In recent years, the continuous growth of text data on social media has been a major reason to rely on pre-training to develop new text classification models, especially transformer-based models, which have proven effective in most natural language processing tasks. This paper introduces a new Position-Context Additive transformer-based model (PCA model) that consists of two phases and aims to increase the accuracy of text classification on social media. Phase I develops a new way to extract text features by attending to the position and context of each word in the input layer. It does so by integrating an improved word embedding method (the position) with a modified Bi-LSTM network that strengthens the focus on how each word connects to the words around it (the context). Phase II develops a transformer-based model built primarily on an improved additive attention mechanism. The PCA model was tested on classifying health-related social media texts in six datasets. Compared with the best published results, the F1-score improved by between 0.2 and 10.2% on five datasets. The PCA model was also compared with three transformer-based models known for high accuracy in text classification; in these experiments it outperformed the other models on four datasets, improving the F1-score by between 0.1 and 2.1%. The results also indicate a direct correlation between the volume of training data and performance: larger training sets are associated with larger F1-score improvements.
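For readers unfamiliar with the term, the sketch below shows standard additive attention used to pool a sequence of per-word vectors (such as Bi-LSTM outputs) into one context vector. It is not the PCA model's improved mechanism, which the abstract does not specify. The function name `additive_attention`, the parameters `W`, `b`, and `v`, and the toy values are all assumptions made for illustration.

```python
import math


def additive_attention(hidden_states, W, b, v):
    """Pool a sequence of vectors with standard additive attention.

    hidden_states: list of T vectors of length d (e.g. per-word Bi-LSTM outputs)
    W: projection matrix as a list of a rows, each of length d
    b: bias vector of length a
    v: scoring vector of length a
    Returns (weights, context): T attention weights and a length-d vector.
    """
    scores = []
    for h in hidden_states:
        # u = tanh(W h + b), then score = v . u
        u = [
            math.tanh(sum(w_ij * h_j for w_ij, h_j in zip(row, h)) + b_i)
            for row, b_i in zip(W, b)
        ]
        scores.append(sum(v_i * u_i for v_i, u_i in zip(v, u)))

    # Numerically stable softmax over the T scores
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Context vector: attention-weighted sum of the hidden states
    d = len(hidden_states[0])
    context = [
        sum(w * h[j] for w, h in zip(weights, hidden_states)) for j in range(d)
    ]
    return weights, context


# Toy input (hypothetical values): three "words", each a 2-dimensional vector
H = [[0.1, 0.2], [0.9, -0.4], [0.3, 0.8]]
W = [[0.5, -0.3], [0.2, 0.7], [-0.6, 0.1]]
b = [0.0, 0.1, -0.1]
v = [1.0, -0.5, 0.25]

weights, context = additive_attention(H, W, b, v)
print(weights)  # one weight per word, summing to 1
print(context)  # pooled representation of the sequence
```

In a trained model, `W`, `b`, and `v` would be learned parameters, and the context vector would feed a classification layer.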