Using Transformer Based Ensemble Learning to Classify Scientific Articles

Cited by: 4
Authors
Ghosh, Sohom [1]
Chopra, Ankush [1]
Affiliations
[1] Fidel Investments, Artificial Intelligence, CoE, Bengaluru, Karnataka, India
Keywords
Scientific text classification; Ensemble learning; Transformers
DOI
10.1007/978-3-030-75015-2_11
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Reviewers often fail to appreciate a researcher's novel ideas and provide only generic feedback. Proper assignment of reviewers based on their area of expertise is therefore necessary. Moreover, reading every paper end to end in order to assign it to a reviewer is tedious. In this paper, we describe the system that our team, FideLIPI, submitted to the SDPRA-2021 shared task (https://sdpra2021.github.io/website/, accessed January 25, 2021) [14]. It comprises four independent sub-systems, each capable of classifying abstracts of scientific literature into one of seven given classes. The first is a RoBERTa [10] based model built over these abstracts. Adding topic-model features based on Latent Dirichlet Allocation (LDA) [2] to the first model yields the second sub-system. The third is a sentence-level RoBERTa [10] model. The fourth is a Logistic Regression model built using Term Frequency-Inverse Document Frequency (TF-IDF) features. We ensemble the predictions of these four sub-systems using majority voting to obtain the final system, which achieves an F1 score of 0.93 on both the test and validation sets. This outperforms the existing state-of-the-art (SOTA) model, SciBERT [1], in terms of F1 score on the validation set. Our codebase is available at https://github.com/SDPRA-2021/shared-task/tree/main/FideLIPI.
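The majority-voting step described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `majority_vote`, the placeholder class labels, and in particular the tie-breaking rule (with four voters a 2-2 split is possible, and the abstract does not say how it is resolved) are assumptions made for illustration only.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[str]]) -> List[str]:
    """Combine per-sub-system label predictions by majority voting.

    predictions[i][j] is the label that sub-system i assigns to abstract j.
    ASSUMPTION: ties are resolved in favour of the earliest sub-system (in
    list order) whose label reached the top vote count. The source does not
    specify a tie-breaking rule.
    """
    if not predictions:
        return []
    n_items = len(predictions[0])
    if any(len(p) != n_items for p in predictions):
        raise ValueError("all sub-systems must predict the same number of items")

    final = []
    for votes in zip(*predictions):  # votes for one abstract, one per sub-system
        counts = Counter(votes)
        top = max(counts.values())
        # Pick the first vote (in sub-system order) that has the top count.
        final.append(next(v for v in votes if counts[v] == top))
    return final


# Hypothetical labels "A", "B", "C" stand in for the seven task classes.
sub_system_outputs = [
    ["A", "B", "C"],  # e.g. abstract-level RoBERTa
    ["A", "B", "A"],  # e.g. RoBERTa + LDA features
    ["B", "B", "C"],  # e.g. sentence-level RoBERTa
    ["A", "C", "A"],  # e.g. TF-IDF + Logistic Regression
]
ensemble_labels = majority_vote(sub_system_outputs)
```

Ordering the sub-systems from most to least trusted makes the assumed tie-break favour the stronger model; other choices (for example, using predicted probabilities) would be equally plausible.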
Pages: 106-113
Number of pages: 8
Related papers
50 in total
  • [1] Using hedges to classify citations in scientific articles
    Di Marco, C
    Kroon, FW
    Mercer, RE
    COMPUTING ATTITUDE AND AFFECT IN TEXT: THEORY AND APPLICATIONS, 2006, 20 : 247 - +
  • [2] Using Graph-Based Ensemble Learning to Classify Imbalanced Data
    Qin, Anyong
    Shang, Zhaowei
    Tian, Jinyu
    Zhang, Taiping
    Wang, Yulong
    Tang, Yuan Yan
    2017 3RD IEEE INTERNATIONAL CONFERENCE ON CYBERNETICS (CYBCONF), 2017, : 265 - 270
  • [3] A Machine Learning Approach to classify News Articles based on Location
    Rao, Vignesh
    Sachdev, Jayant
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON INTELLIGENT SUSTAINABLE SYSTEMS (ICISS 2017), 2017, : 863 - 867
  • [4] Breast Cancer Detection Using Transformer and BiLSTM Based Ensemble Learning
    Yilmaz, Rabia Eda
    Serbes, Görkem
    2023 31ST SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE, SIU, 2023,
  • [5] Classification of EEG signals using Transformer based deep learning and ensemble models
    Zeynali, Mahsa
    Seyedarabi, Hadi
    Afrouzian, Reza
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2023, 86
  • [6] Transformer Fault Diagnosis Based on Stacking Ensemble Learning
    Wang, Xue
    Han, Tao
    IEEJ TRANSACTIONS ON ELECTRICAL AND ELECTRONIC ENGINEERING, 2020, 15 (12) : 1734 - 1739
  • [7] Power Transformer Fault Diagnosis Based on Ensemble Learning
    Zhou, Wei
    Li, Yang
    2024 IEEE 2ND INTERNATIONAL CONFERENCE ON POWER SCIENCE AND TECHNOLOGY, ICPST 2024, 2024, : 1070 - 1075
  • [8] Using Transfer Learning, SVM and Ensemble Classification to classify Baby Cries based on their Spectrogram Images
    Le, Lillian
    Kabir, Abu Nadim M. H.
    Ji, Chunyan
    Basodi, Sunitha
    Pan, Yi
    2019 IEEE 16TH INTERNATIONAL CONFERENCE ON MOBILE AD HOC AND SENSOR SYSTEMS WORKSHOPS (MASSW 2019), 2019, : 106 - 110
  • [9] Using microscopic imaging and ensemble deep learning to classify the provenance of archaeological ceramics
    Qian Wang
    Xuan Xiao
    Zi Liu
    Scientific Reports, 14 (1)
  • [10] Classify Blog Articles Using Queried Keywords
    Chen, Yi-Hui
    Lu, Eric Jui-Lin
    Wu, Tsai Ying
    Lin, Tsung Hau
    INTELLIGENT SYSTEMS AND APPLICATIONS (ICS 2014), 2015, 274 : 1062 - 1068