Using Transformer Based Ensemble Learning to Classify Scientific Articles

Cited by: 4
Authors
Ghosh, Sohom [1 ]
Chopra, Ankush [1 ]
Affiliations
[1] Fidel Investments, Artificial Intelligence, CoE, Bengaluru, Karnataka, India
Keywords
Scientific text classification; Ensemble learning; Transformers
DOI
10.1007/978-3-030-75015-2_11
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Reviewers often fail to appreciate a researcher's novel ideas and provide only generic feedback. Proper assignment of reviewers based on their area of expertise is therefore necessary. Moreover, reading every paper end to end in order to assign it to a reviewer is a tedious task. In this paper, we describe a system which our team FideLIPI submitted to the shared task of SDPRA-2021 (https://sdpra2021.github.io/website/ (accessed January 25, 2021)) [14]. It comprises four independent sub-systems, each capable of classifying abstracts of scientific literature into one of seven given classes. The first is a RoBERTa [10] based model built over these abstracts. Adding topic-model features based on Latent Dirichlet Allocation (LDA) [2] to the first model yields the second sub-system. The third is a sentence-level RoBERTa [10] model. The fourth is a Logistic Regression model built using Term Frequency-Inverse Document Frequency (TF-IDF) features. We ensemble the predictions of these four sub-systems using majority voting to develop the final system, which gives an F1 score of 0.93 on both the test and validation sets. This outperforms the existing state-of-the-art (SOTA) model, SciBERT [1], in terms of F1 score on the validation set. Our codebase is available at https://github.com/SDPRA-2021/shared-task/tree/main/FideLIPI.
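The majority-voting step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (their code is at the repository linked above). The function names `majority_vote` and `ensemble`, the placeholder class labels, and in particular the tie-breaking rule are assumptions made for illustration. With four voters a 2-2 tie is possible, and the abstract does not say how such ties are resolved; here the tied label that was voted first wins.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(votes: Sequence[str]) -> str:
    """Return the most frequent label among the sub-system votes.

    Assumption (not stated in the abstract): when several labels are
    tied, the one whose first vote appears earliest in `votes` wins.
    """
    if not votes:
        raise ValueError("need at least one vote")
    counts = Counter(votes)
    best = max(counts.values())
    # Walk the votes in order so the earliest tied label is returned.
    for label in votes:
        if counts[label] == best:
            return label
    raise AssertionError("unreachable")


def ensemble(predictions: Sequence[Sequence[str]]) -> List[str]:
    """Combine per-model prediction lists, one list per sub-system."""
    # zip(*predictions) regroups the labels so that each tuple holds
    # the votes of all sub-systems for a single abstract.
    return [majority_vote(votes) for votes in zip(*predictions)]


if __name__ == "__main__":
    # Hypothetical predictions for three abstracts; "A".."D" are
    # placeholder labels, not the shared task's actual classes.
    roberta = ["A", "B", "C"]
    roberta_lda = ["A", "B", "D"]
    sentence_roberta = ["A", "C", "D"]
    tfidf_logreg = ["B", "C", "C"]
    print(ensemble([roberta, roberta_lda, sentence_roberta, tfidf_logreg]))
```

Listing the sub-systems in a fixed order makes the tie-breaking deterministic: in this sketch a 2-2 split falls back to whichever tied label the earlier-listed model predicted.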
Pages: 106-113
Number of pages: 8