Using Transformer Based Ensemble Learning to Classify Scientific Articles

Cited by: 4
Authors
Ghosh, Sohom [1]
Chopra, Ankush [1]
Affiliations
[1] Fidelity Investments, Artificial Intelligence, CoE, Bengaluru, Karnataka, India
Keywords
Scientific text classification; Ensemble learning; Transformers
DOI
10.1007/978-3-030-75015-2_11
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Reviewers often fail to appreciate a researcher's novel ideas and provide only generic feedback. Thus, proper assignment of reviewers based on their area of expertise is necessary. Moreover, reading every paper end to end in order to assign it to a reviewer is a tedious task. In this paper, we describe the system that our team FideLIPI submitted to the shared task of SDPRA-2021 (https://sdpra2021.github.io/website/ (accessed January 25, 2021)) [14]. It comprises four independent sub-systems, each capable of classifying abstracts of scientific literature into one of the seven given classes. The first is a RoBERTa [10] based model built over these abstracts. Adding topic model / Latent Dirichlet Allocation (LDA) [2] based features to the first model yields the second sub-system. The third is a sentence-level RoBERTa [10] model. The fourth is a Logistic Regression model built using Term Frequency Inverse Document Frequency (TF-IDF) features. We ensemble the predictions of these four sub-systems using majority voting to obtain the final system, which gives an F1 score of 0.93 on both the test and validation sets. This outperforms the existing State Of The Art (SOTA) model, SciBERT [1], in terms of F1 score on the validation set. Our codebase is available at https://github.com/SDPRA-2021/shared-task/tree/main/FideLIPI.
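The majority-voting step described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the authors' codebase. The function name `majority_vote`, the placeholder class labels, and in particular the tie-breaking rule (prefer the label from the earliest-listed sub-system among those tied) are assumptions made for illustration; the abstract does not say how ties among four voters are resolved.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(per_model_labels: Sequence[Sequence[str]]) -> List[str]:
    """Combine per-document labels from several classifiers by majority vote.

    per_model_labels[m][i] is the label that model m predicts for document i.
    Assumption (not stated in the abstract): ties are broken in favour of the
    label predicted by the earliest-listed model among the tied labels.
    """
    if not per_model_labels:
        raise ValueError("need predictions from at least one model")
    n_docs = len(per_model_labels[0])
    if any(len(labels) != n_docs for labels in per_model_labels):
        raise ValueError("all models must predict the same number of documents")

    final = []
    # zip(*...) regroups the predictions so each tuple holds one document's votes.
    for votes in zip(*per_model_labels):
        counts = Counter(votes)
        top = max(counts.values())
        # Walk the votes in model order; the first label with the top count wins.
        winner = next(label for label in votes if counts[label] == top)
        final.append(winner)
    return final


# Hypothetical predictions for three abstracts; "A".."D" are placeholder labels.
roberta = ["A", "B", "C"]            # sub-system 1: RoBERTa on abstracts
roberta_lda = ["A", "B", "D"]        # sub-system 2: RoBERTa + LDA features
sentence_roberta = ["A", "C", "D"]   # sub-system 3: sentence-level RoBERTa
tfidf_logreg = ["B", "C", "D"]       # sub-system 4: TF-IDF + Logistic Regression

ensemble = majority_vote([roberta, roberta_lda, sentence_roberta, tfidf_logreg])
print(ensemble)
```

With an even number of voters, two-against-two splits can occur (the second document above is one), so any real implementation needs an explicit tie-breaking policy; ordering the sub-systems by validation performance is one possible choice.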
Pages: 106-113
Number of pages: 8