Using Transformer Based Ensemble Learning to Classify Scientific Articles

Cited by: 4
Authors
Ghosh, Sohom [1]
Chopra, Ankush [1]
Affiliations
[1] Fidelity Investments, Artificial Intelligence CoE, Bengaluru, Karnataka, India
Keywords
Scientific text classification; Ensemble learning; Transformers
DOI
10.1007/978-3-030-75015-2_11
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Reviewers often fail to appreciate a researcher's novel ideas and provide only generic feedback. Thus, assigning reviewers according to their area of expertise is necessary. Moreover, reading every paper from end to end in order to assign it to a reviewer is a tedious task. In this paper, we describe the system that our team, FideLIPI, submitted to the SDPRA-2021 shared task (https://sdpra2021.github.io/website/ (accessed January 25, 2021)) [14]. It comprises four independent sub-systems, each capable of classifying abstracts of scientific literature into one of seven given classes. The first is a RoBERTa [10] based model built over these abstracts. Adding topic-model features based on Latent Dirichlet Allocation (LDA) [2] to the first model yields the second sub-system. The third is a sentence-level RoBERTa [10] model. The fourth is a Logistic Regression model built using Term Frequency-Inverse Document Frequency (TF-IDF) features. We ensemble the predictions of these four sub-systems using majority voting to develop the final system, which gives an F1 score of 0.93 on both the test and validation sets. This outperforms the existing state-of-the-art (SOTA) model, SciBERT [1], in terms of F1 score on the validation set. Our codebase is available at https://github.com/SDPRA-2021/shared-task/tree/main/FideLIPI.
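The majority-voting step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation (their code is in the repository linked above). The function names `majority_vote` and `ensemble_predict`, the placeholder labels "A", "B", "C", and the tie-breaking rule are all assumptions made here for illustration; the abstract does not say how a 2-2 tie among four sub-systems is resolved, so this sketch simply prefers the label from the sub-system listed first.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(votes: Sequence[str]) -> str:
    """Return the most frequent label among the votes.

    Assumption (not from the paper): ties are broken in favour of the
    label that appears earliest in `votes`, i.e. the earliest-listed model.
    """
    if not votes:
        raise ValueError("votes must not be empty")
    counts = Counter(votes)
    top = max(counts.values())
    # Scan in the original order so the earliest-listed model wins a tie.
    for label in votes:
        if counts[label] == top:
            return label
    raise AssertionError("unreachable: some label always has the top count")


def ensemble_predict(per_model_predictions: Sequence[Sequence[str]]) -> List[str]:
    """Combine per-model label lists (one list per model) document by document."""
    return [majority_vote(doc_votes) for doc_votes in zip(*per_model_predictions)]


# Hypothetical predictions of the four sub-systems for three abstracts.
# "A", "B", "C" are placeholder class names, not the shared-task labels.
roberta = ["A", "B", "C"]
roberta_lda = ["A", "B", "A"]
sentence_roberta = ["B", "B", "C"]
tfidf_logreg = ["A", "C", "A"]

final = ensemble_predict([roberta, roberta_lda, sentence_roberta, tfidf_logreg])
print(final)
```

With an even number of voters, ties are possible, so the order in which the sub-systems are passed matters under this particular tie-breaking rule; a different rule (for example, preferring the model with the best validation score) would be equally plausible.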
Pages: 106-113
Page count: 8
Related papers
50 in total
  • [41] Transformer based ensemble deep learning approach for remote sensing natural scene classification
    Sivasubramanian, Arrun
    Prashanth, V. R.
    Sowmya, V
    Ravi, Vinayakumar
    INTERNATIONAL JOURNAL OF REMOTE SENSING, 2024, 45 (10) : 3289 - 3309
  • [42] Using Ensemble Models to Classify the Sentiment Expressed in Suicide Notes
    McCart, James A.
    Finch, Dezon K.
    Jarman, Jay
    Hickling, Edward
    Lind, Jason D.
    Richardson, Matthew R.
    Berndt, Donald J.
    Luther, Stephen L.
    BIOMEDICAL INFORMATICS INSIGHTS, 2012, 5 : 77 - 85
  • [43] Automated Machine Learning for Information Retrieval in Scientific Articles
    Rakhshani, Hojjat
    Latard, Bastien
    Brevilliers, Mathieu
    Weber, Jonathan
    Lepagnot, Julien
    Forestier, Germain
    Hassenforder, Michel
    Idoumghar, Lhassane
    2020 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2020,
  • [44] Adaptive Learning for Improving Semantic Tagging of Scientific Articles
    Janusz, Andrzej
    Stawicki, Sebastian
    Nguyen, Hung Son
    FEDERATED CONFERENCE ON COMPUTER SCIENCE AND INFORMATION SYSTEMS, 2014, 2014, 2 : 27 - 34
  • [45] Automatic extraction and learning of keyphrases from scientific articles
    HaCohen-Kerner, Y
    Gross, Z
    Masa, A
    COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING, 2005, 3406 : 657 - 669
  • [46] Deep learning by Vision Transformer to classify bacterial and fungal keratitis using different types of anterior segment images
    Won, Yeo Kyoung
    Kim, Choong Han
    Jeon, Jooyoung
    Cha, Jiho
    Lim, Dong Hui
    Computers in Biology and Medicine, 2025, 190
  • [47] Learning to classify short text from scientific documents using topic models with various types of knowledge
    Vo, Duc-Thuan
    Ock, Cheol-Young
    EXPERT SYSTEMS WITH APPLICATIONS, 2015, 42 (03) : 1684 - 1698
  • [48] Using ensemble learning and genetic algorithm on magnetic resonance imaging radiomics to classify molecular subtypes of breast cancer
    Le, Nguyen Quoc Khanh
    Ho, Dang Khanh Ngan
    Ta, Hoang Dang Khoa
    Nguyen, Hieu Trung
    PRECISION MEDICAL SCIENCES, 2023, 12 (02) : 104 - 112
  • [49] Ensemble learning approach to classify user defined functions in Java programs
    Itham, Mohamed
    Kumara, B. T. G. S.
    Ekanayake, E. M. U. W. J. B.
    2021 INTERNATIONAL CONFERENCE ON DECISION AID SCIENCES AND APPLICATION (DASA), 2021,
  • [50] Using ChatGPT for language editing in scientific articles
    Kim, Seong-Gon
    MAXILLOFACIAL PLASTIC AND RECONSTRUCTIVE SURGERY, 2023, 45 (01)