An ensemble model for classifying idioms and literal texts using BERT and RoBERTa

Cited by: 55
Authors
Briskilal, J. [1]
Subalalitha, C. N. [1]
Affiliations
[1] SRM Inst Sci & Technol, Chengalpattu, Tamil Nadu, India
Keywords
BERT; RoBERTa; Ensemble model; Idiom; Literal classification
DOI
10.1016/j.ipm.2021.102756
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
An idiom is a common phrase that means something other than its literal meaning. Detecting idioms automatically is a serious challenge in natural language processing (NLP) applications such as information retrieval (IR), machine translation and chatbots, and automatic idiom detection plays an important role in all of them. A fundamental NLP task is text classification, also known as text labeling or categorization, which assigns text to structured categories. This paper treats idiom identification as a text classification task. Pre-trained deep learning models have been used for several text classification tasks; however, models like BERT and RoBERTa have not been used specifically for idiom and literal classification. We propose a predictive ensemble model to classify idioms and literals using BERT and RoBERTa, fine-tuned with the TroFi dataset. The model is tested with a newly created in-house dataset of 1470 idiomatic and literal expressions, annotated by domain experts. Our model outperforms the baseline models on the metrics considered, such as F-score and accuracy, with a 2% improvement in accuracy.
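The abstract does not say how the BERT and RoBERTa predictions are combined. As one possible reading, the sketch below uses soft voting: each model's logits are converted to class probabilities, the probabilities are averaged, and the higher-scoring class wins. The function names, the label order and the equal weights are assumptions made for illustration, not details taken from the paper, and the logits are stand-ins for the outputs of two fine-tuned classifiers.

```python
import math

# Assumed label order; the paper's actual encoding is not given in the abstract.
LABELS = ("literal", "idiom")


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_vote(bert_logits, roberta_logits, weights=(0.5, 0.5)):
    """Weighted average of the two models' class probabilities.

    Hypothetical helper: equal weights are an assumption, not the paper's setting.
    Returns the predicted label and the combined probability vector.
    """
    p_bert = softmax(bert_logits)
    p_roberta = softmax(roberta_logits)
    w_bert, w_roberta = weights
    combined = [w_bert * pb + w_roberta * pr
                for pb, pr in zip(p_bert, p_roberta)]
    best = max(range(len(combined)), key=combined.__getitem__)
    return LABELS[best], combined


if __name__ == "__main__":
    # Made-up logits for a single sentence, one pair per model.
    label, probs = soft_vote([2.0, 0.0], [0.0, 1.0])
    print(label, [round(p, 3) for p in probs])
```

With equal weights, a confident model can outvote a less confident one even when the two disagree, which is the usual motivation for averaging probabilities rather than taking a hard majority vote between two classifiers.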
Pages: 9