An ensemble model for classifying idioms and literal texts using BERT and RoBERTa

Cited by: 55
Authors
Briskilal, J. [1 ]
Subalalitha, C. N. [1 ]
Affiliations
[1] SRM Inst Sci & Technol, Chengalpattu, Tamil Nadu, India
Keywords
BERT; RoBERTa; Ensemble model; Idiom; Literal classification;
DOI
10.1016/j.ipm.2021.102756
Chinese Library Classification (CLC)
TP [Automation technology; computer technology];
Discipline classification code
0812
Abstract
An idiom is a common phrase that means something other than its literal meaning. Detecting idioms automatically is a serious challenge in natural language processing (NLP) applications such as information retrieval (IR), machine translation, and chatbots, in all of which automatic idiom detection plays an important role. A fundamental NLP task is text classification, which categorizes text into structured categories, also known as text labeling or categorization. This paper treats idiom identification as a text classification task. Pre-trained deep learning models have been used for several text classification tasks, though models like BERT and RoBERTa have not been used exclusively for idiom and literal classification. We propose a predictive ensemble model to classify idioms and literals using BERT and RoBERTa, fine-tuned with the TroFi dataset. The model is tested on a newly created in-house dataset of 1470 idiomatic and literal expressions annotated by domain experts. Our model outperforms the baseline models on the metrics considered, such as F-score and accuracy, with a 2% improvement in accuracy.
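The abstract describes combining the predictions of fine-tuned BERT and RoBERTa classifiers. A minimal sketch of one common way to do this is soft voting, i.e. averaging the two models' class probabilities; the paper's exact combination rule may differ, and the probabilities below are hypothetical stand-ins for the softmax outputs of the two fine-tuned models.

```python
import numpy as np

def ensemble_predict(probs_bert, probs_roberta):
    """Soft-voting ensemble: average the two models' class probabilities.

    Each input is an (n_samples, 2) array of [literal, idiom] probabilities,
    e.g. the softmax outputs of fine-tuned BERT and RoBERTa classifiers.
    Returns the argmax class per sample: 0 = literal, 1 = idiom.
    """
    avg = (np.asarray(probs_bert) + np.asarray(probs_roberta)) / 2.0
    return avg.argmax(axis=1)

# Hypothetical probabilities for three input sentences:
p_bert = [[0.9, 0.1], [0.4, 0.6], [0.2, 0.8]]
p_roberta = [[0.8, 0.2], [0.3, 0.7], [0.3, 0.7]]
print(ensemble_predict(p_bert, p_roberta).tolist())  # [0, 1, 1]
```

Soft voting lets a confident model outvote an uncertain one, which is one reason probability averaging often edges out hard majority voting for two-model ensembles.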
Pages: 9
Related papers
50 records in total
  • [1] An ensemble model for idioms and literal text classification using knowledge-enabled BERT in deep learning
    Abarna S.
    Sheeba J.I.
    Devaneyan S.P.
    Measurement: Sensors, 2022, 24
  • [2] Arabic ChatGPT Tweets Classification Using RoBERTa and BERT Ensemble Model
    Mujahid, Muhammad
    Kanwal, Khadija
    Rustam, Furqan
    Aljadani, Wajdi
    Ashraf, Imran
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2023, 22 (08)
  • [3] Personality Identification from Social Media Using Ensemble BERT and RoBERTa
    Tsani E.F.
    Suhartono D.
    Informatica (Slovenia), 2023, 47 (04): 537 - 544
  • [4] Analyzing the Performance of Sentiment Analysis using BERT, DistilBERT, and RoBERTa
    Joshy, Archa
    Sundar, Sumod
    2022 IEEE INTERNATIONAL POWER AND RENEWABLE ENERGY CONFERENCE, IPRECON, 2022,
  • [5] The Classification of Short Scientific Texts Using Pretrained BERT Model
    Danilov, Gleb
    Ishankulov, Timur
    Kotik, Konstantin
    Orlov, Yuriy
    Shifrin, Mikhail
    Potapov, Alexander
    PUBLIC HEALTH AND INFORMATICS, PROCEEDINGS OF MIE 2021, 2021, 281 : 83 - 87
  • [6] Classifying Texts with KACC Model
    Li Y.
    Chen Z.
    Xu F.
    Data Analysis and Knowledge Discovery, 2019, 3 (10) : 89 - 97
  • [7] BERT, XLNet or RoBERTa: The Best Transfer Learning Model to Detect Clickbaits
    Rajapaksha, Praboda
    Farahbakhsh, Reza
    Crespi, Noel
    IEEE ACCESS, 2021, 9 : 154704 - 154716
  • [8] EMOTION DETECTION FROM TWEETS USING A BERT AND SVM ENSEMBLE MODEL
    ALBU, Ionut-Alexandru
    SPINU, Stelian
    UNIVERSITY POLITEHNICA OF BUCHAREST SCIENTIFIC BULLETIN SERIES C-ELECTRICAL ENGINEERING AND COMPUTER SCIENCE, 2022, 84 (01): 63 - 74
  • [9] Deep-BERT: Transfer Learning for Classifying Multilingual Offensive Texts on Social Media
    Wadud, Md Anwar Hussen
    Mridha, M. F.
    Shin, Jungpil
    Nur, Kamruddin
    Saha, Aloke Kumar
    COMPUTER SYSTEMS SCIENCE AND ENGINEERING, 2023, 44 (02): 1775 - 1791
  • [10] On the Role of Text Preprocessing in BERT Embedding-based DNNs for Classifying Informal Texts
    Kurniasih A.
    Manik L.P.
    International Journal of Advanced Computer Science and Applications, 2022, 13 (06) : 927 - 934