Identification of Sarcasm in Textual Data: A Comparative Study

Cited by: 0

Authors:

Pulkit Mehndiratta

Devpriya Soni

Affiliation:

[1] Jaypee Institute of Information Technology

Source:

Journal of Data and Information Science | 2019, Vol. 4, No. 4

DOI:

Not available

Abstract:

Purpose: The ever-increasing penetration of the Internet into our lives has led to an enormous amount of multimedia content being generated online, and textual data makes up a major share of it. Understanding people's sentiment is an important aspect of natural language processing, but the inferred opinion can be biased or incorrect if people use sarcasm when commenting, posting status updates, or reviewing a product or a movie. It is therefore of utmost importance to detect sarcasm correctly in order to predict people's intentions accurately.

Design/methodology/approach: This study evaluates various machine learning models, along with standard and hybrid deep learning models, across several standardized datasets. We vectorized the text using word embedding techniques so that the textual data could be analyzed numerically. We used three standardized, publicly available datasets and three word embeddings, i.e., Word2Vec, GloVe, and fastText, to validate the hypothesis.

Findings: The results were analyzed and conclusions drawn. The key finding is that hybrid models combining Bidirectional Long Short-Term Memory (Bi-LSTM) and Convolutional Neural Network (CNN) outperform the other conventional machine learning and deep learning models across all the datasets considered in this study, which validates our hypothesis.

Research limitations: Using data from different sources and customizing the models for each dataset slightly decreases the usability of the technique. Overall, however, the methodology provides effective measures for identifying the presence of sarcasm, with an average accuracy of at least 80% for one dataset and results better than the current baselines for the other datasets.

Practical implications: The results provide solid insights for system developers who want to integrate this model into real-time analysis of reviews or comments posted in the public domain. The study has further practical implications for businesses that depend on user ratings and public opinion, and it offers a starting point for researchers working on sarcasm identification in textual data.

Originality/value: This is a first-of-its-kind study of the difference between conventional and hybrid methods for predicting sarcasm in textual data. The study also provides possible indicators that hybrid models perform better when applied to textual data for sarcasm analysis.
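
The vectorization step described under Design/methodology/approach can be illustrated with a minimal sketch. It is not the authors' pipeline: the tiny embedding table, its three dimensions, and the helper names `tokenize`, `text_to_sequence`, and `mean_vector` are all assumptions made for illustration. In practice the vectors would come from a pretrained Word2Vec, GloVe, or fastText model.

```python
from typing import Dict, List

# Hypothetical 3-dimensional embedding table (illustrative values only).
# Real Word2Vec, GloVe, or fastText vectors usually have hundreds of dimensions.
TOY_EMBEDDINGS: Dict[str, List[float]] = {
    "great": [0.9, 0.1, 0.0],
    "another": [0.1, 0.5, 0.2],
    "monday": [0.0, 0.2, 0.7],
}

PUNCTUATION = ".,!?;:\"'"


def tokenize(text: str) -> List[str]:
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (tok.strip(PUNCTUATION) for tok in text.lower().split())
    return [tok for tok in tokens if tok]


def text_to_sequence(
    text: str, embeddings: Dict[str, List[float]], dim: int
) -> List[List[float]]:
    """Map each token to its vector; unknown tokens become zero vectors."""
    return [embeddings.get(tok, [0.0] * dim) for tok in tokenize(text)]


def mean_vector(sequence: List[List[float]], dim: int) -> List[float]:
    """Average the token vectors into one fixed-length vector."""
    if not sequence:
        return [0.0] * dim
    return [sum(vec[i] for vec in sequence) / len(sequence) for i in range(dim)]


sequence = text_to_sequence("Great, another Monday!", TOY_EMBEDDINGS, dim=3)
pooled = mean_vector(sequence, dim=3)
```

Under these assumptions, the per-token `sequence` is the kind of input a sequence model such as a CNN or Bi-LSTM would typically consume, while the averaged `pooled` vector is one common way to feed a conventional classifier.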

Pages: 28
