Identification of Sarcasm in Textual Data: A Comparative Study

Cited by: 0
Authors
Pulkit Mehndiratta
Devpriya Soni
Affiliation
[1] Jaypee Institute of Information Technology
Keywords
DOI
Not available
CLC classification number
Subject classification number
Abstract
Purpose: The ever-increasing penetration of the Internet into our lives has led to an enormous amount of multimedia content being generated online. Textual data accounts for a major share of the data generated on the World Wide Web. Understanding people's sentiment is an important aspect of natural language processing, but the inferred opinion can be biased or incorrect if people use sarcasm while commenting, posting status updates, or reviewing a product or a movie. It is therefore of utmost importance to detect sarcasm correctly and to predict people's intentions accurately.
Design/methodology/approach: This study evaluates various machine learning models along with standard and hybrid deep learning models across several standardized datasets. Text was vectorized using word embedding techniques, converting the textual data into vectors for analysis. We used three standardized datasets available in the public domain and three word embeddings, i.e., Word2Vec, GloVe, and fastText, to validate the hypothesis.
Findings: The results were analyzed and conclusions were drawn. The key finding is that hybrid models combining Bidirectional Long Short-Term Memory (Bi-LSTM) and Convolutional Neural Network (CNN) outperform the other conventional machine learning and deep learning models across all the datasets considered in this study, validating our hypothesis.
Research limitations: Using data from different sources and customizing the models for each dataset slightly decreases the usability of the technique. Overall, however, this methodology provides effective measures to identify the presence of sarcasm, with a minimum average accuracy of 80% or above for one dataset and results better than the current baselines for the other datasets.
Practical implications: The results provide solid insights for system developers to integrate this model into real-time analysis of any review or comment posted in the public domain. The study also has practical implications for businesses that depend on user ratings and public opinion, and it provides a launching platform for researchers working on the problem of sarcasm identification in textual data.
Originality/value: This is a first-of-its-kind study that sets out the difference between conventional and hybrid methods for predicting sarcasm in textual data. The study also provides possible indicators that hybrid models are better when applied to textual data for analysis of sarcasm.
Pages: 28