Sarcasm identification in textual data: systematic review, research challenges and open directions

Cited by: 43
Authors
Eke, Christopher Ifeanyi [1, 2]
Norman, Azah Anir [1]
Shuib, Liyana [1]
Nweke, Henry Friday [1, 3]
Affiliations
[1] Univ Malaya, Dept Informat Syst, Fac Comp Sci & Informat Technol, Kuala Lumpur 50603, Malaysia
[2] Fed Univ, Fac Sci, Dept Comp Sci, PMB 146, Lafia, Nasarawa State, Nigeria
[3] Ebonyi State Univ, Comp Sci Dept, PMB 053, Abakaliki, Ebonyi State, Nigeria
Keywords
Sarcasm identification; Social media data; Natural language processing; Pre-processing; Feature engineering; Textual classification; Performance measure; SOCIAL MEDIA; CLASSIFICATION; SELECTION; TWEETS
DOI
10.1007/s10462-019-09791-8
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Sarcasm is a form of sentiment in which people convey implicit information, usually the opposite of the literal message content, in order to hurt someone emotionally or to criticise something in a humorous way. Sarcasm identification in textual data, one of the hardest challenges in natural language processing (NLP), has recently become an active research area because of its importance in improving the sentiment analysis of social media data. Few studies have carried out a comprehensive literature review of the primary studies on sarcasm identification published within the last 11 years. This study therefore reviews the classification techniques for sarcasm identification with respect to datasets, pre-processing, feature engineering, classification algorithms, and performance metrics. The study considered articles published from 2008 to 2019. Forty (40) academic publications were selected from seven standard academic databases in order to carry out the review and realize its objectives. The study revealed that most researchers created their own datasets, since no standard dataset is available in the domain of sarcasm identification. Context and content-based linguistic features were used in most of the studies. The review shows that n-gram and part-of-speech tagging were the most commonly used feature extraction techniques. Binary representation and term frequency were used for feature representation, whereas chi-squared and information gain were used for feature selection. Moreover, classification algorithms such as support vector machine, Naive Bayes, random forest, maximum entropy, and decision tree were the most frequently applied, with accuracy, precision, recall and F-measure as the performance measures. Finally, research challenges and future directions are summarized. This review highlights the impact of sarcasm identification on building effective product reviews and should serve as a handy resource for researchers and practitioners in sarcasm identification and in text classification in general.
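As a rough illustration of three techniques named in the abstract (binary n-gram representation, chi-squared feature selection, and precision/recall/F-measure), the following Python sketch may help. It is not taken from the reviewed studies: the function names and the four toy sentences with their labels are invented for illustration, and a real study would use a much larger annotated corpus.

```python
def word_ngrams(text, n):
    """Return the set of word n-grams in a text.

    Using a set gives a binary representation: an n-gram is either
    present in the document or absent, regardless of how often it occurs.
    """
    tokens = text.lower().split()
    return {" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def chi_squared(docs, labels, term):
    """Chi-squared statistic of a binary term feature against a binary label.

    docs   : list of n-gram sets, one per document
    labels : list of 0/1 labels (1 = sarcastic in this toy example)
    """
    a = sum(1 for d, y in zip(docs, labels) if term in d and y == 1)
    b = sum(1 for d, y in zip(docs, labels) if term in d and y == 0)
    c = sum(1 for d, y in zip(docs, labels) if term not in d and y == 1)
    e = sum(1 for d, y in zip(docs, labels) if term not in d and y == 0)
    total = a + b + c + e
    denom = (a + b) * (c + e) * (a + c) * (b + e)
    # A zero denominator means the term or the label is constant: no signal.
    return total * (a * e - b * c) ** 2 / denom if denom else 0.0


def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F-measure (F1) for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Invented toy data, for illustration only (1 = sarcastic, 0 = not sarcastic).
texts = [
    "oh great another monday",
    "great food and service",
    "oh great the train is late again",
    "the train was on time",
]
labels = [1, 0, 1, 0]

# Unigrams plus bigrams, binary representation.
docs = [word_ngrams(t, 1) | word_ngrams(t, 2) for t in texts]

# Rank every n-gram by its chi-squared score; higher means more
# strongly associated with one of the two classes.
vocabulary = set().union(*docs)
ranked = sorted(vocabulary, key=lambda g: chi_squared(docs, labels, g),
                reverse=True)
```

In this toy data the bigram "oh great" appears only in the two sarcastic sentences and so receives the maximum score, while "the train" appears once in each class and scores zero. A feature selection step would keep only the top-ranked n-grams before training a classifier.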
Pages: 4215-4258
Page count: 44
Source
Artificial Intelligence Review, 2020, 53: 4215-4258