HiSAT: Hierarchical Framework for Sentiment Analysis on Twitter Data

Cited by: 1
Authors
Kommu, Amrutha [1]
Patel, Snehal [1]
Derosa, Sebastian [1]
Wang, Jiayin [1]
Varde, Aparna S. [1]
Affiliations
[1] Montclair State Univ, Montclair, NJ 07043 USA
Funding
National Science Foundation (USA)
Keywords
Bayesian models; Knowledge discovery; Logistic Regression; NLP; Opinion mining; Random Forest; Social media; Text mining; Emotion recognition features
DOI
10.1007/978-3-031-16072-1_28
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Social media websites such as Twitter have become so indispensable that people use them almost daily to share their emotions, opinions, suggestions and thoughts. Motivated by such behavioral tendencies, this study defines an approach to automatically classify tweets into two main classes, namely hate speech and non-hate speech. This provides a valuable source of information for analyzing and understanding target audiences and spotting marketing trends. We thus propose HiSAT, a Hierarchical framework for Sentiment Analysis on Twitter data. Sentiments and opinions in tweets are highly unstructured and do not have a properly defined sequence. They constitute heterogeneous data from many sources in different formats, and express positive, negative or neutral sentiment. Hence, in HiSAT we conduct Natural Language Processing encompassing tokenization, stemming and lemmatization techniques that convert text to tokens, as well as Bag-of-Words (BoW) and Term Frequency-Inverse Document Frequency (TF-IDF) techniques that convert text sentences into numeric vectors. These are then fed as inputs to machine learning algorithms within the HiSAT framework; more specifically, Random Forest, Logistic Regression and Naive Bayes are used as binary text classifiers to distinguish hate speech from non-hate speech in the tweets. Results of experiments performed with the HiSAT framework show that Random Forest outperforms the others in predicting the correct labels (with accuracy above 95%). We present the HiSAT approach, its implementation and experiments, along with related work and ongoing research.
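As a rough illustration of the vectorization step named in the abstract, the sketch below turns a few short texts into Bag-of-Words counts and TF-IDF weights using only the Python standard library. It is not the authors' implementation: the example texts are invented, the function names are hypothetical, stemming and lemmatization are omitted, and the unsmoothed weighting tf × log(N/df) is an assumed variant, since the abstract does not say which TF-IDF formula or library HiSAT uses.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (word.strip(".,!?;:\"'()") for word in text.lower().split())
    return [t for t in tokens if t]


def bag_of_words(docs):
    """Return a sorted vocabulary and one raw count vector per document."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tf_idf(vectors):
    """Weight counts by relative term frequency times log(N / document frequency).

    Assumed variant: no smoothing and no normalization of the result.
    """
    n_docs = len(vectors)
    n_terms = len(vectors[0]) if vectors else 0
    # Document frequency: number of documents containing each term.
    df = [sum(1 for v in vectors if v[j] > 0) for j in range(n_terms)]
    idf = [math.log(n_docs / d) for d in df]
    weighted = []
    for v in vectors:
        total = sum(v)
        weighted.append(
            [(c / total) * w if total else 0.0 for c, w in zip(v, idf)]
        )
    return weighted


# Invented toy texts, used only to exercise the functions above.
docs = ["I love this phone", "I hate this phone", "love love love"]
vocab, bow = bag_of_words(docs)
weights = tf_idf(bow)
```

Vectors like `weights` are the kind of numeric input the abstract describes feeding to Random Forest, Logistic Regression and Naive Bayes classifiers; a term that appears in only one text, such as "hate" here, receives a higher inverse document frequency than terms shared across texts.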
Pages: 376-392
Page count: 17
Related papers
  • [1] Sentiment Analysis Framework of Twitter Data Using Classification
    Khurana, Medha
    Gulati, Anurag
    Singh, Saurabh
    [J]. 2018 FIFTH INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED AND GRID COMPUTING (IEEE PDGC), 2018, : 459 - 464
  • [2] Sentiment Analysis on Twitter Data using Apache Spark Framework
    Elzayady, Hossam
    Badran, Khaled M.
    Salama, Gouda I.
    [J]. PROCEEDINGS OF 2018 13TH INTERNATIONAL CONFERENCE ON COMPUTER ENGINEERING AND SYSTEMS (ICCES), 2018, : 171 - 176
  • [3] Sentiment Analysis of Twitter Data
    Desai, Radhi D.
    [J]. PROCEEDINGS OF THE 2018 SECOND INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND CONTROL SYSTEMS (ICICCS), 2018, : 114 - 117
  • [4] Mapreduce framework based sentiment analysis of twitter data using hierarchical attention network with chronological leader algorithm
    Jagdale, Jayashree
    Sreemathy, R.
    Jagdale, Balaso
    Ghag, Kranti
    [J]. SOCIAL NETWORK ANALYSIS AND MINING, 2024, 14 (01)
  • [5] Sentiment Analysis of Twitter Data
    Wang, Yili
    Guo, Jiaxuan
    Yuan, Chengsheng
    Li, Baozhu
    [J]. APPLIED SCIENCES-BASEL, 2022, 12 (22):
  • [6] Sentiment Analysis of Twitter Data
    El Rahman, Sahar A.
    AlOtaibi, Feddah Alhumaidi
    AlShehri, Wejdan Abdullah
    [J]. 2019 INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION SCIENCES (ICCIS), 2019, : 336 - 339
  • [7] Sentiment Analysis of Big Data Applications using Twitter Data with the Help of HADOOP Framework
    Sehgal, Divya
    Agarwal, Ambuj Kumar
    [J]. PROCEEDINGS OF THE 5TH INTERNATIONAL CONFERENCE ON SYSTEM MODELING & ADVANCEMENT IN RESEARCH TRENDS (SMART-2016), 2016, : 251 - 255
  • [8] Clustering and Sentiment Analysis on Twitter Data
    Ahuja, Shreya
    Dubey, Gaurav
    [J]. 2017 2ND INTERNATIONAL CONFERENCE ON TELECOMMUNICATION AND NETWORKS (TEL-NET), 2017, : 420 - 424
  • [9] Sentiment Analysis of Turkish Twitter Data
    Shehu, Harisu Abdullahi
    Tokat, Sezai
    Sharif, Md. Haidar
    Uyaver, Sahin
    [J]. THIRD INTERNATIONAL CONFERENCE OF MATHEMATICAL SCIENCES (ICMS 2019), 2019, 2183
  • [10] Sentiment analysis of multimodal twitter data
    Kumar, Akshi
    Garg, Geetanjali
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2019, 78 (17) : 24103 - 24119