HiSAT: Hierarchical Framework for Sentiment Analysis on Twitter Data

Cited by: 1
Authors
Kommu, Amrutha [1]
Patel, Snehal [1]
Derosa, Sebastian [1]
Wang, Jiayin [1]
Varde, Aparna S. [1]
Affiliations
[1] Montclair State Univ, Montclair, NJ 07043 USA
Funding
National Science Foundation (USA)
Keywords
Bayesian models; Knowledge discovery; Logistic Regression; NLP; Opinion mining; Random Forest; Social media; Text mining; EMOTION RECOGNITION FEATURES;
DOI
10.1007/978-3-031-16072-1_28
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Social media websites such as Twitter have become so indispensable that people use them almost daily to share their emotions, opinions, suggestions and thoughts. Motivated by such behavioral tendencies, this study defines an approach to automatically classify tweets into two main classes, namely, hate speech and non-hate speech. Such classification provides a valuable source of information for analyzing and understanding target audiences and spotting marketing trends. We thus propose HiSAT, a Hierarchical framework for Sentiment Analysis on Twitter data. Sentiments and opinions in tweets are highly unstructured and do not have a properly defined sequence. They constitute heterogeneous data from many sources in different formats, and express positive, negative or neutral sentiment. Hence, in HiSAT we conduct Natural Language Processing encompassing tokenization, stemming and lemmatization techniques that convert text to tokens, as well as Bag-of-Words (BoW) and Term Frequency-Inverse Document Frequency (TF-IDF) techniques that convert text sentences into numeric vectors. These vectors are then fed as inputs to machine learning algorithms within the HiSAT framework; more specifically, Random Forest, Logistic Regression and Naive Bayes are used as binary text classifiers to detect hate speech and non-hate speech in the tweets. Experiments performed with the HiSAT framework show that Random Forest outperforms the others in predicting the correct labels (with accuracy above 95%). We present the HiSAT approach, its implementation and experiments, along with related work and ongoing research.
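The pipeline described in the abstract (tokens, then BoW or TF-IDF vectors, then a binary classifier) can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the abstract does not name the libraries or parameters they used. The function names, the toy tweets and their labels are all hypothetical. Stemming, lemmatization, Random Forest and Logistic Regression are left out for brevity, and Naive Bayes is shown as one of the three classifiers the abstract lists.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a tweet and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def bag_of_words(docs):
    """Return (vocabulary, count vectors) for a list of token lists."""
    vocab = sorted({tok for doc in docs for tok in doc})
    index = {tok: i for i, tok in enumerate(vocab)}
    vectors = []
    for doc in docs:
        vec = [0] * len(vocab)
        for tok, n in Counter(doc).items():
            vec[index[tok]] = n
        vectors.append(vec)
    return vocab, vectors


def tf_idf(count_vectors):
    """Reweight count vectors as tf * idf, with idf = ln(N / df)."""
    n_docs = len(count_vectors)
    n_terms = len(count_vectors[0])
    df = [sum(1 for vec in count_vectors if vec[j] > 0) for j in range(n_terms)]
    idf = [math.log(n_docs / d) for d in df]
    weighted = []
    for vec in count_vectors:
        total = sum(vec) or 1  # guard against an empty document
        weighted.append([(c / total) * w for c, w in zip(vec, idf)])
    return weighted


def vectorize(tokens, vocab):
    """Count vector for new text; tokens outside the vocabulary are ignored."""
    counts = Counter(tokens)
    return [counts.get(tok, 0) for tok in vocab]


def train_naive_bayes(count_vectors, labels):
    """Fit multinomial Naive Bayes with add-one (Laplace) smoothing."""
    n_terms = len(count_vectors[0])
    model = {}
    for label in set(labels):
        rows = [v for v, y in zip(count_vectors, labels) if y == label]
        term_totals = [sum(col) for col in zip(*rows)]
        denom = sum(term_totals) + n_terms
        log_prior = math.log(len(rows) / len(labels))
        log_probs = [math.log((t + 1) / denom) for t in term_totals]
        model[label] = (log_prior, log_probs)
    return model


def predict(model, vec):
    """Return the label with the highest log-posterior score."""
    def score(label):
        log_prior, log_probs = model[label]
        return log_prior + sum(c * lp for c, lp in zip(vec, log_probs))
    return max(model, key=score)


# Hypothetical toy data: 1 = hate speech, 0 = non-hate speech.
tweets = [
    "I hate those people they are awful",
    "what a lovely sunny day",
    "those people are awful and stupid",
    "lovely day with lovely friends",
]
labels = [1, 0, 1, 0]

vocab, counts = bag_of_words([tokenize(t) for t in tweets])
weights = tf_idf(counts)  # TF-IDF alternative to the raw counts
model = train_naive_bayes(counts, labels)
print(predict(model, vectorize(tokenize("awful stupid people"), vocab)))
```

The classifier here is trained on raw counts because multinomial Naive Bayes is defined over term counts; the TF-IDF vectors are computed only to show the second vectorization the abstract mentions.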
Pages: 376-392
Page count: 17
Related papers
50 in total
  • [31] Sentiment Analysis on COVID-19 Twitter Data
    Vijay, Tanmay
    Chawla, Ayan
    Dhanka, Balan
    Karmakar, Purnendu
    [J]. 2020 5TH IEEE INTERNATIONAL CONFERENCE ON RECENT ADVANCES AND INNOVATIONS IN ENGINEERING (IEEE - ICRAIE-2020), 2020,
  • [32] Collection and Sentiment Analysis of Twitter Data on the Political Atmosphere
    Cisija, Merima
    Zunic, Emir
    Donko, Dzenana
    [J]. 2018 14TH SYMPOSIUM ON NEURAL NETWORKS AND APPLICATIONS (NEUREL), 2018,
  • [33] Sentiment Analysis of Twitter Data about Blockchain Technology
    Rocha, Rayana Souza
    Saraiva, Lohanna Aires
    de Castro, Angelica Felix
    Silva, Patricio de Alencar
    [J]. PROCEEDINGS OF THE 10TH EURO-AMERICAN CONFERENCE ON TELEMATICS AND INFORMATION SYSTEMS (EATIS 2020), 2020,
  • [34] CORRELATION ANALYSIS OF USER INFLUENCE AND SENTIMENT ON TWITTER DATA
    Hanif, Fadhli Mubarak bin Naina
    Saptawati, G. A. Putri
    [J]. 2014 INTERNATIONAL CONFERENCE ON DATA AND SOFTWARE ENGINEERING (ICODSE), 2014,
  • [35] Sentiment Analysis of Twitter Data based on Ordinal Classification
    Elbagir, Shihab
    Yang, Jing
    [J]. 2018 INTERNATIONAL CONFERENCE ON ALGORITHMS, COMPUTING AND ARTIFICIAL INTELLIGENCE (ACAI 2018), 2018,
  • [36] Sentiment analysis on Twitter data towards climate action
    Rosenberg, Emelie
    Tarazona, Carlota
    Mallor, Fermin
    Eivazi, Hamidreza
    Pastor-Escuredo, David
    Fuso-Nerini, Francesco
    Vinuesa, Ricardo
    [J]. RESULTS IN ENGINEERING, 2023, 19
  • [37] Sentiment Analysis of Twitter Data in Online Social Network
    Dhawan, Sanjeev
    Singh, Kulvinder
    Chauhan, Priyanka
    [J]. PROCEEDINGS OF 2019 5TH IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, COMPUTING AND CONTROL (ISPCC 2K19), 2019, : 255 - 259
  • [38] A Topic based Approach for Sentiment Analysis on Twitter Data
    Ficamos, Pierre
    Liu, Yan
    [J]. INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2016, 7 (12) : 201 - 205
  • [39] Opinion Mining and Sentiment Analysis on a Twitter Data Stream
    Gokulakrishnan, Balakrishnan
    Priyanthan, Pavalanathan
    Ragavan, Thiruchittampalam
    Prasath, Nadarajah
    Perera, A. Shehan
    [J]. INTERNATIONAL CONFERENCE ON ADVANCES IN ICT FOR EMERGING REGIONS (ICTER2012), 2012, : 182 - 188
  • [40] An Apache Spark Implementation for Sentiment Analysis on Twitter Data
    Baltas, Alexandros
    Kanavos, Andreas
    Tsakalidis, Athanasios K.
    [J]. ALGORITHMIC ASPECTS OF CLOUD COMPUTING, ALGOCLOUD 2016, 2017, 10230 : 15 - 25