HiSAT: Hierarchical Framework for Sentiment Analysis on Twitter Data

Cited by: 1

Authors:

Kommu, Amrutha [1]

Patel, Snehal [1]

Derosa, Sebastian [1]

Wang, Jiayin [1]

Varde, Aparna S. [1]

Affiliation:

[1] Montclair State Univ, Montclair, NJ 07043 USA

Source:

INTELLIGENT SYSTEMS AND APPLICATIONS, VOL 1 | 2023 / Vol. 542

Funding:

U.S. National Science Foundation

Keywords:

Bayesian models; Knowledge discovery; Logistic Regression; NLP; Opinion mining; Random Forest; Social media; Text mining; EMOTION RECOGNITION FEATURES;

DOI:

10.1007/978-3-031-16072-1_28

Chinese Library Classification number:

TP18 [Artificial Intelligence Theory]

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405

Abstract:

Social media websites such as Twitter have become so indispensable that people use them almost daily to share their emotions, opinions, suggestions and thoughts. Motivated by such behavioral tendencies, the purpose of this study is to define an approach that automatically classifies tweets into two main classes, namely hate speech and non-hate speech. This provides a valuable source of information for analyzing and understanding target audiences and spotting marketing trends. We thus propose HiSAT, a Hierarchical framework for Sentiment Analysis on Twitter data. Sentiments and opinions in tweets are highly unstructured and do not follow a properly defined sequence. They constitute heterogeneous data from many sources in different formats, and express positive, negative, or neutral sentiment. Hence, in HiSAT we conduct Natural Language Processing encompassing tokenization, stemming and lemmatization techniques that convert text to tokens, as well as Bag-of-Words (BoW) and Term Frequency-Inverse Document Frequency (TF-IDF) techniques that convert text sentences into numeric vectors. These vectors are then fed as inputs to machine learning algorithms within the HiSAT framework; more specifically, Random Forest, Logistic Regression and Naive Bayes are used as binary text classifiers to separate hate speech from non-hate speech in the tweets. Results of experiments performed with the HiSAT framework show that Random Forest outperforms the others in predicting the correct labels (with accuracy above 95%). We present the HiSAT approach, its implementation and experiments, along with related work and ongoing research.
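
The vectorization step named in the abstract (tokens to BoW and TF-IDF vectors) can be illustrated with a minimal sketch. This is not the authors' implementation, which this record does not show. It assumes a simple lowercase regex tokenizer, the unsmoothed IDF variant log(N / df), and three invented example tweets; stemming, lemmatization, and the classifiers are omitted, and library implementations typically use smoothed IDF variants that give different numbers.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a tweet and split it into word tokens (assumed, deliberately simple rule)."""
    return re.findall(r"[a-z']+", text.lower())


def build_vocabulary(token_lists):
    """Return the sorted distinct tokens; each term gets one fixed vector position."""
    return sorted({tok for toks in token_lists for tok in toks})


def bow_vector(tokens, vocabulary):
    """Bag-of-Words: raw count of each vocabulary term in one document."""
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]


def tfidf_vectors(token_lists, vocabulary):
    """TF-IDF with tf = count / document length and idf = log(N / df) (unsmoothed)."""
    n_docs = len(token_lists)
    # Document frequency: number of documents containing each term at least once.
    df = Counter(tok for toks in token_lists for tok in set(toks))
    idf = {term: math.log(n_docs / df[term]) for term in vocabulary}
    vectors = []
    for toks in token_lists:
        counts = Counter(toks)
        total = len(toks) or 1  # guard against empty documents
        vectors.append([counts[term] / total * idf[term] for term in vocabulary])
    return vectors


# Hypothetical tweets, invented for illustration only.
tweets = [
    "I love this phone",
    "I hate this phone",
    "love love love",
]

token_lists = [tokenize(t) for t in tweets]
vocabulary = build_vocabulary(token_lists)
bow = [bow_vector(toks, vocabulary) for toks in token_lists]
tfidf = tfidf_vectors(token_lists, vocabulary)

for tweet, counts, weights in zip(tweets, bow, tfidf):
    print(tweet, counts, [round(w, 3) for w in weights])
```

In a pipeline of the kind the abstract describes, vectors like these would be the numeric inputs passed to a classifier. TF-IDF down-weights terms that appear in many tweets, so a term that is rare across the collection contributes more per occurrence than a widely shared one.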

Pages: 376-392

Page count: 17

50 items in total

[21] A Review of Techniques for Sentiment Analysis Of Twitter Data
Bhuta, Sagar
Doshi, Avit
Doshi, Uehit
Narvekar, Meera
[J]. PROCEEDINGS OF THE 2014 INTERNATIONAL CONFERENCE ON ISSUES AND CHALLENGES IN INTELLIGENT COMPUTING TECHNIQUES (ICICT), 2014, : 583 - 591
[22] Interdisciplinary optimism? Sentiment analysis of Twitter data
Weber, Charlotte Teresa
Syed, Shaheen
[J]. ROYAL SOCIETY OPEN SCIENCE, 2019, 6 (07):
[23] A study on sentiment analysis techniques of Twitter data
Alsaeedi, Abdullah
Khan, Mohammad Zubair
[J]. International Journal of Advanced Computer Science and Applications, 2019, 10 (02): : 361 - 374
[24] Event Based Sentiment Analysis of Twitter Data
Patil, Mamta
Chavan, H. K.
[J]. PROCEEDINGS OF THE 2ND INTERNATIONAL CONFERENCE ON COMPUTING METHODOLOGIES AND COMMUNICATION (ICCMC 2018), 2018, : 1041 - 1054
[25] Bat Inspired Sentiment Analysis of Twitter Data
Khurana, Himja
Sahu, Sanjib Kumar
[J]. PROGRESS IN ADVANCED COMPUTING AND INTELLIGENT ENGINEERING, VOL 2, 2018, 564 : 639 - 650
[26] Sentiment mapping: point pattern analysis of sentiment classified Twitter data
Camacho, Ken
Portelli, Raechel
Shortridge, Ashton
Takahashi, Bruno
[J]. CARTOGRAPHY AND GEOGRAPHIC INFORMATION SCIENCE, 2021, 48 (03) : 241 - 257
[27] Sentiment Analysis on COVID-19 Twitter Data: A Sentiment Timeline
Karagkiozidou, Makrina
Koukaras, Paraskevas
Tjortjis, Christos
[J]. ARTIFICIAL INTELLIGENCE APPLICATIONS AND INNOVATIONS, AIAI 2022, PART II, 2022, 647 : 350 - 359
[28] A Framework for Sentiment Analysis Implementation of Indonesian Language Tweet on Twitter
Asniar
Aditya, B. R.
[J]. 1ST INTERNATIONAL CONFERENCE ON COMPUTING AND APPLIED INFORMATICS 2016 : APPLIED INFORMATICS TOWARD SMART ENVIRONMENT, PEOPLE, AND SOCIETY, 2017, 801
[29] On Stopwords, Filtering and Data Sparsity for Sentiment Analysis of Twitter
Saif, Hassan
Fernandez, Miriam
He, Yulan
Alani, Harith
[J]. LREC 2014 - NINTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2014, : 810 - 817
[30] Sentiment Analysis on Automobile Brands Using Twitter Data
Asghar, Zain
Ali, Tahir
Ahmad, Imran
Tharanidharan, Sridevi
Nazar, Shamim Kamal Abdul
Kamal, Shahid
[J]. INTELLIGENT TECHNOLOGIES AND APPLICATIONS, INTAP 2018, 2019, 932 : 76 - 85
