HiSAT: Hierarchical Framework for Sentiment Analysis on Twitter Data

Cited by: 1
Authors
Kommu, Amrutha [1]
Patel, Snehal [1]
Derosa, Sebastian [1]
Wang, Jiayin [1]
Varde, Aparna S. [1]
Affiliations
[1] Montclair State Univ, Montclair, NJ 07043 USA
Funding
National Science Foundation (USA)
Keywords
Bayesian models; Knowledge discovery; Logistic Regression; NLP; Opinion mining; Random Forest; Social media; Text mining; EMOTION RECOGNITION FEATURES;
DOI
10.1007/978-3-031-16072-1_28
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Social media websites such as Twitter have become so indispensable that people use them almost daily to share their emotions, opinions, suggestions and thoughts. Motivated by this behavior, the purpose of this study is to define an approach that automatically classifies tweets into two main classes, namely hate speech and non-hate speech. Such a classification provides a valuable source of information for analyzing and understanding target audiences and spotting marketing trends. We thus propose HiSAT, a Hierarchical framework for Sentiment Analysis on Twitter data. Sentiments and opinions in tweets are highly unstructured and do not follow a properly defined sequence. They constitute heterogeneous data from many sources in different formats, and express positive, negative, or neutral sentiment. Hence, in HiSAT we conduct Natural Language Processing encompassing tokenization, stemming and lemmatization techniques that convert text to tokens, as well as Bag-of-Words (BoW) and Term Frequency-Inverse Document Frequency (TF-IDF) techniques that convert text sentences into numeric vectors. These vectors are then fed as inputs to machine learning algorithms within the HiSAT framework; more specifically, Random Forest, Logistic Regression and Naive Bayes are used as binary text classifiers to separate hate speech from non-hate speech in the tweets. Results of experiments performed with the HiSAT framework show that Random Forest outperforms the other two classifiers in predicting the correct labels, with accuracy above 95%. We present the HiSAT approach, its implementation and experiments, along with related work and ongoing research.
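The abstract names the pipeline stages but gives no implementation details. The following is a minimal sketch of the vectorization and classification steps, not the authors' code. It assumes made-up toy tweets, a plain regex tokenizer (no stemming or lemmatization), the common TF-IDF variant idf = ln(N / df), and a multinomial Naive Bayes with add-one smoothing. All function names are hypothetical, and only the Python standard library is used.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a tweet and split it into word tokens (no stemming here)."""
    return re.findall(r"[a-z']+", text.lower())


def bag_of_words(docs):
    """Build a sorted vocabulary and one count vector per tokenized document."""
    vocab = sorted({tok for doc in docs for tok in doc})
    return vocab, [vectorize(doc, vocab) for doc in docs]


def vectorize(tokens, vocab):
    """Count how often each vocabulary term occurs in one token list."""
    counts = Counter(tokens)
    return [counts[term] for term in vocab]


def tf_idf(vectors):
    """Reweight count vectors as tf * idf, with idf = ln(N / df) (an assumed variant)."""
    n_docs = len(vectors)
    df = [sum(1 for v in vectors if v[j] > 0) for j in range(len(vectors[0]))]
    idf = [math.log(n_docs / d) for d in df]
    weighted = []
    for v in vectors:
        total = sum(v) or 1  # guard against an empty document
        weighted.append([(c / total) * w for c, w in zip(v, idf)])
    return weighted


def train_naive_bayes(vectors, labels):
    """Fit a multinomial Naive Bayes model with add-one (Laplace) smoothing."""
    n_terms = len(vectors[0])
    model = {}
    for c in sorted(set(labels)):
        rows = [v for v, y in zip(vectors, labels) if y == c]
        term_totals = [sum(col) for col in zip(*rows)]
        denom = sum(term_totals) + n_terms
        model[c] = {
            "log_prior": math.log(len(rows) / len(vectors)),
            "log_likelihood": [math.log((t + 1) / denom) for t in term_totals],
        }
    return model


def predict(model, vector):
    """Return the class with the highest log-posterior score for a count vector."""
    scores = {
        c: p["log_prior"] + sum(x * l for x, l in zip(vector, p["log_likelihood"]))
        for c, p in model.items()
    }
    return max(scores, key=scores.get)


# Toy data invented for illustration: 1 = abusive, 0 = not abusive.
tweets = [
    "what a lovely sunny day",
    "i love this new song",
    "you people are awful and stupid",
    "those people are stupid and worthless",
]
labels = [0, 0, 1, 1]

docs = [tokenize(t) for t in tweets]
vocab, counts = bag_of_words(docs)
weights = tf_idf(counts)            # TF-IDF view of the same documents
model = train_naive_bayes(counts, labels)
```

A new tweet is classified by tokenizing it, mapping it onto the training vocabulary, and scoring it, for example `predict(model, vectorize(tokenize("stupid awful people"), vocab))`. In this sketch the Naive Bayes model is trained on raw counts; the TF-IDF vectors are the kind of input one would pass to a classifier such as Logistic Regression or Random Forest from an external library.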
Pages: 376-392
Number of pages: 17