Sentiment analysis using semantic similarity and Hadoop MapReduce

Cited by: 18
Authors
Madani, Youness [1]
Erritali, Mohammed [1]
Bengourram, Jamaa [1]
Affiliations
[1] Sultan Moulay Slimane Univ, Dept Comp Sci, Fac Sci & Tech, Beni Mellal, Morocco
Keywords
Opinion mining; Sentiment analysis; Semantic similarity; WordNet; Big data; Hadoop
DOI
10.1007/s10115-018-1212-z
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Sentiment analysis, or opinion mining, is a field that analyses people's opinions, sentiments, evaluations, attitudes, and emotions expressed in written language. It has become a very active area of scientific research in recent years, especially with the growth of social networks such as Facebook and Twitter. In this paper we propose two new approaches for classifying tweets (that is, identifying the feeling expressed in a tweet): the first uses three classes (negative, positive, or neutral) and the second uses two classes (negative or positive). The first method computes the semantic similarity between the tweet to classify and three documents, each of which represents a class (it contains the words that represent that class); the tweet is then assigned the class of the document with which it has the greatest semantic similarity. The second method computes the semantic similarity between each word of the tweet to classify and the words "positive" and "negative", using a new formula that we propose. We perform the analysis in a parallel and distributed way, using the Hadoop framework with the Hadoop Distributed File System (HDFS) and the MapReduce programming model, to address the computation time of the analysis when the dataset of tweets is very large. The aim of our work is to combine several domains: information retrieval, semantic similarity, opinion mining (sentiment analysis), and big data.
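The first method in the abstract, together with its map/reduce framing, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The class documents, the function names, the averaging rule in `doc_similarity`, and the neutral fallback are all hypothetical. The word-level similarity is a plain exact-match placeholder standing in for a WordNet-based measure. The map and reduce steps run in a single process only to show the shape of the computation, not an actual Hadoop job.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

# Hypothetical class documents: a few words standing in for each class.
CLASS_DOCS: Dict[str, List[str]] = {
    "positive": ["good", "great", "love", "happy"],
    "negative": ["bad", "awful", "hate", "sad"],
    "neutral": ["today", "report", "meeting", "schedule"],
}

WordSim = Callable[[str, str], float]


def exact_match_sim(a: str, b: str) -> float:
    """Placeholder similarity: 1.0 for identical words, else 0.0.

    Assumption: a WordNet-based measure would be plugged in here instead.
    """
    return 1.0 if a == b else 0.0


def doc_similarity(tweet_words: List[str], doc_words: List[str],
                   word_sim: WordSim = exact_match_sim) -> float:
    """Average, over tweet words, of the best similarity to any document word.

    Assumption: this aggregation is illustrative, not the paper's formula.
    """
    if not tweet_words or not doc_words:
        return 0.0
    best = [max(word_sim(w, d) for d in doc_words) for w in tweet_words]
    return sum(best) / len(best)


def classify(tweet: str, word_sim: WordSim = exact_match_sim) -> str:
    """Assign the class whose document is most similar to the tweet."""
    words = tweet.lower().split()
    scores = {label: doc_similarity(words, doc, word_sim)
              for label, doc in CLASS_DOCS.items()}
    # Assumption: fall back to neutral when no document matches at all.
    if max(scores.values()) == 0.0:
        return "neutral"
    return max(scores, key=scores.get)


def map_phase(tweets: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Map step: emit one (class, 1) pair per tweet."""
    for tweet in tweets:
        yield classify(tweet), 1


def reduce_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Reduce step: sum the counts emitted for each class."""
    totals: Counter = Counter()
    for label, count in pairs:
        totals[label] += count
    return dict(totals)


if __name__ == "__main__":
    sample = ["I love this great phone", "awful sad day", "meeting schedule today"]
    print(reduce_phase(map_phase(sample)))
```

Because each tweet is classified independently, the map step can be split across workers without coordination, which is the property that makes the task suit MapReduce when the tweet dataset is very large.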
Pages: 413-436
Number of pages: 24
Related papers
50 in total
  • [1] Sentiment analysis using semantic similarity and Hadoop MapReduce
    Youness Madani
    Mohammed Erritali
    Jamaa Bengourram
    [J]. Knowledge and Information Systems, 2019, 59 : 413 - 436
  • [2] Data Analysis using Hadoop MapReduce Environment
    Merla, PrathyushaRani
    Liang, Yiheng
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2017, : 4783 - 4785
  • [3] Sentiment Analysis Using Machine Learning and Deep Learning on Covid 19 Vaccine Twitter Data with Hadoop MapReduce
    Kul, Seda
    Sayar, Ahmet
    [J]. 6TH INTERNATIONAL CONFERENCE ON SMART CITY APPLICATIONS, 2022, 393 : 859 - 868
  • [4] An approach for MapReduce based Log analysis using Hadoop
    Hingave, Hemant
    Ingle, Rasika
    [J]. 2015 2ND INTERNATIONAL CONFERENCE ON ELECTRONICS AND COMMUNICATION SYSTEMS (ICECS), 2015, : 1264 - 1268
  • [5] MapReduce Based Analysis of Sample Applications Using Hadoop
    Ghazi, Mohd Rehan
    Raghava, N. S.
    [J]. APPLICATIONS OF COMPUTING AND COMMUNICATION TECHNOLOGIES, ICACCT 2018, 2018, 899 : 34 - 44
  • [6] Semantic Similarity and Sentiment Analysis of Short Texts in Serbian
    Batanovic, Vuk
    [J]. 2021 29TH TELECOMMUNICATIONS FORUM (TELFOR), 2021,
  • [7] Combining Knowledge Graphs with Semantic Similarity Metrics for Sentiment Analysis
    Swedrak, Piotr
    Adrian, Weronika T.
    Kluza, Krzysztof
    [J]. KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, PT I, 2022, 13368 : 489 - 501
  • [8] Analyzing Adverbs Impact for Sentiment analysis using Hadoop
    Zafar, Lubna
    Ahmed, Ibrar
    Aleem, Muhammad
    Islam, Muhammad Arshad
    Iqbal, Muhammad Azhar
    [J]. 2017 13TH INTERNATIONAL CONFERENCE ON EMERGING TECHNOLOGIES (ICET 2017), 2017,
  • [9] SmartGrids: MapReduce Framework using Hadoop
    Fanibhare, Vaibhav
    Dahake, Vijay
    [J]. 2016 3RD INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND INTEGRATED NETWORKS (SPIN), 2016, : 406 - 411
  • [10] Twitter Sentiment Analysis in Healthcare using Hadoop and R
    Gupta, Vijay Shankar
    Kohli, Shruti
    [J]. PROCEEDINGS OF THE 10TH INDIACOM - 2016 3RD INTERNATIONAL CONFERENCE ON COMPUTING FOR SUSTAINABLE GLOBAL DEVELOPMENT, 2016, : 3766 - 3772