Sentiment Analysis using Unlabeled Email data

Authors
Ali, Rayan Salah Hag [1]
El Gayar, Neamat [1]
Affiliations
[1] Heriot Watt Univ, Sch Math & Comp Sci, Dubai, U Arab Emirates
Keywords
Sentiment analysis; k-means; TF-IDF; support vector machine
DOI
10.1109/iccike47802.2019.9004372
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Sentiment analysis (SA), in the context of text mining, is an automated process for detecting subjective information such as opinions, attitudes, emotions, and feelings. Most prior work in SA treats it as a text classification problem, which requires labeled data to train the model. However, labeled datasets are hard to obtain, and in most cases the labeling has to be done by hand. A further issue is the lack of portability across domains, which makes it hard to reuse the same labeled data in different applications, so labeled data must be created manually for each domain. In this paper, we apply sentiment analysis to the Enron email dataset. The work aims to find the best techniques for labeling the dataset automatically, avoiding manual labeling. The automatically labeled data is then used as training data to build a classifier with a supervised machine learning algorithm. In the labeling phase, we compare lexicon-based labeling with k-means labeling. Lexicon labeling gave better and more reliable results, so we used the lexicon-labeled dataset to train the classifier. We used TF-IDF for feature extraction and trained Naive Bayes and support vector machine (SVM) classifiers.
Pages: 329-334
Number of pages: 6
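
The pipeline outlined in the abstract (automatic lexicon labeling, TF-IDF features, a supervised classifier) can be illustrated with a minimal sketch. This is not the authors' implementation: the tiny word lists, the toy emails, the three-way positive/negative/neutral labels, and all function names are hypothetical and chosen only for illustration. Only a Naive Bayes classifier is shown; the k-means labeling and the SVM are omitted. Feeding TF-IDF weights into multinomial Naive Bayes as fractional counts is a common heuristic, and it is assumed here rather than taken from the paper.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy lexicon; a real run would use a published sentiment lexicon.
POSITIVE = {"great", "thanks", "good", "pleased", "happy"}
NEGATIVE = {"bad", "problem", "angry", "delay", "unhappy"}
PUNCT = ".,!?;:"


def tokenize(text):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    return [w.strip(PUNCT).lower() for w in text.split() if w.strip(PUNCT)]


def lexicon_label(text):
    """Label a text by counting lexicon hits; ties are labeled 'neutral'."""
    tokens = tokenize(text)
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def fit_idf(docs):
    """Smoothed inverse document frequency: log((1 + N) / (1 + df)) + 1."""
    n = len(docs)
    df = Counter(t for doc in docs for t in set(tokenize(doc)))
    return {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}


def tfidf(doc, idf):
    """Raw term frequency times IDF; out-of-vocabulary terms are dropped."""
    tf = Counter(tokenize(doc))
    return {t: c * idf[t] for t, c in tf.items() if t in idf}


def train_nb(vectors, labels, vocab):
    """Multinomial Naive Bayes over TF-IDF weights with add-one smoothing."""
    vocab = list(vocab)
    class_counts = Counter(labels)
    weight_sums = defaultdict(Counter)
    for vec, y in zip(vectors, labels):
        weight_sums[y].update(vec)  # adds each term's weight to the class total
    model = {}
    for y, n_y in class_counts.items():
        total = sum(weight_sums[y].values()) + len(vocab)
        log_like = {t: math.log((weight_sums[y][t] + 1) / total) for t in vocab}
        model[y] = (math.log(n_y / len(labels)), log_like)
    return model


def predict(model, vec):
    """Return the class with the highest log-posterior score."""
    def score(y):
        prior, log_like = model[y]
        return prior + sum(w * log_like[t] for t, w in vec.items())
    return max(model, key=score)


# Toy stand-ins for unlabeled emails (not taken from the Enron dataset).
emails = [
    "Thanks for the great work on the report",
    "Good news, the client is pleased",
    "There is a problem with the delay in shipping",
    "I am angry about this bad outcome",
]
labels = [lexicon_label(e) for e in emails]  # automatic labels, no manual step
idf = fit_idf(emails)
vectors = [tfidf(e, idf) for e in emails]
model = train_nb(vectors, labels, vocab=idf.keys())
prediction = predict(model, tfidf("great news, thanks", idf))
```

The point of the sketch is that the classifier never sees a hand-assigned label: `labels` comes entirely from the lexicon, so the quality of the trained model is bounded by the quality of that automatic labeling step.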