Sentiment Analysis using Unlabeled Email data

Cited by: 0
Authors
Ali, Rayan Salah Hag [1]
El Gayar, Neamat [1]
Affiliations
[1] Heriot Watt Univ, Sch Math & Comp Sci, Dubai, U Arab Emirates
Keywords
Sentiment analysis; k-means; TFIDF; support vector machine
DOI
10.1109/iccike47802.2019.9004372
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Sentiment analysis (SA), in the context of text mining, is an automated process for detecting subjective information such as opinions, attitudes, emotions, and feelings. Most prior work treats SA as a text classification problem, which requires labeled data to train the model. However, labeled datasets are hard to obtain, and in most cases the labeling must be done by hand. In addition, the lack of portability across domains makes it hard to reuse the same labeled data in different applications, so labeled data must be created manually for each domain. In this paper, we apply sentiment analysis to the Enron email dataset. The work aims to find the best techniques for labeling the dataset automatically, avoiding manual labeling. The resulting training data is used to build a classifier with a supervised machine learning algorithm. In the labeling phase, we compare lexicon-based labeling with k-means labeling. Lexicon labeling gave better and more reliable results, so we used the lexicon-labeled dataset to train the classifier. We used TF-IDF for feature extraction and trained Naive Bayes and support vector machine (SVM) classifiers.
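The pipeline the abstract describes (label automatically with a lexicon, extract TF-IDF features, train a classifier) can be illustrated with a minimal standard-library sketch. This is not the authors' implementation. The tiny word lists, the three-way `pos`/`neg`/`neutral` labeling rule, the toy emails, and every function name below are hypothetical, and the Naive Bayes variant simply sums TF-IDF weights in place of raw counts. The k-means labeling and the SVM classifier mentioned in the abstract are omitted.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy lexicon; the abstract does not say which lexicon was used.
POSITIVE = {"good", "great", "thanks", "happy", "pleased"}
NEGATIVE = {"bad", "problem", "sorry", "angry", "delay"}
PUNCT = ".,!?;:"


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    words = (w.strip(PUNCT).lower() for w in text.split())
    return [w for w in words if w]


def lexicon_label(text):
    """Assumed rule: compare positive and negative lexicon hit counts."""
    tokens = tokenize(text)
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "pos"
    return "neg" if score < 0 else "neutral"


def fit_idf(docs):
    """Smoothed inverse document frequency for each term in the corpus."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n) / (1 + d)) + 1 for t, d in df.items()}


def vectorize(tokens, idf):
    """Map tokens to {term: tf * idf}; terms unseen in training are skipped."""
    counts = Counter(t for t in tokens if t in idf)
    return {t: c / len(tokens) * idf[t] for t, c in counts.items()}


def train_nb(vectors, labels):
    """Multinomial Naive Bayes over TF-IDF weights with add-one smoothing."""
    vocab = {t for v in vectors for t in v}
    totals = defaultdict(Counter)
    for v, y in zip(vectors, labels):
        for t, w in v.items():
            totals[y][t] += w
    model = {}
    for y in set(labels):
        denom = sum(totals[y].values()) + len(vocab)
        prior = math.log(labels.count(y) / len(labels))
        loglik = {t: math.log((totals[y][t] + 1) / denom) for t in vocab}
        model[y] = (prior, loglik)
    return model


def predict(model, vector):
    """Return the class with the highest log posterior score."""
    def score(y):
        prior, loglik = model[y]
        return prior + sum(w * loglik[t] for t, w in vector.items())
    return max(model, key=score)


# Toy stand-ins for unlabeled emails (not taken from the Enron dataset).
emails = [
    "Thanks for the great work, I am very pleased",
    "Good news, the deal closed and everyone is happy",
    "Sorry, there is a problem with the delay again",
    "This is a bad outcome and I am angry",
]
docs = [tokenize(e) for e in emails]
labels = [lexicon_label(e) for e in emails]   # automatic labeling step
idf = fit_idf(docs)                           # feature extraction step
model = train_nb([vectorize(d, idf) for d in docs], labels)  # training step
print(labels, predict(model, vectorize(tokenize("Great, thanks"), idf)))
```

In this sketch the classifier never sees a hand-assigned label: the lexicon step supplies the training targets, which is the point of the automatic labeling the abstract argues for. A real experiment would swap in an established sentiment lexicon and a library classifier.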
Pages: 329-334
Page count: 6