Sentiment Analysis using Unlabeled Email data

Cited by: 0
Authors
Ali, Rayan Salah Hag [1]
El Gayar, Neamat [1]
Affiliations
[1] Heriot Watt Univ, Sch Math & Comp Sci, Dubai, U Arab Emirates
Keywords
Sentiment analysis; k-means; TF-IDF; support vector machine
DOI
10.1109/iccike47802.2019.9004372
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Sentiment analysis (SA), in the context of text mining, is an automated process for detecting subjective information such as opinions, attitudes, emotions, and feelings. Most prior work in SA treats it as a text classification problem, which requires labeled data to train the model. However, labeled datasets are hard to obtain, and in most cases the labeling must be done by hand. Another issue is that the lack of portability across domains makes it hard to reuse the same labeled data in different applications, so labeled data must be created manually for each domain. In this paper, we apply sentiment analysis to the Enron email dataset. This work aims to find the best techniques for labeling the dataset automatically and so avoid manual labeling. The labeled data then serves as training data to build a classifier with a supervised machine learning algorithm. In the labeling phase, we compare lexicon labeling with k-means labeling; lexicon labeling gave better and more reliable results. We used the resulting labeled dataset to train the classifiers, with TF-IDF for feature extraction and Naive Bayes and support vector machine (SVM) as the classification algorithms.
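The pipeline outlined in the abstract (automatic lexicon labeling, TF-IDF features, then a supervised classifier) can be illustrated with a minimal standard-library sketch. Everything specific here is an assumption made for illustration: the tiny word lists, the sample emails, the smoothed IDF formula, and the function names are hypothetical and are not taken from the paper. The sketch covers only the lexicon labeling and Naive Bayes steps; the k-means labeling and SVM classifier mentioned in the abstract are not shown.

```python
import math
from collections import Counter

# Hypothetical toy lexicon; the lexicon used in the paper is not given here.
POSITIVE = {"great", "thanks", "good", "pleased", "happy"}
NEGATIVE = {"bad", "problem", "sorry", "delay", "unhappy"}


def tokenize(text):
    """Lowercase, split on whitespace, and strip common punctuation."""
    words = (w.strip(".,!?;:").lower() for w in text.split())
    return [w for w in words if w]


def lexicon_label(text):
    """Label a text by counting lexicon hits; None means no clear polarity."""
    tokens = tokenize(text)
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return None


def fit_tfidf(docs):
    """Return sparse TF-IDF vectors (dicts) and the IDF table (smoothed IDF)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(t for toks in tokenized for t in set(toks))
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = [{t: c * idf[t] for t, c in Counter(toks).items()} for toks in tokenized]
    return vectors, idf


def train_naive_bayes(vectors, labels):
    """Accumulate per-class document counts and per-class TF-IDF term weights."""
    class_docs = Counter(labels)
    class_weights = {y: Counter() for y in class_docs}
    for vec, y in zip(vectors, labels):
        class_weights[y].update(vec)  # adds the TF-IDF weights term by term
    return class_docs, class_weights


def predict(model, idf, text):
    """Multinomial Naive Bayes over TF-IDF weights with add-one smoothing."""
    class_docs, class_weights = model
    counts = Counter(t for t in tokenize(text) if t in idf)  # skip unseen terms
    n_docs = sum(class_docs.values())
    scores = {}
    for y, weights in class_weights.items():
        total = sum(weights.values())
        score = math.log(class_docs[y] / n_docs)
        for t, c in counts.items():
            score += c * idf[t] * math.log((weights[t] + 1.0) / (total + len(idf)))
        scores[y] = score
    return max(scores, key=scores.get)


# Hypothetical unlabeled emails standing in for the real corpus.
emails = [
    "Thanks for the great work on the report",
    "Sorry about the delay, there is a problem with the contract",
    "Good news, the client is pleased with the deal",
    "The schedule is bad and the team is unhappy",
    "Meeting moved to Tuesday",
]

# Step 1: label automatically; texts with no clear polarity are left out.
auto_labels = [lexicon_label(e) for e in emails]
train_docs = [e for e, y in zip(emails, auto_labels) if y is not None]
train_labels = [y for y in auto_labels if y is not None]

# Step 2: extract TF-IDF features and train the classifier on the auto labels.
vectors, idf = fit_tfidf(train_docs)
model = train_naive_bayes(vectors, train_labels)

print(predict(model, idf, "great deal, thanks"))
```

In this sketch, emails with no lexicon hits are dropped from the training set rather than given a neutral label; that is a design choice of the example, not a claim about how the authors handled such messages.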
Pages: 329-334
Number of pages: 6