Improving Sentiment Classification Accuracy of Financial News using N-gram Approach and Feature Weighting Methods

Cited by: 0
Authors
Foroozan, S. [1 ]
Murad, M. A. Azmi [1 ]
Sharef, N. M. [1 ]
Latiff, A. R. Abdul [2 ]
Affiliations
[1] Univ Putra Malaysia, Fac Comp Sci & IT, Serdang 43400, Malaysia
[2] Univ Putra Malaysia, Putra Business Sch, Serdang 43400, Malaysia
Keywords
sentiment classification; financial news; support vector machine; Radial Basis Function; linear kernel; document frequency; TF-IDF
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Sentiment classification of financial news identifies positive and negative news items so that they can be used in a decision support system for stock trend prediction. This paper explores several types of feature space, each treated as a separate dataset, for sentiment classification of news articles. Features are extracted with an n-gram approach (unigrams, bigrams, and the combination of unigrams and bigrams) under different feature weighting methods, and document frequency (DF) is used as the feature selection method. We measured the classification accuracy of a support vector machine (SVM) with two kernels: linear and Radial Basis Function (RBF). The results showed that classification accuracy increased when the combination of unigrams and bigrams was used for feature extraction. We also found that DF can be applied as a dimension reduction method to shrink the feature space without loss of accuracy.
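
The feature pipeline described in the abstract can be illustrated with a minimal sketch. The record does not give the paper's tokenizer, exact weighting formula, DF threshold, or kernel parameters, so everything below is an assumption for illustration: whitespace tokenization, the textbook weight tf * log(N / df), a hypothetical `min_df` cutoff, a hypothetical `gamma`, and all function names. SVM training itself is not shown; only the two kernel functions named in the abstract are included.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the n-grams of a token list, each joined by a single space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_features(text, ns=(1, 2)):
    """Lowercase, split on whitespace (an assumption), collect n-grams for each n."""
    tokens = text.lower().split()
    feats = []
    for n in ns:
        feats.extend(ngrams(tokens, n))
    return feats


def document_frequency(docs_feats):
    """Count how many documents contain each feature at least once."""
    df = Counter()
    for feats in docs_feats:
        df.update(set(feats))
    return df


def build_tfidf(docs, ns=(1, 2), min_df=2):
    """Build TF-IDF vectors, keeping only features with DF >= min_df."""
    docs_feats = [extract_features(d, ns) for d in docs]
    df = document_frequency(docs_feats)
    vocab = sorted(f for f, count in df.items() if count >= min_df)
    n_docs = len(docs)
    vectors = []
    for feats in docs_feats:
        tf = Counter(feats)
        vectors.append([tf[f] * math.log(n_docs / df[f]) for f in vocab])
    return vocab, vectors


def linear_kernel(x, y):
    """Linear kernel: the dot product of two feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=1.0):
    """RBF kernel: exp(-gamma * squared Euclidean distance); gamma is assumed."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


# Toy corpus, invented for illustration only.
docs = [
    "profit rises sharply",
    "profit falls sharply",
    "profit rises again",
]
vocab, vectors = build_tfidf(docs, ns=(1, 2), min_df=2)
print(vocab)
```

Passing `ns=(1,)` or `ns=(2,)` gives the unigram-only and bigram-only feature spaces, and raising `min_df` prunes rarer features, which is the sense in which DF acts as a dimension reduction step here.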
Pages: 211-214
Page count: 4