Improving Sentiment Classification Accuracy of Financial News using N-gram Approach and Feature Weighting Methods

Cited by: 0
Authors
Foroozan, S. [1]
Murad, M. A. Azmi [1]
Sharef, N. M. [1]
Latiff, A. R. Abdul [2]
Affiliations
[1] Univ Putra Malaysia, Fac Comp Sci & IT, Serdang 43400, Malaysia
[2] Univ Putra Malaysia, Putra Business Sch, Serdang 43400, Malaysia
Keywords
sentiment classification; financial news; support vector machine; Radial Basis Function; linear kernel; document frequency; TF-IDF
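DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812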
Abstract
Sentiment classification of financial news identifies positive and negative news so that the results can be used in a decision support system for stock trend prediction. This paper explores several types of feature space, treated as different datasets, for sentiment classification of news articles. Experiments use an n-gram approach (unigram, bigram, and the combination of unigram and bigram) for feature extraction with different feature weighting methods, while document frequency (DF) is used as the feature selection method. We measured the classification accuracy of a support vector machine (SVM) with two kernels, linear and Radial Basis Function (RBF). Results showed that feature extraction increased classification accuracy when unigrams and bigrams were combined. Moreover, we found that DF can be applied as a dimension reduction method to reduce the feature space without loss of accuracy.
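The feature pipeline the abstract describes (combined unigram and bigram extraction, DF-based selection, then feature weighting) might look roughly like the following minimal sketch. It is not the authors' implementation: the whitespace tokenizer, the `min_df` threshold, the `tf * log(N / df)` weighting variant, the function names, and the toy headlines are all assumptions made for illustration, and the SVM training step is left out.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the contiguous n-grams in tokens, each joined by a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_features(text):
    """Combined unigram + bigram features (assumed whitespace tokenizer)."""
    tokens = text.lower().split()
    return ngrams(tokens, 1) + ngrams(tokens, 2)


def select_by_df(docs_features, min_df):
    """Keep features that occur in at least min_df documents (DF selection)."""
    df = Counter()
    for feats in docs_features:
        df.update(set(feats))  # count each feature once per document
    return {f: c for f, c in df.items() if c >= min_df}


def tfidf_vectors(docs_features, df):
    """Weight each kept feature by tf * log(N / df), one assumed TF-IDF variant."""
    n_docs = len(docs_features)
    vectors = []
    for feats in docs_features:
        tf = Counter(f for f in feats if f in df)
        vectors.append({f: tf[f] * math.log(n_docs / df[f]) for f in tf})
    return vectors


# Hypothetical toy headlines, not data from the paper.
docs = [
    "profit rises sharply",
    "profit falls sharply",
    "shares fall as profit falls",
]
features = [extract_features(d) for d in docs]
df = select_by_df(features, min_df=2)
vectors = tfidf_vectors(features, df)
```

With `min_df=2`, features seen in only one headline are dropped, which shrinks the feature space before weighting. The resulting sparse vectors could then be passed to an SVM with a linear or RBF kernel to compare accuracy across feature sets.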
Pages: 211-214
Page count: 4