An intelligent model based on integrated inverse document frequency and multinomial Naive Bayes for current affairs news categorisation

Cited by: 5
Authors
Kumar, Sachin [1 ]
Sharma, Aditya [1 ]
Reddy, B. Kartheek [1 ]
Sachan, Shreyas [1 ]
Jain, Vaibhav [1 ]
Singh, Jagvinder [2 ]
Affiliations
[1] Univ Delhi, Cluster Innovat Ctr, Delhi, India
[2] Delhi Technol Univ, Dept Management, Delhi, India
Keywords
News Articles; Classification; Intelligent Methods; Machine Learning; Support Vector Machine; Multinomial Naive Bayes; Inverse Document Frequency (IDF); SENTIMENT ANALYSIS; TEXT CLASSIFICATION; EXTRACTION METHODS; DECISION-SUPPORT; MACHINE; TECHNOLOGY
DOI
10.1007/s13198-021-01471-7
Chinese Library Classification (CLC) number
T [Industrial Technology]
Discipline classification code
08
Abstract
Digital technologies and their products and services have empowered the masses to generate information at a faster pace. Information-sharing platforms built on digital technologies, such as news websites and social media platforms (Facebook, Twitter, Instagram, WhatsApp, etc.), have flooded the information space because information is easy to generate and can be disseminated to the masses instantly. Information classification has long been an important task, especially in newspapers and media organisations. In other areas too, information or text classification plays an important role, so that vital information can be sorted into predefined categories. In journalism, editors and resource persons were given the task of recognising and classifying news stories so that they could be placed in predefined categories such as economy and business news, political news, social news, the editorial section, education and career, and sports. Nowadays, classifying and segregating textual information has become challenging because of the diverse and vast flow of information. Additionally, the pace of information and its updates, access, and competition among media houses have made it more challenging. Hence, automated and intelligent tools that can classify information and text accurately and efficiently are needed to reduce human effort and time and to increase productivity. This paper presents an efficient and robust intelligent machine learning model based on Multinomial Naive Bayes (MNB) to classify current affairs news stories. The proposed Inverse Document Frequency (IDF) integrated MNB model achieves a classification accuracy of 87.22 per cent. The experimental results are also compared with other machine learning models such as Logistic Regression (LR), Support Vector Machine (SVM), K-Nearest Neighbours (KNN) and Random Forest (RF). The results demonstrate that the presented model is better in terms of accuracy and may be deployed in real-world information classification and the media domain to improve the productivity and efficiency of the current affairs news classification process.
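As a rough illustration of what an IDF-integrated MNB classifier can look like, the sketch below weights term counts by a smoothed IDF and feeds the weighted counts to a multinomial Naive Bayes model with Laplace smoothing. It is not the authors' implementation: the abstract does not give the IDF formula, the preprocessing, or the dataset, so the IDF variant, the whitespace tokenizer, the function names (`fit_idf`, `train`, `predict`) and the four toy headlines are all assumptions made here for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumed preprocessing: lowercase and split on whitespace.
    return text.lower().split()


def fit_idf(docs):
    # Assumed smoothed IDF variant: ln((1 + N) / (1 + df)) + 1.
    n = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf(tokens, idf):
    # Term count times IDF; terms unseen during training are ignored.
    counts = Counter(t for t in tokens if t in idf)
    return {t: c * idf[t] for t, c in counts.items()}


def train(texts, labels, alpha=1.0):
    docs = [tokenize(t) for t in texts]
    idf = fit_idf(docs)
    vocab = sorted(idf)
    # Per-class sum of TF-IDF weights for each term.
    weight = defaultdict(lambda: defaultdict(float))
    for tokens, label in zip(docs, labels):
        for term, w in tfidf(tokens, idf).items():
            weight[label][term] += w
    model = {"idf": idf, "log_prior": {}, "log_lik": {}}
    for label, n_docs in Counter(labels).items():
        model["log_prior"][label] = math.log(n_docs / len(labels))
        # Laplace (additive) smoothing over the whole vocabulary.
        total = sum(weight[label].values()) + alpha * len(vocab)
        model["log_lik"][label] = {
            t: math.log((weight[label].get(t, 0.0) + alpha) / total)
            for t in vocab
        }
    return model


def predict(model, text):
    feats = tfidf(tokenize(text), model["idf"])
    scores = {
        label: prior + sum(w * model["log_lik"][label][t]
                           for t, w in feats.items())
        for label, prior in model["log_prior"].items()
    }
    return max(scores, key=scores.get)


# Hypothetical toy data, not taken from the paper.
texts = [
    "the team won the football match",
    "the striker scored a late goal",
    "parliament passed the budget bill",
    "the minister announced a new policy",
]
labels = ["sports", "sports", "politics", "politics"]
model = train(texts, labels)
print(predict(model, "the goal won the match"))
print(predict(model, "minister passed the budget policy"))
```

The IDF weighting lowers the influence of terms that occur in most documents (here "the"), so category-specific words dominate the class scores. A real evaluation would need a held-out test set, which this toy example does not have.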
Pages: 1341-1355
Number of pages: 15
Related papers
8 in total
  • [1] An intelligent model based on integrated inverse document frequency and multinomial Naive Bayes for current affairs news categorisation
    Sachin Kumar
    Aditya Sharma
    B Kartheek Reddy
    Shreyas Sachan
    Vaibhav Jain
    Jagvinder Singh
    International Journal of System Assurance Engineering and Management, 2022, 13 : 1341 - 1355
  • [2] An online dynamic security assessment integrated scheme for power systems based on sparse multinomial naive bayes and canonical correlation forest
    Liu, Songkai
    Li, Zhenghao
    Li, Shichun
    Li, Zhenxing
    Miao, Shuwei
    Zhang, Yating
    Liu, Shuchi
    Hu, Jingzhe
    Ruan, Zhaohua
    Xiao, Maoxiang
    Sun, Jinming
    Cui, Ziqi
    Yang, Mingfei
    Zhou, Qian
    Zhao, Wenbo
    SUSTAINABLE ENERGY GRIDS & NETWORKS, 2024, 39
  • [3] Synthesis of Compound Facial Expressions Based on Indonesian Sentences Using Multinomial Naive Bayes Model and Dominance Threshold Equations
    Aripin
    Haryanto, Hanny
    Agastya, Wisnu
    ENGINEERING LETTERS, 2022, 30 (01) : 50 - 59
  • [4] iCACD: an intelligent deep learning model to categorise current affairs news article for efficient journalistic process
    Kumar, Sachin
    Panwar, Shivam
    Singh, Jagvinder
    Sharma, Anuj Kumar
    Nisha, Zairu
    INTERNATIONAL JOURNAL OF SYSTEM ASSURANCE ENGINEERING AND MANAGEMENT, 2022, 13 (05) : 2572 - 2582
  • [5] iCACD: an intelligent deep learning model to categorise current affairs news article for efficient journalistic process
    Sachin Kumar
    Shivam Panwar
    Jagvinder Singh
    Anuj Kumar Sharma
    Zairu Nisha
    International Journal of System Assurance Engineering and Management, 2022, 13 : 2572 - 2582
  • [6] Automated Document Classification for News Article in Bahasa Indonesia based on Term Frequency Inverse Document Frequency (TF-IDF) Approach
    Hakim, An Aulia
    Erwin, Alva
    Eng, Kho I.
    Galinium, Maulahikmah
    Muliady, Wahyu
    2014 6TH INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY AND ELECTRICAL ENGINEERING (ICITEE), 2014, : 29 - 32
  • [7] An Improved Fake News Detection Model Using Hybrid Time Frequency-Inverse Document Frequency for Feature Extraction and AdaBoost Ensemble Model as a Classifier
    Holla, Lakshmi
    Kavitha, K. S.
    JOURNAL OF ADVANCES IN INFORMATION TECHNOLOGY, 2024, 15 (02) : 202 - 211
  • [8] Predicting stock trend using an integrated term frequency-inverse document frequency-based feature weight matrix with neural networks
    Thakkar, Ankit
    Chaudhari, Kinjal
    APPLIED SOFT COMPUTING, 2020, 96 (96)