KNN with TF-IDF Based Framework for Text Categorization

Cited by: 151
Authors
Trstenjak, Bruno [1]
Mikac, Sasa [2]
Donko, Dzenana [3]
Affiliations
[1] Medimurje Univ Appl Sci Cakovec, Dept Comp Engn, Cakovec, Croatia
[2] Fac Elect Engn & Comp Sci, Dept Comp Sci, Maribor, Slovenia
[3] Fac Elect Engn, Dept Comp Sci, Sarajevo, Bosnia & Herceg
Keywords
text documents classification; K-Nearest Neighbor; TF-IDF; framework; machine learning
DOI
10.1016/j.proeng.2014.03.129
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
KNN is a very popular algorithm for text classification. This paper presents the possibility of using the KNN algorithm with the TF-IDF method, together with a framework for text classification. The framework enables classification according to various parameters, as well as measurement and analysis of the results. Evaluation of the framework focused on the speed and quality of classification. The test results showed the good and bad features of the algorithm, providing guidance for the further development of similar frameworks. (C) 2014 The Authors. Published by Elsevier Ltd.
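The abstract does not describe the framework's internals, so the following is only a minimal sketch of the general technique it names: documents are weighted with TF-IDF and a query is assigned the majority label among its k nearest training documents. Cosine similarity, the smoothed IDF formula, the whitespace tokenizer, all function names, and the toy data are assumptions made for illustration; none of them are taken from the paper.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and split on whitespace. A real system would
    # also strip punctuation and stop words.
    return text.lower().split()


def fit_idf(token_lists):
    # Smoothed inverse document frequency: log((1 + N) / (1 + df)) + 1.
    n_docs = len(token_lists)
    df = Counter(term for tokens in token_lists for term in set(tokens))
    return {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf_vector(tokens, idf):
    # Relative term frequency times IDF; terms unseen in training are ignored.
    counts = Counter(t for t in tokens if t in idf)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {t: (c / total) * idf[t] for t, c in counts.items()}


def cosine(u, v):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def knn_classify(query, train_texts, train_labels, k=3):
    # Build TF-IDF vectors for the training set, rank training documents by
    # similarity to the query, and take a majority vote over the top k.
    token_lists = [tokenize(t) for t in train_texts]
    idf = fit_idf(token_lists)
    train_vecs = [tfidf_vector(tokens, idf) for tokens in token_lists]
    q = tfidf_vector(tokenize(query), idf)
    ranked = sorted(
        zip(train_vecs, train_labels),
        key=lambda pair: cosine(q, pair[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy corpus, for illustration only.
train_texts = [
    "the team won the football match",
    "the striker scored a late goal",
    "the coach praised the team defense",
    "the bank raised interest rates",
    "stock markets fell on inflation fears",
    "the central bank cut rates again",
]
train_labels = ["sports"] * 3 + ["finance"] * 3

print(knn_classify("bank rates and inflation", train_texts, train_labels))
```

Because this sketch recomputes the similarity to every training document for each query, classification time grows with the size of the training set, which is the usual speed trade-off of KNN.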
Pages: 1356-1364
Page count: 9