KNN with TF-IDF Based Framework for Text Categorization

Cited by: 151
Authors
Trstenjak, Bruno [1]
Mikac, Sasa [2]
Donko, Dzenana [3]
Affiliations
[1] Medimurje Univ Appl Sci Cakovec, Dept Comp Engn, Cakovec, Croatia
[2] Fac Elect Engn & Comp Sci, Dept Comp Sci, Maribor, Slovenia
[3] Fac Elect Engn, Dept Comp Sci, Sarajevo, Bosnia & Herceg
Source
24TH DAAAM INTERNATIONAL SYMPOSIUM ON INTELLIGENT MANUFACTURING AND AUTOMATION, 2013 | 2014, Vol. 69
Keywords
text documents classification; K-Nearest Neighbor; TF-IDF; framework; machine learning
DOI
10.1016/j.proeng.2014.03.129
Chinese Library Classification (CLC) code
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
KNN is a very popular algorithm for text classification. This paper presents the possibility of using the KNN algorithm with the TF-IDF method, together with a framework for text classification. The framework enables classification according to various parameters, as well as measurement and analysis of results. Evaluation of the framework focused on the speed and quality of classification. The test results showed the strengths and weaknesses of the algorithm, providing guidance for the further development of similar frameworks. (C) 2014 The Authors. Published by Elsevier Ltd.
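The combination named in the abstract, TF-IDF weighting followed by a K-Nearest Neighbor vote, can be illustrated with a minimal sketch. This is not the authors' framework: the tokenizer, the idf formula log(N / df), cosine similarity as the distance measure, the function names, and the toy corpus are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split; a real pipeline would also
    # strip punctuation and stop words.
    return text.lower().split()


def build_idf(tokenized_docs):
    # idf(t) = log(N / df(t)), where df(t) counts documents containing t.
    n_docs = len(tokenized_docs)
    df = Counter(term for doc in tokenized_docs for term in set(doc))
    return {term: math.log(n_docs / count) for term, count in df.items()}


def tfidf_vector(tokens, idf):
    # Length-normalised term frequency times idf; terms unseen in the
    # training corpus are dropped.
    counts = Counter(tokens)
    total = len(tokens)
    return {t: (c / total) * idf[t] for t, c in counts.items() if t in idf}


def cosine(u, v):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def knn_classify(query, train_texts, train_labels, k=3):
    # Vectorise the training set, rank it by similarity to the query,
    # and return the majority label among the k most similar documents.
    tokenized = [tokenize(t) for t in train_texts]
    idf = build_idf(tokenized)
    train_vecs = [tfidf_vector(doc, idf) for doc in tokenized]
    query_vec = tfidf_vector(tokenize(query), idf)
    ranked = sorted(
        zip(train_vecs, train_labels),
        key=lambda pair: cosine(query_vec, pair[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy corpus, not data from the paper.
train_texts = [
    "the team won the football match",
    "the striker scored a late goal",
    "the coach praised the goalkeeper",
    "the central bank raised interest rates",
    "stock markets fell after the inflation report",
    "investors expect higher bond yields",
]
train_labels = ["sport", "sport", "sport", "finance", "finance", "finance"]

predicted = knn_classify("the goalkeeper saved the match", train_texts, train_labels, k=3)
print(predicted)
```

The sketch recomputes every training vector on each call, which keeps it short but makes classification time grow with corpus size. That cost is one reason an evaluation of such a method would look at speed as well as quality.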
Pages: 1356-1364
Page count: 9
Related papers
50 in total
  • [21] Research on case reasoning method based on TF-IDF
    Zhang, Lin
    INTERNATIONAL JOURNAL OF SYSTEM ASSURANCE ENGINEERING AND MANAGEMENT, 2021, 12 (03) : 608 - 615
  • [22] Text-based domain ontology building using Tf-Idf and metric clusters techniques
    Rezgui, Yacine
    KNOWLEDGE ENGINEERING REVIEW, 2007, 22 (04) : 379 - 403
  • [23] A Chinese Short Text Classification Method Based on TF-IDF and Gradient Boosting Decision Tree
    Cheng, Yanming
    Yu, Zhigang
    Hu, Je
    Yang, Mingchuan
    2022 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, COMPUTER VISION AND MACHINE LEARNING (ICICML), 2022 : 164 - 168
  • [24] A Novel Text Mining Approach Based on TF-IDF and Support Vector Machine for News Classification
    Dadgar, Seyyed Mohammad Hossein
    Araghi, Mohammad Shirzad
    Farahani, Morteza Mastery
    PROCEEDINGS OF 2ND IEEE INTERNATIONAL CONFERENCE ON ENGINEERING & TECHNOLOGY ICETECH-2016, 2016 : 112 - 116
  • [25] Document Clustering: TF-IDF approach
    Bafna, Prafulla
    Pramod, Dhanya
    Vaidya, Anagha
    2016 INTERNATIONAL CONFERENCE ON ELECTRICAL, ELECTRONICS, AND OPTIMIZATION TECHNIQUES (ICEEOT), 2016 : 61 - 66
  • [26] The hypergeometric test performs comparably to TF-IDF on standard text analysis tasks
    Sheridan, Paul
    Onsjo, Mikael
    MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (10) : 28875 - 28890
  • [27] The hypergeometric test performs comparably to TF-IDF on standard text analysis tasks
    Paul Sheridan
    Mikael Onsjö
    Multimedia Tools and Applications, 2024, 83 : 28875 - 28890
  • [28] Deriving TF-IDF as a Fisher kernel
    Elkan, Charles
    STRING PROCESSING AND INFORMATION RETRIEVAL, PROCEEDINGS, 2005, 3772 : 295 - 300
  • [29] A text similarity measurement combining word semantic information with TF-IDF method
    Huang C.-H.
    Yin J.
    Hou F.
    Jisuanji Xuebao/Chinese Journal of Computers, 2011, 34 (05) : 856 - 864
  • [30] TF-IDF based loop closure detection algorithm for SLAM
    Dong R.
    Liu C.
    Yang G.
    Dongnan Daxue Xuebao (Ziran Kexue Ban)/Journal of Southeast University (Natural Science Edition), 2019, 49 (02) : 251 - 258