KNN with TF-IDF Based Framework for Text Categorization

Cited by: 151
Authors
Trstenjak, Bruno [1]
Mikac, Sasa [2]
Donko, Dzenana [3]
Affiliations
[1] Medimurje Univ Appl Sci Cakovec, Dept Comp Engn, Cakovec, Croatia
[2] Fac Elect Engn & Comp Sci, Dept Comp Sci, Maribor, Slovenia
[3] Fac Elect Engn, Dept Comp Sci, Sarajevo, Bosnia & Herceg
Source
24TH DAAAM INTERNATIONAL SYMPOSIUM ON INTELLIGENT MANUFACTURING AND AUTOMATION, 2013 | 2014, Vol. 69
Keywords
text documents classification; K-Nearest Neighbor; TF-IDF; framework; machine learning
DOI
10.1016/j.proeng.2014.03.129
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
KNN is a very popular algorithm for text classification. This paper presents the possibility of using the KNN algorithm with the TF-IDF method, together with a framework for text classification. The framework enables classification according to various parameters, as well as measurement and analysis of the results. Evaluation of the framework focused on the speed and quality of classification. The test results showed the strengths and weaknesses of the algorithm, providing guidance for the further development of similar frameworks. (C) 2014 The Authors. Published by Elsevier Ltd.
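The combination named in the abstract, TF-IDF weighting followed by a K-Nearest Neighbor vote, can be sketched in a few lines of Python. This is only an illustration, not the authors' framework: the function names, the toy corpus, the smoothed IDF formula, whitespace tokenization, and the use of cosine similarity as the nearness measure are all assumptions made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, no stemming or stop words.
    return text.lower().split()


def fit_idf(docs):
    # Document frequency: number of training documents containing each term.
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    # Assumption: smoothed IDF, so every known term gets a positive weight.
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def tfidf_vector(text, idf):
    # Raw term frequency times IDF, then L2-normalized; unknown terms are dropped.
    counts = Counter(t for t in tokenize(text) if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def cosine(a, b):
    # Both vectors are unit length, so the dot product is the cosine similarity.
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def knn_classify(query, train_docs, train_labels, k=3):
    # Rank training documents by similarity to the query, then majority-vote
    # over the labels of the k most similar ones.
    idf = fit_idf(train_docs)
    train_vecs = [tfidf_vector(d, idf) for d in train_docs]
    q = tfidf_vector(query, idf)
    ranked = sorted(
        zip(train_vecs, train_labels),
        key=lambda pair: cosine(q, pair[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy corpus, for illustration only.
TRAIN_DOCS = [
    "the team won the football match",
    "the striker scored a late goal in the match",
    "the coach praised the team after the game",
    "the central bank raised interest rates",
    "stock markets fell as interest rates rose",
    "the bank reported strong quarterly profits",
]
TRAIN_LABELS = ["sport", "sport", "sport", "finance", "finance", "finance"]

if __name__ == "__main__":
    print(knn_classify("the bank cut interest rates", TRAIN_DOCS, TRAIN_LABELS, k=3))
```

In this sketch the IDF table and the training vectors are rebuilt on every call, which keeps the example short; a framework concerned with classification speed, as the abstract describes, would presumably compute them once and reuse them.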
Pages: 1356-1364
Page count: 9
Related papers
50 in total
  • [31] Internet Articles Classification by Industry Types Based on TF-IDF
    Cha, Jonghun
    Lee, Jee-Hyong
    ADVANCES IN COMPUTER SCIENCE AND UBIQUITOUS COMPUTING, 2018, 474 : 1121 - 1125
  • [32] Estimating the selectivity of tf-idf based cosine similarity predicates
    Tata, Sandeep
    Patel, Jignesh M.
    SIGMOD RECORD, 2007, 36 (02) : 7 - 12
  • [33] An Identification Method of News Scientific Intelligence Based on TF-IDF
    Pan, Lu
    Tang, Haibo
    Zhou, Lei
    Wang, Liuyang
    Zhu, Quanyin
    14TH INTERNATIONAL SYMPOSIUM ON DISTRIBUTED COMPUTING AND APPLICATIONS FOR BUSINESS, ENGINEERING AND SCIENCE (DCABES 2015), 2015, : 501 - 504
  • [34] Research on Sentiment Analysis of Microblogging Based on LSA and TF-IDF
    Li, Yingying
    Shen, Bo
    PROCEEDINGS OF 2017 3RD IEEE INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATIONS (ICCC), 2017, : 2584 - 2588
  • [35] Entity extraction based on the combination of information entropy and TF-IDF
    Yilahun H.
    Hamdulla A.
    International Journal of Reasoning-based Intelligent Systems, 2023, 15 (01) : 71 - 78
  • [36] Document Categorization with Entropy based TF/IDF classifier
    Lu, Yi-hong
    Huang, Yan
    PROCEEDINGS OF THE 2009 WRI GLOBAL CONGRESS ON INTELLIGENT SYSTEMS, VOL IV, 2009, : 269 - +
  • [37] Hot Topic Detection Based on a Refined TF-IDF Algorithm
    Zhu, Zhiliang
    Liang, Jie
    Li, Deyang
    Yu, Hai
    Liu, Guoqi
    IEEE ACCESS, 2019, 7 : 26996 - 27007
  • [38] Sentiment Enhanced Hybrid TF-IDF for Microblogs
    Simsek, Atakan
    Karagoz, Pinar
    2014 IEEE FOURTH INTERNATIONAL CONFERENCE ON BIG DATA AND CLOUD COMPUTING (BDCLOUD), 2014, : 311 - 317
  • [39] Using TF-IDF to hide sensitive itemsets
    Tzung-Pei Hong
    Chun-Wei Lin
    Kuo-Tung Yang
    Shyue-Liang Wang
    Applied Intelligence, 2013, 38 : 502 - 510
  • [40] Using TF-IDF to hide sensitive itemsets
    Hong, Tzung-Pei
    Lin, Chun-Wei
    Yang, Kuo-Tung
    Wang, Shyue-Liang
    APPLIED INTELLIGENCE, 2013, 38 (04) : 502 - 510