KNN with TF-IDF Based Framework for Text Categorization

Cited by: 151
Authors
Trstenjak, Bruno [1]
Mikac, Sasa [2]
Donko, Dzenana [3]
Affiliations
[1] Medimurje Univ Appl Sci Cakovec, Dept Comp Engn, Cakovec, Croatia
[2] Fac Elect Engn & Comp Sci, Dept Comp Sci, Maribor, Slovenia
[3] Fac Elect Engn, Dept Comp Sci, Sarajevo, Bosnia & Herceg
Source
24TH DAAAM INTERNATIONAL SYMPOSIUM ON INTELLIGENT MANUFACTURING AND AUTOMATION, 2013 | 2014 / Vol. 69
Keywords
text documents classification; K-Nearest Neighbor; TF-IDF; framework; machine learning
DOI
10.1016/j.proeng.2014.03.129
CLC number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
K-Nearest Neighbor (KNN) is a very popular algorithm for text classification. This paper examines the use of the KNN algorithm with the TF-IDF method, together with a framework for text classification. The framework enables classification according to various parameters, as well as measurement and analysis of the results. Evaluation of the framework focused on the speed and quality of classification. The test results showed the strengths and weaknesses of the algorithm, providing guidance for the further development of similar frameworks. © 2014 The Authors. Published by Elsevier Ltd.
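The abstract does not give implementation details, so the following is only a minimal sketch of the general technique it names: documents are turned into TF-IDF weight vectors and a new document is labelled by majority vote among its k most similar training documents. The smoothed IDF formula, the use of cosine similarity, the toy corpus, and all function names (`tokenize`, `fit_idf`, `tfidf_vector`, `cosine`, `knn_classify`) are assumptions made for illustration and are not taken from the paper's framework.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer (assumption; real systems also strip
    # punctuation, remove stop words, and may apply stemming).
    return text.lower().split()


def fit_idf(docs_tokens):
    # Smoothed inverse document frequency: log((1 + N) / (1 + df)) + 1.
    # This particular variant is an assumption, not the paper's formula.
    n = len(docs_tokens)
    df = Counter()
    for tokens in docs_tokens:
        df.update(set(tokens))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf_vector(tokens, idf):
    # Sparse vector {term: tf * idf}; terms unseen in training are ignored.
    tf = Counter(t for t in tokens if t in idf)
    total = sum(tf.values())
    if total == 0:
        return {}
    return {t: (c / total) * idf[t] for t, c in tf.items()}


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_classify(text, train_vectors, train_labels, idf, k=3):
    # Rank training documents by similarity to the query and let the
    # k nearest ones vote on the label.
    query = tfidf_vector(tokenize(text), idf)
    ranked = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: cosine(query, pair[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy corpus, invented for this example.
train = [
    ("the team won the football match", "sport"),
    ("the striker scored a late goal", "sport"),
    ("the coach praised the football players", "sport"),
    ("the bank raised interest rates", "finance"),
    ("stock markets fell as rates rose", "finance"),
    ("investors sold bank shares", "finance"),
]
train_tokens = [tokenize(text) for text, _ in train]
train_labels = [label for _, label in train]
idf = fit_idf(train_tokens)
train_vectors = [tfidf_vector(tokens, idf) for tokens in train_tokens]

print(knn_classify("football goal scored by the striker",
                   train_vectors, train_labels, idf))
print(knn_classify("bank interest rates rose",
                   train_vectors, train_labels, idf))
```

Because every query is compared against every training vector, classification time grows with the size of the training set, which is one plausible reason an evaluation of such a framework would look at speed as well as quality.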
Pages: 1356 - 1364
Page count: 9
Related papers
50 in total
  • [41] Graph based KNN for Text Categorization
    Jo, Taeho
    2018 20TH INTERNATIONAL CONFERENCE ON ADVANCED COMMUNICATION TECHNOLOGY (ICACT), 2018, : 260 - 265
  • [42] A new neutrosophic TF-IDF term weighting for text mining tasks: text classification use case
    Bounabi, Mariem
    Elmoutaouakil, Karim
    Satori, Khalid
    INTERNATIONAL JOURNAL OF WEB INFORMATION SYSTEMS, 2021, 17 (03) : 229 - 249
  • [43] A KNN BASED ALGORITHM FOR TEXT CATEGORIZATION
    Bucar, Joze
    Povh, Janez
    SOR'13 PROCEEDINGS: THE 12TH INTERNATIONAL SYMPOSIUM ON OPERATIONAL RESEARCH IN SLOVENIA, 2013, : 367 - 372
  • [44] Deep Network Learning based on TF-IDF Text Features for Electric Power Speech Text Pre-disposal Method
    Zhao, Xin
    Huang, Changda
    IEIE Transactions on Smart Processing and Computing, 2024, 13 (06) : 622 - 631
  • [45] Novel Curriculum Learning Strategy Using Class-Based TF-IDF for Enhancing Personality Detection in Text
    Kwon, Naae
    Yoo, Yuenkyung
    Lee, Byunghan
    IEEE ACCESS, 2024, 12 : 87873 - 87882
  • [46] An improvement to TF-IDF: Term distribution based term weight algorithm
    Xia T.
    Chai Y.
    Journal of Software, 2011, 6 (03) : 413 - 420
  • [47] News keywords extraction algorithm based on TextRank and classified TF-IDF
    Ao, Xiong
    Yu, Xin
    Liu, Derong
    Tian, Hongkang
    2020 16TH INTERNATIONAL WIRELESS COMMUNICATIONS & MOBILE COMPUTING CONFERENCE, IWCMC, 2020, : 1364 - 1369
  • [48] Mining microblog user interests based on TextRank with TF-IDF factor
    Tu Shouzhong
    Huang Minlie
    The Journal of China Universities of Posts and Telecommunications, 2016, (05) : 40 - 46
  • [49] TF-IDF based Scene-Object Relations Correlate With Visual
    Celikkol, Pelin
    Laubrock, Jochen
    Schlangen, David
    ACM SYMPOSIUM ON EYE TRACKING RESEARCH & APPLICATIONS, ETRA 2023, 2023,
  • [50] Collaborative Filtering Recommendation Algorithm Based on TF-IDF and User Characteristics
    Ni, Jianjun
    Cai, Yu
    Tang, Guangyi
    Xie, Yingjuan
    APPLIED SCIENCES-BASEL, 2021, 11 (20):