An improved TF-IDF approach for text classification

Cited by: 8
Authors
Zhang Yun-tao
Gong Ling
Wang Yong-cheng
Affiliations
[1] Shanghai Jiaotong University,Network & Information Center
[2] Shanghai Jiaotong University,School of Electronic & Information Technology
Source
Keywords
Term frequency/inverse document frequency (TF-IDF); Text classification; Confidence; Support; Characteristic words
DOI
10.1631/BF02842477
CLC number
TP31
Document code
A
Subject classification number
Abstract
This paper presents an improved term frequency/inverse document frequency (TF-IDF) approach that uses confidence, support and characteristic words to enhance the recall and precision of text classification. Synonyms defined by a lexicon are processed in the improved TF-IDF approach. We discuss and analyze in detail the relationship among confidence, recall and precision. Experiments in the science and technology domain gave promising results: the new TF-IDF approach improves the precision and recall of text classification compared with the conventional TF-IDF approach.
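For orientation, the conventional TF-IDF baseline that the abstract compares against can be sketched as follows. This is a minimal illustration, not the paper's method: the record does not give the improved weighting formula, so the function names `tf_idf` and `confidence_and_support` are hypothetical, and confidence and support are assumed here to follow the usual association-rule definitions for a rule "term implies category".

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Conventional weighting: (term count / document length) * log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def confidence_and_support(docs, labels, term, category):
    """Assumed association-rule measures for the rule `term -> category`.

    support    = share of all documents that contain `term` and are in `category`
    confidence = share of documents containing `term` that are in `category`
    """
    n_docs = len(docs)
    labels_with_term = [lab for doc, lab in zip(docs, labels) if term in doc]
    both = sum(1 for lab in labels_with_term if lab == category)
    support = both / n_docs if n_docs else 0.0
    confidence = both / len(labels_with_term) if labels_with_term else 0.0
    return confidence, support


# Toy corpus (made up for illustration only).
docs = [["tf", "idf", "text"], ["text", "class"], ["idf", "idf"]]
labels = ["a", "b", "a"]
weights = tf_idf(docs)
conf, supp = confidence_and_support(docs, labels, "idf", "a")
```

How the paper combines these quantities with characteristic words and synonym handling is not stated in this record, so the sketch stops at computing them separately.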
Pages: 49-55
Page count: 6
Related papers
50 in total
  • [1] An improved TF-IDF approach for text classification
    张云涛
    龚玲
    王永成
    [J]. Journal of Zhejiang University-Science A(Applied Physics & Engineering), 2005, (01) : 50 - 56
  • [2] Application of an Improved TF-IDF Method in Literary Text Classification
    Xiang, Lin
    [J]. ADVANCES IN MULTIMEDIA, 2022, 2022
  • [3] Research of Text Classification Based on Improved TF-IDF Algorithm
    Liu, Cai-zhi
    Sheng, Yan-xiu
    Wei, Zhi-qiang
    Yang, Yong-Quan
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE OF INTELLIGENT ROBOTICS AND CONTROL ENGINEERING (IRCE), 2018, : 218 - 222
  • [4] Turning from TF-IDF to TF-IGM for term weighting in text classification
    Chen, Kewen
    Zhang, Zuping
    Long, Jun
    Zhang, Hao
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2016, 66 : 245 - 260
  • [5] A Novel Text Mining Approach Based on TF-IDF and Support Vector Machine for News Classification
    Dadgar, Seyyed Mohammad Hossein
    Araghi, Mohammad Shirzad
    Farahani, Morteza Mastery
    [J]. PROCEEDINGS OF 2ND IEEE INTERNATIONAL CONFERENCE ON ENGINEERING & TECHNOLOGY ICETECH-2016, 2016, : 112 - 116
  • [6] Emotion Analysis in Text using TF-IDF
    Sundaram, Varun
    Ahmed, Saad
    Muqtadeer, Shaik Abdul
    Reddy, R. Ravinder
    [J]. 2021 11TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING, DATA SCIENCE & ENGINEERING (CONFLUENCE 2021), 2021, : 292 - 297
  • [7] Research on aviation unsafe incidents classification with improved TF-IDF algorithm
    Wang, Yanhua
    Zhang, Zhiyuan
    Huo, Weigang
    [J]. MODERN PHYSICS LETTERS B, 2016, 30 (12):
  • [8] Document Clustering: TF-IDF approach
    Bafna, Prafulla
    Pramod, Dhanya
    Vaidya, Anagha
    [J]. 2016 INTERNATIONAL CONFERENCE ON ELECTRICAL, ELECTRONICS, AND OPTIMIZATION TECHNIQUES (ICEEOT), 2016, : 61 - 66
  • [9] Topological Data Analysis In Text Classification Based On Word Embedding And TF-IDF
    Wen, Xiaoyang
    [J]. 2020 3RD INTERNATIONAL CONFERENCE ON COMPUTER INFORMATION SCIENCE AND APPLICATION TECHNOLOGY (CISAT) 2020, 2020, 1634
  • [10] An Automatic Text Summary Extraction Method Based on Improved TextRank and TF-IDF
    Guan, Xinxin
    Li, Yeli
    Zeng, Qingtao
    Zhou, Chufeng
    [J]. 2019 INTERNATIONAL CONFERENCE ON ADVANCED ELECTRONIC MATERIALS, COMPUTERS AND MATERIALS ENGINEERING (AEMCME 2019), 2019, 563