Research of Text Classification Based on Improved TF-IDF Algorithm

Cited by: 0
Authors
Liu, Cai-zhi [1]
Sheng, Yan-xiu [1]
Wei, Zhi-qiang [1]
Yang, Yong-Quan [1]
Affiliations
[1] Ocean Univ China, Coll Informat Sci & Engn, Qingdao, Peoples R China
Keywords
text classification; text representation; TF-IDF; Word2vec model
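DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract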
In recent years, with the rapid development of Internet technology, the volume of text data has grown rapidly. Users need to filter the information they want out of large amounts of text, and automatic text classification can help them find it. Traditional text classification techniques ignore contextual semantic links and differences in the importance of individual words. To address these problems, this paper represents feature words as vectors using the deep learning tool Word2vec and calculates the weight of each feature word with an improved TF-IDF algorithm. Multiplying each word vector by the weight of its word gives the weighted vector representation of that word. Finally, each text is represented by accumulating all of its weighted word vectors, and text classification is carried out on these representations.
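The weighting scheme described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not define the improved TF-IDF formula, so the standard form (term frequency times log(N / document frequency)) is swapped in, and a small hand-written dictionary of 2-dimensional vectors stands in for embeddings that would normally come from a trained Word2vec model. The names `tfidf_weights`, `text_vector`, `docs`, and `embeddings` are hypothetical, and summing once per distinct word (rather than once per occurrence) is also an assumption.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Standard TF-IDF (assumption: NOT the paper's improved variant).

    tf = count / document length, idf = log(N / document frequency).
    Returns one {word: weight} dict per document.
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each word.
    df = Counter(word for doc in docs for word in set(doc))
    all_weights = []
    for doc in docs:
        counts = Counter(doc)
        all_weights.append({
            word: (count / len(doc)) * math.log(n_docs / df[word])
            for word, count in counts.items()
        })
    return all_weights


def text_vector(weights, embeddings, dim):
    """Sum of (weight * word vector) over the distinct words of one text."""
    vec = [0.0] * dim
    for word, weight in weights.items():
        emb = embeddings.get(word)
        if emb is None:
            continue  # skip words that have no embedding
        for i in range(dim):
            vec[i] += weight * emb[i]
    return vec


# Toy tokenized corpus and toy embeddings (both hypothetical).
docs = [
    ["ocean", "fish", "fish"],
    ["stock", "market", "fish"],
]
embeddings = {
    "ocean": [1.0, 0.0],
    "fish": [0.5, 0.5],
    "stock": [0.0, 1.0],
    "market": [0.0, 1.0],
}

weights = tfidf_weights(docs)
vectors = [text_vector(w, embeddings, dim=2) for w in weights]
for doc, vec in zip(docs, vectors):
    print(doc, vec)
```

In this toy corpus "fish" appears in every document, so its IDF is zero and it contributes nothing to either text vector; the resulting vectors could then be passed to any standard classifier.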
Pages: 218-222
Number of pages: 5
Related papers
50 in total
  • [21] Emotion Analysis in Text using TF-IDF
    Sundaram, Varun
    Ahmed, Saad
    Muqtadeer, Shaik Abdul
    Reddy, R. Ravinder
    2021 11TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING, DATA SCIENCE & ENGINEERING (CONFLUENCE 2021), 2021, : 292 - 297
  • [22] Design and Research of Intelligent Work Order System Based on TF-IDF Algorithm
    Liu, Chen
    Guo, Qingji
    Zhu, Yin
    Liu, Jia
    PROCEEDINGS OF 2024 3RD INTERNATIONAL CONFERENCE ON CRYPTOGRAPHY, NETWORK SECURITY AND COMMUNICATION TECHNOLOGY, CNSCT 2024, 2024, : 351 - 355
  • [23] Research on Text Similarity Measurement Hybrid Algorithm with Term Semantic Information and TF-IDF Method
    Lan, Fei
    ADVANCES IN MULTIMEDIA, 2022, 2022
  • [24] Research on case reasoning method based on TF-IDF
    Lin Zhang
    International Journal of System Assurance Engineering and Management, 2021, 12 : 608 - 615
  • [25] Research on case reasoning method based on TF-IDF
    Zhang, Lin
    INTERNATIONAL JOURNAL OF SYSTEM ASSURANCE ENGINEERING AND MANAGEMENT, 2021, 12 (03) : 608 - 615
  • [26] Text Classification Using Novel Term Weighting Scheme-Based Improved TF-IDF for Internet Media Reports
    Jiang, Zhiying
    Gao, Bo
    He, Yanlin
    Han, Yongming
    Doyle, Paul
    Zhu, Qunxiong
    MATHEMATICAL PROBLEMS IN ENGINEERING, 2021, 2021
  • [27] Text classification algorithm of tourist attractions subcategories with modified TF-IDF and Word2Vec
    Xiao, Lu
    Li, Qiaoxing
    Ma, Qian
    Shen, Jiasheng
    Yang, Yong
    Li, Danyang
    PLOS ONE, 2024, 19 (10):
  • [28] A Chinese Short Text Classification Method Based on TF-IDF and Gradient Boosting Decision Tree
    Cheng, Yanming
    Yu, Zhigang
    Hu, Je
    Yang, Mingchuan
    2022 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, COMPUTER VISION AND MACHINE LEARNING (ICICML), 2022, : 164 - 168
  • [29] Student sentiment classification model based on GRU neural network and TF-IDF algorithm
    Yu, Hailong
    Ji, Yannan
    Li, Qinglin
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2021, 40 (02) : 2301 - 2311
  • [30] A Novel Text Mining Approach Based on TF-IDF and Support Vector Machine for News Classification
    Dadgar, Seyyed Mohammad Hossein
    Araghi, Mohammad Shirzad
    Farahani, Morteza Mastery
    PROCEEDINGS OF 2ND IEEE INTERNATIONAL CONFERENCE ON ENGINEERING & TECHNOLOGY ICETECH-2016, 2016, : 112 - 116