Research of Text Classification Based on Improved TF-IDF Algorithm

Authors
Liu, Cai-zhi [1]
Sheng, Yan-xiu [1]
Wei, Zhi-qiang [1]
Yang, Yong-Quan [1]
Affiliations
[1] Ocean Univ China, Coll Informat Sci & Engn, Qingdao, Peoples R China
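Keywords
text classification; text representation; TF-IDF; Word2vec model
DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract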
In recent years, with the rapid development of Internet technology, the volume of text data has grown rapidly. Users must filter the information they need out of a large amount of text, and automatic text classification can help them find it. Traditional text classification techniques ignore the contextual semantic links between words and the differing importance of individual words. To address these problems, this paper represents feature words as vectors using the deep learning tool Word2vec and calculates the weight of each feature word with an improved TF-IDF algorithm. Each word's vector is multiplied by its weight to obtain a weighted word vector, and each text is then represented by summing the weighted vectors of all its words. Text classification is carried out on these text representations.
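The weighting-and-summing step described in the abstract might be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify the improved TF-IDF formula, so the standard TF-IDF definition is used in its place; the two-dimensional word vectors are made-up stand-ins for vectors a trained Word2vec model would supply; and the names `tf_idf_weights` and `text_vector`, as well as the choice to sum once per distinct word, are hypothetical.

```python
import math
from collections import Counter


def tf_idf_weights(docs):
    """Return one {word: weight} dict per document.

    Assumption: standard TF-IDF (tf = count / doc length,
    idf = log(N / df)), not the paper's improved variant.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each word once per document
    weights = []
    for tokens in docs:
        counts = Counter(tokens)
        weights.append({
            word: (count / len(tokens)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return weights


def text_vector(word_weights, vectors, dim):
    """Sum weight * word vector over the distinct words of one text."""
    total = [0.0] * dim
    for word, weight in word_weights.items():
        vec = vectors.get(word)
        if vec is None:  # skip words with no vector
            continue
        for i in range(dim):
            total[i] += weight * vec[i]
    return total


# Toy corpus and toy 2-d vectors (hypothetical, not Word2vec output).
docs = [["ocean", "wave", "ocean"], ["stock", "market", "wave"]]
vectors = {
    "ocean": [1.0, 0.0],
    "wave": [0.5, 0.5],
    "stock": [0.0, 1.0],
    "market": [0.0, 1.0],
}

doc_vecs = [text_vector(w, vectors, 2) for w in tf_idf_weights(docs)]
```

In this toy corpus "wave" occurs in both documents, so its standard IDF is zero and it contributes nothing to either text vector. The resulting `doc_vecs` could then be passed to any classifier; the abstract does not say which one the paper uses.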
Pages: 218-222
Number of pages: 5