An Improved Text Similarity Calculation Algorithm Based On VSM

Cited by: 3
Authors
Li, Lian
Zhu, AiHong
Su, Tao
Keywords
Vector Space Model; text similarity; Cosine; coverage degree
DOI
10.4028/www.scientific.net/AMR.225-226.1105
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Text similarity calculation is a key technology in text clustering, intelligent Web retrieval, natural language processing, and related fields. Because the traditional text similarity calculation algorithm does not consider the effect of the feature words that two texts share, it sometimes produces inaccurate results. To solve this problem, this paper presents an improved text similarity calculation algorithm. Since the number of shared feature words reflects, to some extent, how similar two texts are, the improved algorithm adds a coverage parameter, which effectively reduces interference from texts with lower similarity. Simulation and experimental results verify the improved algorithm's correctness and effectiveness.
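The abstract does not give the paper's formula, so the following is only a minimal sketch of the general idea: a VSM cosine score scaled by a coverage term based on shared feature words. The coverage definition used here (shared feature words divided by all distinct feature words in both texts) and the multiplicative combination are assumptions made for illustration, not the authors' method, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two term-weight vectors stored as dicts."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def coverage(a, b):
    """ASSUMED coverage: shared feature words / all distinct feature words."""
    terms_a, terms_b = set(a), set(b)
    union = terms_a | terms_b
    if not union:
        return 0.0
    return len(terms_a & terms_b) / len(union)


def coverage_weighted_similarity(a, b):
    """ASSUMED combination: cosine score multiplied by the coverage term."""
    return cosine_similarity(a, b) * coverage(a, b)


if __name__ == "__main__":
    doc1 = {"text": 2, "similarity": 1, "vector": 1}
    doc2 = {"text": 1, "similarity": 1, "cluster": 3}
    print(cosine_similarity(doc1, doc2))
    print(coverage_weighted_similarity(doc1, doc2))
```

Under these assumptions, two texts that share only a few feature words have their cosine score scaled down, which matches the abstract's stated goal of reducing interference from texts with lower similarity.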
Pages: 1105-1108
Number of pages: 4
Related papers
  • [1] An improved text categorization algorithm based on VSM
    Geng, Ji
    Lu, Yunling
    Chen, Wei
    Qin, Zhiguang
    [J]. 2014 IEEE 17TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND ENGINEERING (CSE), 2014, : 1701 - 1706
  • [2] An Improved News Recommendation Algorithm Based on Text Similarity
    Gao, Yihang
    Zhao, Hui
    Zhou, Qian
    Qiu, Meikang
    Liu, Meiqin
    [J]. 2020 3RD INTERNATIONAL CONFERENCE ON SMART BLOCKCHAIN (SMARTBLOCK), 2020, : 132 - 136
  • [3] Collaborative Filtering Algorithm Based on Improved Similarity Calculation
    Yang Hongmei
    [J]. INFORMATION COMPUTING AND APPLICATIONS, PT I, 2011, 243 : 271 - 276
  • [4] Collaborative Filtering Algorithm Based on Improved Similarity Calculation
    Wang, Zhihe
    Shi, Suping
    Du, Hui
    Wang, Shuyan
    [J]. 2019 15TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND SECURITY (CIS 2019), 2019, : 156 - 160
  • [5] A Book Recommendation Algorithm Based on Improved Similarity Calculation
    Li, Yue
    [J]. 2018 3RD INTERNATIONAL CONFERENCE ON MECHANICAL, CONTROL AND COMPUTER ENGINEERING (ICMCCE), 2018, : 615 - 618
  • [6] A Chinese Text Similarity Calculation Algorithm Based on DF_LDA
    Zhang, Chao
    Chen, Li
    Li, Qiong
    [J]. PROCEEDINGS OF THE 6TH INTERNATIONAL ASIA CONFERENCE ON INDUSTRIAL ENGINEERING AND MANAGEMENT INNOVATION: CORE THEORY AND APPLICATIONS OF INDUSTRIAL ENGINEERING, VOL 1, 2016, : 627 - 634
  • [7] An Improved Text Retrieval Algorithm Based on Suffix Tree Similarity Measure
    Huang, Cheng-hui
    Yin, Jian
    Han, Dong
    [J]. INFORMATION COMPUTING AND APPLICATIONS, PT 2, 2010, 106 : 150 - +
  • [8] Improved VSM for Incremental Text Classification
    Yang, Zhen
    Lei, Jianjun
    Wang, Jian
    Zhang, Xing
    Guo, Jun
    [J]. INTERNATIONAL ELECTRONIC CONFERENCE ON COMPUTER SCIENCE, 2008, 1060 : 369 - +
  • [9] Mapping Texts Into Graphs: An Improved Text Similarity Algorithm
    Liu, Zuoguo
    Chen, Xiaorong
    [J]. PROCEEDINGS OF 2012 2ND INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND NETWORK TECHNOLOGY (ICCSNT 2012), 2012, : 1357 - 1361
  • [10] Using boosting mechanism to refine the threshold of VSM-based similarity in text classification
    Diao, LL
    Hu, KY
    Lu, YC
    Shi, CY
    [J]. PROCEEDINGS OF THE 4TH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION, VOLS 1-4, 2002, : 2284 - 2287