Research on calculation method of text similarity based on smooth inverse frequency

Authors
Yuan Ye [1]
Yu Minmin [1]
Liu Jiming [1]
Affiliation
[1] Key Laboratory of E-commerce and Modern Logistics, Chongqing University of Posts and Telecommunications, Chongqing
Keywords
Part-of-speech; SIF; Word order similarity; Word2vec
DOI
10.19682/j.cnki.1005-8885.2020.1007
Abstract
To improve the accuracy of text similarity calculation, this paper presents a sentence-vector-based text similarity function, part of speech and word order-smooth inverse frequency (PO-SIF), which optimizes the classical SIF calculation method in two respects: part of speech and word order. The classical SIF algorithm calculates sentence similarity from a sentence vector obtained through weighting and noise reduction. However, different weighting or noise-reduction methods affect both the efficiency and the accuracy of the similarity calculation. In the proposed PO-SIF, the weight parameters of the SIF sentence vector are first updated by the part-of-speech subtraction factor to determine the most crucial words. Furthermore, PO-SIF takes word order into account when calculating sentence vector similarity, which overcomes the drawback of similarity analysis that is based mostly on word frequency. The experimental results validate that the proposed PO-SIF improves the accuracy of text similarity calculation. © 2020, Beijing University of Posts and Telecommunications. All rights reserved.
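For orientation, the classical SIF baseline that the abstract refers to (weighting, then noise reduction) can be sketched as follows. This is a minimal illustration of standard SIF only, not the PO-SIF method, whose part-of-speech and word-order formulas are not given in the abstract. The function names, the toy corpus, the random vectors standing in for Word2vec embeddings, and the smoothing value `a = 1e-3` are all assumptions made for the example.

```python
from collections import Counter

import numpy as np


def sif_embeddings(sentences, word_vecs, word_prob, a=1e-3):
    """Classical SIF sentence vectors (illustrative sketch).

    Step 1 (weighting): average word vectors with weights a / (a + p(w)).
    Step 2 (noise reduction): remove the projection onto the first
    principal direction of the sentence-vector matrix.
    Assumes every sentence has at least one token present in word_vecs.
    """
    rows = []
    for tokens in sentences:
        known = [t for t in tokens if t in word_vecs]
        weights = np.array([a / (a + word_prob[t]) for t in known])
        vecs = np.array([word_vecs[t] for t in known])
        rows.append(weights @ vecs / len(known))
    X = np.vstack(rows)
    # First right singular vector = dominant common direction.
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    u = vt[0]
    return X - np.outer(X @ u, u)


def cosine(x, y):
    """Cosine similarity between two vectors."""
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))


# Toy data (hypothetical): random vectors stand in for Word2vec embeddings.
sentences = [
    ["the", "cat", "sat"],
    ["the", "dog", "sat"],
    ["stocks", "fell", "today"],
]
counts = Counter(t for s in sentences for t in s)
total = sum(counts.values())
word_prob = {w: c / total for w, c in counts.items()}
rng = np.random.default_rng(0)
word_vecs = {w: rng.normal(size=8) for w in sorted(counts)}

emb = sif_embeddings(sentences, word_vecs, word_prob)
sim_01 = cosine(emb[0], emb[1])
sim_02 = cosine(emb[0], emb[2])
print(sim_01, sim_02)
```

Frequent words receive small weights because `a / (a + p(w))` shrinks as `p(w)` grows. According to the abstract, PO-SIF further adjusts these weights by part of speech and adds a word-order term to the similarity.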
Pages: 56-64
Page count: 8
Related papers
  • [1] Research on calculation method of text similarity based on smooth inverse frequency
    Yuan Ye
    Yu Minmin
    Liu Jiming
    The Journal of China Universities of Posts and Telecommunications, 2020, 27 (02): 56 - 64
  • [2] Text Similarity Calculation Method based on Ontology Model
    Chi, Tao
    Wang, Hanshi
    Liu, Lizhen
    Song, Wei
    Du, Chao
    2014 INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND INTERNET OF THINGS (CCIOT), 2014: 213 - 217
  • [3] A SHORT TEXT SIMILARITY CALCULATION METHOD BASED ON DEEP LEARNING
    Xu, Yong
    Peng, Yunke
    Wang, Hengna
    Wang, Xue'er
    UNIVERSITY POLITEHNICA OF BUCHAREST SCIENTIFIC BULLETIN SERIES C-ELECTRICAL ENGINEERING AND COMPUTER SCIENCE, 2024, 86 (01): 91 - 104
  • [4] A SHORT TEXT SIMILARITY CALCULATION METHOD BASED ON DEEP LEARNING
    Xu, Yong
    Peng, Yunke
    Wang, Hengna
    Wang, Xue’Er
    UPB Scientific Bulletin, Series C: Electrical Engineering and Computer Science, 2024, 86 (01): 91 - 104
  • [5] Research on Cross-language Text Similarity Calculation
    Yuan, Sun
    Qian, Zhao
    PROCEEDINGS OF 2015 IEEE 5TH INTERNATIONAL CONFERENCE ON ELECTRONICS INFORMATION AND EMERGENCY COMMUNICATION, 2015: 423 - 426
  • [6] Research on the Calculation Method of Semantic Similarity Based on Concept Hierarchy
    Wang, Kai
    MACHINE TRANSLATION, 2016, 668 : 113 - 124
  • [7] Similarity Calculation Method of Chinese Short Text Based on Semantic Feature Space
    Pan, Liqiang
    Zhang, Pu
    Xiong, Anping
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2015, 6 (02) : 306 - 310
  • [8] Research on Chinese Sentence Similarity Calculation Method Based on Multiple Features
    Wu, Chenyang
    Wang, Jinbo
    Wang, Xiaohua
    Ma, Yunyun
    PROCEEDINGS OF 2018 IEEE 3RD ADVANCED INFORMATION TECHNOLOGY, ELECTRONIC AND AUTOMATION CONTROL CONFERENCE (IAEAC 2018), 2018: 1160 - 1165
  • [9] An Improved Text Similarity Calculation Algorithm Based On VSM
    Li, Lian
    Zhu, AiHong
    Su, Tao
    ADVANCED RESEARCH ON AUTOMATION, COMMUNICATION, ARCHITECTONICS AND MATERIALS, PTS 1 AND 2, 2011, 225-226 (1-2): 1105 - 1108
  • [10] Research on Similarity Calculation Method in Service Composition
    Zhang Jing
    Liu Yanxia
    Xin Jiaoli
    2009 INTERNATIONAL FORUM ON INFORMATION TECHNOLOGY AND APPLICATIONS, VOL 2, PROCEEDINGS, 2009: 291 - 293