String Similarity Computing Based on Position And Cosine

Cited by: 0
Authors
Cheng, Na [1]
Yu, Zhongqing [1, 2]
Wang, Kaixi [1, 2]
Affiliations
[1] Qingdao Univ, Coll Comp Sci & Technol, Qingdao, Shandong, Peoples R China
[2] Qingdao Univ, Coll Data Sci & Software Engn, Qingdao, Shandong, Peoples R China
Keywords
angle cosine; position encoding; approximately duplicate records; data cleaning; products select
DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
E-business platforms need product selection functionality based on product features and cost performance, and data must also be cleaned during production and sales, so it is important to calculate the similarity between products. This paper proposes a new way to compute string similarity: segment each string into words, number the corresponding positions, and vectorize the string. The similarity between two strings is then computed as the cosine of the angle between their two vectors. Experiments show that the method avoids the maximum or minimum similarity values that LCS and GST can produce. In addition, the proposed method improves the accuracy of the similarity calculation.
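The abstract does not give the exact position encoding, so the following is only a minimal sketch of the general idea under stated assumptions: words are split on whitespace, each word's vector component is its 1-based first position in the string (0 if the word is absent), and both strings are vectorized over their shared vocabulary. The function names `position_vector` and `position_cosine_similarity` are hypothetical and are not taken from the paper.

```python
import math


def position_vector(tokens, vocabulary):
    """Assumed encoding: 1-based first position of each vocabulary word, 0 if absent."""
    first_pos = {}
    for index, token in enumerate(tokens, start=1):
        first_pos.setdefault(token, index)  # keep only the first occurrence
    return [first_pos.get(word, 0) for word in vocabulary]


def position_cosine_similarity(a, b):
    """Cosine of the angle between the position vectors of two strings."""
    # Assumption: lowercase whitespace tokenization stands in for word segmentation.
    tokens_a = a.lower().split()
    tokens_b = b.lower().split()
    vocabulary = sorted(set(tokens_a) | set(tokens_b))

    vec_a = position_vector(tokens_a, vocabulary)
    vec_b = position_vector(tokens_b, vocabulary)

    dot = sum(x * y for x, y in zip(vec_a, vec_b))
    norm_a = math.sqrt(sum(x * x for x in vec_a))
    norm_b = math.sqrt(sum(y * y for y in vec_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty string has no direction, so treat it as dissimilar
    return dot / (norm_a * norm_b)
```

Under these assumptions, identical strings score 1, strings with no words in common score 0, and strings that share words in a different order fall in between, so word order affects the result in a way that a plain bag-of-words cosine would not capture.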
Pages: 256-261
Number of pages: 6