Determining Cross-Polytope Locality-Sensitive Hashing Parameters for Code Clone Detection.

Authors
Tokui S. [1]
Inoue K. [1]
Yoshida N. [2]
Choi E. [3]
Affiliations
[1] Osaka University, Japan
[2] Nagoya University, Japan
[3] Kyoto Institute of Technology, Japan
DOI
10.11309/jssst.38.4_60
Abstract
A code clone is a code fragment that is identical or similar to another code fragment in the source code. The code clone detector CCVolti was developed using Cross-Polytope Locality-Sensitive Hashing (LSH). CCVolti can detect not only syntactic clones but also semantic clones, which are difficult to detect. However, CCVolti has two problems: (1) its detection time depends on Cross-Polytope LSH, and (2) it misses several code clones. In this study, we propose an approach that determines Cross-Polytope LSH parameters so as to reach a target recall value given by the user while saving as much time as possible. The approach builds a linear regression model that learns suitable parameters from the size of target projects, and then uses the model to determine appropriate Cross-Polytope LSH parameters for a given code clone detection target. Finally, we apply this approach with CCVolti to 20 open source software projects and confirm its effectiveness. © 2021 Japan Society for Software Science and Technology. All rights reserved.
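The regression step described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not say which Cross-Polytope LSH parameters are tuned, how project size is measured, or how the training pairs are obtained. The sketch uses a single hypothetical parameter (a number of hash tables), a hypothetical size measure (count of code blocks), and made-up training pairs; `fit_line` and `predict_param` are names invented for this example and are not part of CCVolti.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_param(model, size, minimum=1):
    """Predict an integer parameter value for a project of the given size."""
    slope, intercept = model
    # Round to the nearest integer and never go below the minimum.
    return max(minimum, round(slope * size + intercept))


# Hypothetical training pairs (NOT data from the paper):
# project size in code blocks -> smallest number of hash tables
# that reached the target recall on that project.
sizes = [1000, 2000, 4000, 8000]
tables = [3, 5, 9, 17]

model = fit_line(sizes, tables)
num_tables = predict_param(model, 5000)
print("slope, intercept:", model)
print("predicted number of hash tables for 5000 blocks:", num_tables)
```

In such a setup, one model would presumably be fitted per target recall value, so that a user-supplied recall selects which fitted line is used to predict the parameter for a new project.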
Pages: 60-82
Number of pages: 22