Accelerating Duplicate Data Chunk Recognition Using NN Trained by Locality-Sensitive Hash

Authors
Berman, Amit [1]
Birk, Yitzhak [1]
Mendelson, Avi [1]
Affiliations
[1] Technion Israel Inst Technol, Dept Elect Engn, IL-32000 Haifa, Israel
Keywords
Deduplication; Chunking; Cloud Storage; Neural Network; Machine Learning; Locality-Sensitive Hashing
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
Deduplication is often used in storage systems in order to save storage space, communication bandwidth, write energy, and recovery and error-protection infrastructure. However, deduplication overhead increases latency and computation energy. Determining whether a data chunk is already stored by comparing signatures constitutes a significant fraction of this deduplication overhead. In this paper, we propose a statistical chunk classifier based on a neural network. Our technique is based on learning the patterns of locality-sensitive hashing of the data. Our experiments show an acceleration of chunk processing, leading to reduction in deduplication overhead.
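For context, the signature comparison step that the abstract identifies as a major source of overhead can be sketched as follows. This is a minimal illustration of conventional exact-match deduplication, not the authors' neural-network classifier. The fixed 4096-byte chunk size, the use of SHA-256 as the signature, and the function names `chunk_fixed`, `deduplicate`, and `restore` are all assumptions made for this example.

```python
import hashlib


def chunk_fixed(data: bytes, size: int = 4096):
    """Split data into fixed-size chunks (assumed chunking policy)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def deduplicate(data: bytes, size: int = 4096):
    """Store each distinct chunk once, keyed by its signature.

    Returns (store, recipe): store maps signature -> chunk bytes,
    and recipe is the ordered list of signatures needed to rebuild data.
    """
    store = {}
    recipe = []
    for chunk in chunk_fixed(data, size):
        # Computing the signature and looking it up is the per-chunk
        # cost that a fast statistical pre-filter would aim to reduce.
        sig = hashlib.sha256(chunk).hexdigest()
        if sig not in store:
            store[sig] = chunk
        recipe.append(sig)
    return store, recipe


def restore(store, recipe):
    """Rebuild the original data from the chunk store and the recipe."""
    return b"".join(store[sig] for sig in recipe)
```

In this sketch every chunk pays for a full cryptographic hash and an index lookup, whether or not it turns out to be a duplicate. A classifier of the kind the abstract proposes would sit in front of this path.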
Pages: 5