Structure-Preserving Hashing for Tree-Structured Data

Authors
Xu, Zhi [1]
Niu, Lushuai [1]
Ji, Jianqiu [1, 2]
Li, Qinlin [1]
Affiliations
[1] Guilin Univ Elect Technol, Sch Comp Informat & Secur, Guilin, Peoples R China
[2] Doodod Technol Co Ltd, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Hashing; Unordered trees; Signature; Subpath; Edit distance
DOI
10.1007/s11760-022-02166-7
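Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract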
Many kinds of data are tree-structured, e.g., XML documents. This paper proposes a structure-preserving hashing method for rooted unordered trees that compresses a tree into a compact signature while preserving its structural information, enabling efficient structural similarity estimation and search, duplicate detection, etc. The method exploits fixed-length subpaths of a tree. A theoretical analysis shows that, under moderate conditions, the signature contains enough information to reconstruct the original tree. With the signature, similarity between trees can be estimated efficiently. The method has linear construction time complexity, compared with the quadratic worst-case construction time complexity of the embedded pivot method [24]. A quantitative analysis of the relation to tree edit distance is also provided. Experiments on XML document de-duplication with real-world data demonstrate the effectiveness of the proposed method.
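The abstract does not spell out how the signature is built, so the following is only a rough sketch of the general idea, not the authors' construction. It assumes that a "subpath" is the label sequence along a downward path of exactly k nodes, and it swaps in plain min-wise hashing (MinHash) over the multiset of such subpaths as a stand-in signature. The tree encoding as (label, children) pairs and the names subpaths, signature, and estimate_similarity are all hypothetical.

```python
from collections import Counter
import hashlib

# Assumed encoding: a tree is a (label, children) pair, children is a list of trees.
# Child order never influences the result, so the trees are treated as unordered.

def subpaths(tree, k):
    """Count the label sequences of all downward paths with exactly k nodes."""
    counts = Counter()
    stack = [(tree, ())]  # (node, labels of at most k-1 nearest ancestors)
    while stack:
        (label, children), ancestors = stack.pop()
        path = ancestors + (label,)
        if len(path) == k:
            counts[path] += 1
        # Keep only the last k-1 labels for the children (none when k == 1).
        tail = path[-(k - 1):] if k > 1 else ()
        for child in children:
            stack.append((child, tail))
    return counts

def signature(tree, k, m=64):
    """MinHash stand-in: m minima over salted hashes of the subpath multiset."""
    # Repeated subpaths are made distinct by pairing them with an occurrence index.
    items = [repr((path, i)).encode()
             for path, count in subpaths(tree, k).items()
             for i in range(count)]
    sig = []
    for seed in range(m):
        salt = seed.to_bytes(8, "little")
        sig.append(min(
            (hashlib.blake2b(x, digest_size=8, salt=salt).digest() for x in items),
            default=b"",  # tree has no path with k nodes
        ))
    return sig

def estimate_similarity(sig_a, sig_b):
    """Fraction of agreeing positions, an estimate of the Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

# Two trees that differ only in the order of their children.
t1 = ("a", [("b", [("c", [])]), ("b", [("d", [])])])
t2 = ("a", [("b", [("d", [])]), ("b", [("c", [])])])
t3 = ("x", [("y", [("z", [])])])
print(estimate_similarity(signature(t1, 2), signature(t2, 2)))
print(estimate_similarity(signature(t1, 2), signature(t3, 2)))
```

Subpath extraction visits each node once and carries at most k-1 ancestor labels, so for fixed k it runs in time linear in the number of nodes, in the spirit of the linear construction time claimed in the abstract. The MinHash step here adds a factor of m and is purely illustrative.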
Pages: 2045-2053
Page count: 9
Source
Signal, Image and Video Processing, 2022, 16: 2045-2053