Structural similarity between XML documents and DTDs

Cited by: 0
Authors
Ng, PKL
Ng, VTY
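Institution
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract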
The use of XML documents on the Internet continues to grow. This creates a need to analyze XML documents from heterogeneous sources, where documents may conform to different DTDs. In this paper, we propose a measure of the structural similarity between XML documents and DTDs that is natural to understand and fast to calculate. The measure is defined as a weighted sum of the local measures of document elements, with a weighting scheme based on their subtree sizes. The local measure of an element is defined as its edit distance against its declaration in the DTD, viewed as a regular expression. Based on this definition, we propose an algorithm for calculating the edit distance between a string and a regular expression, adapted from the algorithm used for the regular expression matching problem. The advantages of the measure are its natural definition and linear complexity.
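The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not give the details, so everything here is an assumption made for illustration. Content models are written as nested tuples (`'sym'`, `'seq'`, `'alt'`, `'star'`, `'opt'`), the edit distance is found by a generic shortest-path search over a Thompson-style NFA (not the linear-time method the paper claims), and the weighted sum is normalized by the total weight. The names `build_nfa`, `regex_edit_distance`, `subtree_size`, and `weighted_distance` are hypothetical.

```python
import heapq
from itertools import count

def build_nfa(regex):
    """Thompson-style NFA for a nested-tuple regex (assumed encoding)."""
    ids = count()
    eps, sym = {}, {}          # eps[q]: epsilon targets; sym[q]: (label, target)

    def new_state():
        q = next(ids)
        eps[q], sym[q] = [], []
        return q

    def build(r):
        kind = r[0]
        s, t = new_state(), new_state()
        if kind == 'sym':
            sym[s].append((r[1], t))
        elif kind == 'seq':            # ('seq',) alone is the empty content model
            prev = s
            for part in r[1:]:
                ps, pt = build(part)
                eps[prev].append(ps)
                prev = pt
            eps[prev].append(t)
        elif kind == 'alt':
            for part in r[1:]:
                ps, pt = build(part)
                eps[s].append(ps)
                eps[pt].append(t)
        elif kind in ('star', 'opt'):
            ps, pt = build(r[1])
            eps[s] += [ps, t]          # enter the body or skip it
            eps[pt].append(t)
            if kind == 'star':
                eps[pt].append(ps)     # loop back for another repetition
        else:
            raise ValueError(kind)
        return s, t

    start, accept = build(regex)
    return start, accept, eps, sym

def regex_edit_distance(children, regex):
    """Minimum unit-cost insert/delete/substitute edits so children match regex."""
    start, accept, eps, sym = build_nfa(regex)
    n = len(children)
    best = {(0, start): 0}
    heap = [(0, 0, start)]             # (cost, position in children, NFA state)
    while heap:
        d, i, q = heapq.heappop(heap)
        if d > best[(i, q)]:
            continue
        if i == n and q == accept:
            return d
        moves = [(0, i, t) for t in eps[q]]
        for label, t in sym[q]:
            moves.append((1, i, t))    # insert a symbol the document lacks
            if i < n:                  # match (cost 0) or substitute (cost 1)
                moves.append((0 if children[i] == label else 1, i + 1, t))
        if i < n:
            moves.append((1, i + 1, q))  # delete an extra child
        for cost, j, t in moves:
            if d + cost < best.get((j, t), float('inf')):
                best[(j, t)] = d + cost
                heapq.heappush(heap, (d + cost, j, t))
    raise ValueError('accept state unreachable')

def subtree_size(node):
    tag, kids = node
    return 1 + sum(subtree_size(k) for k in kids)

def weighted_distance(doc, dtd):
    """Subtree-size-weighted average of local edit distances (assumed normalization)."""
    total = weight = 0.0
    stack = [doc]
    while stack:
        node = stack.pop()
        tag, kids = node
        w = subtree_size(node)
        total += w * regex_edit_distance([k[0] for k in kids], dtd[tag])
        weight += w
        stack.extend(kids)
    return total / weight

# Hypothetical DTD: <!ELEMENT book (title, author*, year?)>, leaves have no children.
dtd = {
    'book': ('seq', ('sym', 'title'), ('star', ('sym', 'author')), ('opt', ('sym', 'year'))),
    'title': ('seq',), 'author': ('seq',), 'year': ('seq',),
}
valid = ('book', [('title', []), ('author', []), ('author', [])])
invalid = ('book', [('author', []), ('year', []), ('year', [])])
print(weighted_distance(valid, dtd), weighted_distance(invalid, dtd))
```

A result of 0 means every element matches its declaration; larger values indicate more structural deviation, with mismatches near the root weighted more heavily because those elements have larger subtrees. Turning this distance into a similarity score is a further design choice that the abstract does not specify.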
Pages: 412-421
Page count: 10
Related papers
50 in total
  • [1] Structural similarity evaluation between XML documents and DTDs
    Tekli, Joe
    Chbeir, Richard
    Yetongnon, Kokou
    [J]. WEB INFORMATION SYSTEMS ENGINEERING - WISE 2007, PROCEEDINGS, 2007, 4831 : 196 - 211
  • [2] Measuring the structural similarity among XML documents and DTDs
    Bertino, Elisa
    Guerrini, Giovanna
    Mesiti, Marco
    [J]. JOURNAL OF INTELLIGENT INFORMATION SYSTEMS, 2008, 30 (01) : 55 - 92
  • [3] Measuring the structural similarity among XML documents and DTDs
    Elisa Bertino
    Giovanna Guerrini
    Marco Mesiti
    [J]. Journal of Intelligent Information Systems, 2008, 30 : 55 - 92
  • [4] Multivalued Dependencies for XML Documents with DTDs
    Song, Jinling
    Zhao, Wei
    Zhang, Xiubo
    Liu, Guohua
    [J]. WISM: 2009 INTERNATIONAL CONFERENCE ON WEB INFORMATION SYSTEMS AND MINING, PROCEEDINGS, 2009, : 284 - +
  • [5] A kernel method for measuring structural similarity between XML documents
    Jeong, Buhwan
    Lee, Daewon
    Cho, Hyunbo
    Kulvatunyou, Boonserm
    [J]. NEW TRENDS IN APPLIED ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2007, 4570 : 572 - +
  • [6] Functional Dependencies over XML Documents with DTDs
    Hartmann, Sven
    Link, Sebastian
    Schewe, Klaus-Dieter
    [J]. ACTA CYBERNETICA, 2005, 17 (01): : 153 - 171
  • [7] Computing similarity between XML documents for XML mining
    Lee, JW
    Park, SS
    [J]. ENGINEERING KNOWLEDGE IN THE AGE OF THE SEMANTIC WEB, PROCEEDINGS, 2004, 3257 : 492 - 493
  • [8] Using structural similarity for clustering XML documents
    Aitelhadj, Ali
    Boughanem, Mohand
    Mezghiche, Mohamed
    Souam, Fatiha
    [J]. KNOWLEDGE AND INFORMATION SYSTEMS, 2012, 32 (01) : 109 - 139
  • [9] Using structural similarity for clustering XML documents
    Ali Aïtelhadj
    Mohand Boughanem
    Mohamed Mezghiche
    Fatiha Souam
    [J]. Knowledge and Information Systems, 2012, 32 : 109 - 139
  • [10] Semantic Structural Similarity for Clustering XML Documents
    Kim, Tae-Soon
    Lee, Ju-Hong
    Song, Jae-Won
    [J]. ICHIT 2008: INTERNATIONAL CONFERENCE ON CONVERGENCE AND HYBRID INFORMATION TECHNOLOGY, PROCEEDINGS, 2008, : 552 - 557