Structural and semantic aspects of similarity of Document Type Definitions and XML schemas

Cited by: 19
Authors
Wojnar, Ales [1]
Mlynkova, Irena [1]
Dokulil, Jiri [1]
Affiliations
[1] Charles Univ Prague, Dept Software Engn, Fac Math & Phys, CR-11800 Prague 1, Czech Republic
Keywords
XML schema; DTD; XSD; Similarity; Data semantics; Structural analysis; PERFORMANCE; METHODOLOGY; ALGORITHM
DOI
10.1016/j.ins.2009.12.024
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The natural optimization strategy for XML-to-relational mapping methods is exploitation of similarity of XML data. However, none of the current similarity evaluation approaches is suitable for this purpose. While the key emphasis is currently put on semantic similarity of XML data, the main aspect of XML-to-relational mapping methods is analysis of their structure. In this paper we propose an approach that utilizes a verified strategy for structural similarity evaluation - tree edit distance - to DTD constructs. This approach is able to cope with the fact that DTDs involve several types of nodes and can form general graphs. In addition, it is optimized for the specific features of XML data and, if required, it enables one to exploit the semantics of element/attribute names. Using a set of experiments we show the impact of these extensions on similarity evaluation. And, finally, we discuss how this approach can be extended for XSDs, which involve plenty of "syntactic sugar", i.e. constructs that are structurally or semantically equivalent. (C) 2010 Elsevier Inc. All rights reserved.
Pages: 1817-1836
Number of pages: 20