Corpus-based schema matching

Authors
Madhavan, J [1]
Bernstein, PA [1]
Doan, A [1]
Halevy, A [1]
Affiliation
[1] Univ Washington, Seattle, WA 98195 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Schema matching is the problem of identifying corresponding elements in different schemas. Discovering these correspondences, or matches, is inherently difficult to automate. Past solutions have proposed a principled combination of multiple algorithms. However, these solutions sometimes perform rather poorly because the schemas being matched do not supply sufficient evidence. In this paper, we show how a corpus of schemas and mappings can be used to augment the evidence about the schemas being matched, so that they can be matched better. Such a corpus typically contains multiple schemas that model similar concepts and hence enables us to learn variations in the elements and their properties. We exploit such a corpus in two ways. First, we increase the evidence about each element being matched by including evidence from similar elements in the corpus. Second, we learn statistics about elements and their relationships and use them to infer constraints that we use to prune candidate mappings. We also describe how to use known mappings to learn the importance of domain and generic constraints. We present experimental results that demonstrate that corpus-based matching outperforms direct matching (without the benefit of a corpus) in multiple domains.
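The first use of the corpus described in the abstract, adding evidence from similar corpus elements to each element being matched, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the dictionary layout of an element, the similarity threshold, and the use of cosine similarity over name tokens in place of the paper's combination of matchers are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def tokens(name):
    """Split an element name such as 'cust_phone_no' into lowercase tokens."""
    return [t for t in name.lower().replace("-", "_").split("_") if t]


def evidence(element):
    """Bag of tokens from an element's name and any sample values (hypothetical layout)."""
    bag = Counter(tokens(element["name"]))
    bag.update(str(v).lower() for v in element.get("values", []))
    return bag


def cosine(a, b):
    """Cosine similarity between two token-count bags; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def augment(element, corpus, threshold=0.3):
    """Add the evidence of sufficiently similar corpus elements to the element's own."""
    own = evidence(element)
    augmented = Counter(own)
    for other in corpus:
        other_bag = evidence(other)
        # The threshold is an assumed tuning knob, not a value from the paper.
        if cosine(own, other_bag) >= threshold:
            augmented.update(other_bag)
    return augmented


if __name__ == "__main__":
    source = {"name": "phone"}
    target = {"name": "tel"}
    corpus = [{"name": "phone_tel"}, {"name": "price"}]

    direct = cosine(evidence(source), evidence(target))
    via_corpus = cosine(augment(source, corpus), augment(target, corpus))
    print(f"direct: {direct:.2f}, corpus-based: {via_corpus:.2f}")
```

In this toy case the two elements share no tokens, so direct matching scores them as unrelated, while the corpus element `phone_tel` supplies the variation that links them and raises the score to roughly 0.8. The second use of the corpus, learning statistics to prune candidate mappings, is not covered by this sketch.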
Pages: 57-68
Page count: 12