Survey and Prospect: Data Integration Methodologies

Authors
Wang S. [1]
Peng Y.-W. [1]
Lan H. [1]
Luo Q.-W. [1]
Peng Z.-Y. [1]
Affiliation
[1] School of Computer Science, Wuhan University, Wuhan
Source
Ruan Jian Xue Bao/Journal of Software | 2020 / Vol. 31 / No. 03
Funding
National Key Research and Development Program of China
Keywords
Big data; Crowdsourcing; Data integration; Data management; Web table
DOI
10.13328/j.cnki.jos.005911
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Data integration plays a very important role in data management and analysis. Although decades have passed since the data integration problem was first proposed, many data integration problems remain unsolved. This study surveys work in the data integration area from 2001 to the present. By categorizing these papers and their methodologies, it summarizes how this work has developed and how its research topics have shifted over time. It also identifies several research topics that have recently drawn much attention, in the hope that the survey and its conclusions may provide guidance to researchers in related areas. © Copyright 2020, Institute of Software, the Chinese Academy of Sciences. All rights reserved.
Pages: 893-908
Page count: 15