Metadata Systems for Data Lakes: Models and Features

Cited by: 18
Authors
Sawadogo, Pegdwende N. [1]
Scholly, Etienne [1, 2]
Favre, Cecile [1]
Ferey, Eric [2]
Loudcher, Sabine [1]
Darmont, Jerome [1]
Affiliations
[1] Univ Lyon, Lyon 2, ERIC EA 3083, Lyon, France
[2] BIAL X, Limonest, France
Keywords
Data lakes; Metadata modeling; Metadata management; Big data
DOI
10.1007/978-3-030-30278-8_43
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Over the past decade, the data lake concept has emerged as an alternative to data warehouses for storing and analyzing big data. A data lake stores data without any predefined schema, so data querying and analysis depend on a metadata system that must be efficient and comprehensive. However, metadata management in data lakes remains an open issue, and criteria for evaluating its effectiveness are largely nonexistent. In this paper, we introduce MEDAL, a generic, graph-based model for metadata management in data lakes. We also propose evaluation criteria for data lake metadata systems in the form of a list of expected features. Finally, we show that our approach is more comprehensive than existing metadata systems.
Pages: 440-451
Number of pages: 12