Automated curation of spatial metadata in environmental monitoring data

Authors
Mutlu, Ilhan [1]
Hackermueller, Joerg [1,2]
Schor, Jana [1,2]
Affiliations
[1] UFZ Helmholtz Ctr Environm Res, Dept Computat Biol & Chem, D-04318 Leipzig, Germany
[2] Univ Leipzig, Fac Math & Comp Sci, Dept Comp Sci, D-04109 Leipzig, Germany
Keywords
Environmental monitoring; Spatial data accuracy; Automated data curation; Big data analytics; AI applications in hydrology;
DOI
10.1016/j.ecoinf.2025.103038
Chinese Library Classification number
Q14 [Ecology (Bioecology)]
Discipline classification code
071012; 0713
Abstract
Spatial data accuracy in environmental monitoring is crucial for practical large-scale data analytics and the development of AI models. In this context, spatial data is metadata and faces the same challenges as any other metadata, like missing values, false or contradicting information, formatting problems of textual data and numbers, the usage of different languages, and more. These issues severely limit the usability of the data. With this study, we provide an automatic approach, CleanGeoStreamR, to resolve as many of these issues as possible for the spatially annotated environmental monitoring database. We substantially increased the quality of the spatial metadata and, therefore, the quantity of data points that can be used in large-scale data analytics and AI applications. Further, our goal is to raise awareness about the issues related to spatial metadata and promote the implementation of our concepts in other environmental monitoring data sources. Advanced understanding and the availability of automatic approaches like the presented method will substantially contribute to making environmental monitoring data FAIR and enhance its usability in the era of Big Data and AI.
Pages: 7