Data lake governance using IBM-Watson knowledge catalog

Cited by: 0
Authors
Cherradi, Mohamed [1]
Bouhafer, Fadwa [1]
EL Haddadi, Anass [1]
Affiliations
[1] Abdelmalek Essaadi Univ UAE Tetouan, Data Sci & Competet Intelligence Team DSCI, ENSAH, Tetouan, Morocco
Keywords
Data lake; Big data; Data catalog; Information retrieval; FAIR principles; IBM-WKC
DOI
10.1016/j.sciaf.2023.e01854
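Chinese Library Classification (CLC)
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Discipline classification codes
07; 0710; 09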
Abstract
The strategic importance of data in decision-making is increasingly recognized, demanding efficient solutions such as data catalogs to ensure data governance and emphasize data interoperability, in accordance with the FAIR (Findable, Accessible, Interoperable, and Reusable) principles. However, because FAIR-compliant data catalogs are new, empirical studies of their use are lacking. This study aims to promote the practical adoption of data catalogs as a means to manage the expanding data landscape. We differentiate our contribution by providing an empirical evaluation and comparison of IBM Watson Knowledge Catalog (IBM-WKC), a leading data cataloging solution, with two other prominent alternatives, Open-Metadata and Data-Galaxy, for extracting relevant information from data lakes containing heterogeneous data sources in their native formats. Our proposed methodology utilizes an innovative tool built on IBM-WKC for annotating collected documents. To evaluate our approach, we conducted experiments on a dataset of 100 documents sourced from scientific databases. Moreover, to assess our proposal, we compare the retrieved text to the appropriate interventions that use the original checklist. The results demonstrate the superiority of IBM-WKC over its competitors, showcasing its enhanced performance in addressing data cataloging challenges. Notably, the tested queries achieved accuracy, precision, and recall values of 96%. These findings highlight the reliability and alignment of IBM-WKC with the FAIR principles.
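For readers unfamiliar with the three metrics reported in the abstract, the following is a minimal sketch of how accuracy, precision, and recall are conventionally computed from retrieval outcomes. The `Counts` class, the function names, and the example counts are hypothetical and chosen only for illustration; they are not taken from the study, whose per-query outcomes are not given here.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Outcome counts for a retrieval evaluation (hypothetical container)."""
    tp: int  # relevant items that were retrieved
    fp: int  # irrelevant items that were retrieved
    fn: int  # relevant items that were missed
    tn: int  # irrelevant items that were correctly left out


def accuracy(c: Counts) -> float:
    """Share of all decisions (retrieve or skip) that were correct."""
    total = c.tp + c.fp + c.fn + c.tn
    return (c.tp + c.tn) / total if total else 0.0


def precision(c: Counts) -> float:
    """Share of retrieved items that were relevant."""
    retrieved = c.tp + c.fp
    return c.tp / retrieved if retrieved else 0.0


def recall(c: Counts) -> float:
    """Share of relevant items that were retrieved."""
    relevant = c.tp + c.fn
    return c.tp / relevant if relevant else 0.0


# Illustrative counts only (NOT data from the paper).
example = Counts(tp=48, fp=2, fn=2, tn=48)
print(f"accuracy={accuracy(example):.2f}")
print(f"precision={precision(example):.2f}")
print(f"recall={recall(example):.2f}")
```

The three metrics coincide in this made-up example only because the false positives and false negatives are equal and the classes are balanced; in general they differ.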
Pages: 13