Mining Governmental Collaboration Through Semantic Profiling of Open Data Catalogues and Publishers

Cited by: 1
Authors
Rezk, Mohamed Adel [1]
Ojo, Adegboyega [1]
Hassan, Islam A. [1]
Affiliations
[1] National University of Ireland Galway, Insight Centre for Data Analytics, Galway, Ireland
Funding
European Union Horizon 2020
Keywords
Unstructured data analysis; Data mining; Collaborative network; Open data; E-government;
DOI
10.1007/978-3-319-65151-4_24
Chinese Library Classification (CLC) code
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Due to the increasing adoption of open data among governments worldwide, especially in the European Union, deeper analysis of newly published data is becoming necessary. Apart from analyzing the published datasets themselves, we aim to analyze published dataset catalogues. A dataset catalogue, or dataset metadata, contains features that describe in textual form what the data is about. We first acquire data from open data portals, choose descriptive dataset catalogue features, and construct an aggregated textual representation of each dataset. We then enrich those textual representations using Natural Language Processing (NLP) methods to create a new comparable data feature, "Named Entities". By mining this new feature we are able to produce dataset and publisher relatedness networks. These networks are used to point out similarities between the data published across multiple open data portals. Pointing out all possible collaborations for integrating and standardizing data features and types would increase the value of the data and ease its analysis.
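The pipeline summarized in the abstract (aggregate catalogue text, extract named entities, link datasets that share entities) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the record fields (`id`, `title`, `description`), the capitalized-phrase regex used as a stand-in for a real NLP named-entity recognizer, the Jaccard similarity measure, and the `threshold` value are all hypothetical choices made for the example.

```python
import re
from itertools import combinations

# Naive stand-in for an NER step: runs of capitalized words.
# A real pipeline would use a trained named-entity recognizer instead.
ENTITY_PATTERN = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*\b")


def aggregate_text(record, features=("title", "description")):
    """Join the chosen descriptive catalogue features into one text."""
    return ". ".join(record.get(name, "") for name in features)


def extract_entities(text):
    """Return the set of capitalized phrases found in the text."""
    return {match.group(0) for match in ENTITY_PATTERN.finditer(text)}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def relatedness_edges(records, threshold=0.2):
    """Build weighted edges between datasets whose entity sets overlap."""
    entities = {r["id"]: extract_entities(aggregate_text(r)) for r in records}
    edges = []
    for left, right in combinations(sorted(entities), 2):
        score = jaccard(entities[left], entities[right])
        if score >= threshold:
            edges.append((left, right, round(score, 3)))
    return edges


if __name__ == "__main__":
    # Hypothetical catalogue records, invented for illustration only.
    catalogue = [
        {"id": "a", "title": "traffic counts in Galway"},
        {"id": "b", "title": "bus stops in Galway"},
        {"id": "c", "title": "parks in Cork"},
    ]
    for edge in relatedness_edges(catalogue):
        print(edge)
```

A publisher-level network could be derived the same way by pooling the entity sets of all datasets belonging to each publisher before computing pairwise similarity; that grouping step is likewise an assumption about how the abstract's "publisher relatedness network" might be built.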
Pages: 253-264
Page count: 12