Towards a Semantic Extract-Transform-Load (ETL) framework for Big Data Integration

Cited by: 53
Authors
Bansal, Srividya K. [1]
Affiliation
[1] Arizona State Univ, Dept Engn & Comp Syst, Mesa, AZ 85212 USA
Keywords
Big data; Data integration; Ontology; Semantic technologies; Design;
DOI
10.1109/BigData.Congress.2014.82
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Big Data has become the new ubiquitous term used to describe massive collections of datasets that are difficult to process using traditional database and software techniques. Most of this data is inaccessible to users, because technology and tools are needed to find, transform, analyze, and visualize the data before it becomes consumable for decision-making. One aspect of Big Data research deals with the Variety of data, which spans formats such as structured, numeric, and unstructured text data, email, video, audio, and stock ticker data. Managing, merging, and governing this variety of data is the focus of this paper. This paper proposes a semantic Extract-Transform-Load (ETL) framework that uses semantic technologies to integrate and publish data from multiple sources as open linked data. This includes: creation of a semantic data model to provide a basis for integration and understanding of knowledge from multiple sources; creation of a distributed Web of data using Resource Description Framework (RDF) as the graph data model; and extraction of useful knowledge and information from the combined data using SPARQL as the semantic query language.
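As a rough illustration of the three steps named in the abstract, the following minimal Python sketch maps two toy sources onto a shared vocabulary, stores the result as subject-predicate-object triples, and answers a join query over them. It is not the paper's implementation: the `http://example.org/` vocabulary, the sample rows, and every function name are hypothetical, and the in-memory set of tuples and the `match` helper only stand in for a real RDF store and SPARQL engine.

```python
from typing import Iterator, Optional

Triple = tuple[str, str, str]

# Hypothetical namespace standing in for a shared semantic data model.
EX = "http://example.org/"


def extract() -> tuple[list[dict], list[dict]]:
    """Extract: two toy sources with different schemas (made-up data)."""
    source_a = [{"id": "s1", "name": "Sensor 1", "city": "Mesa"}]
    source_b = [{"sensor": "s1", "reading": "42.0"}]
    return source_a, source_b


def transform(source_a: list[dict], source_b: list[dict]) -> set[Triple]:
    """Transform: map both schemas onto one vocabulary as triples."""
    triples: set[Triple] = set()
    for row in source_a:
        subject = EX + "sensor/" + row["id"]
        triples.add((subject, EX + "name", row["name"]))
        triples.add((subject, EX + "locatedIn", row["city"]))
    for row in source_b:
        # The same subject identifier links records across the two sources.
        subject = EX + "sensor/" + row["sensor"]
        triples.add((subject, EX + "reading", row["reading"]))
    return triples


def match(graph: set[Triple], s: Optional[str] = None,
          p: Optional[str] = None, o: Optional[str] = None) -> Iterator[Triple]:
    """Yield triples matching a pattern; None acts as a wildcard."""
    for triple in graph:
        if ((s is None or triple[0] == s)
                and (p is None or triple[1] == p)
                and (o is None or triple[2] == o)):
            yield triple


def readings_by_city(graph: set[Triple], city: str) -> list[tuple[str, str]]:
    """Join two triple patterns, loosely like a two-pattern graph query."""
    results = []
    for subject, _, _ in match(graph, p=EX + "locatedIn", o=city):
        for _, _, value in match(graph, s=subject, p=EX + "reading"):
            results.append((subject, value))
    return sorted(results)


# Load: here the "store" is simply the in-memory set of triples.
graph = transform(*extract())
result = readings_by_city(graph, "Mesa")
print(result)
```

Because both sources are reduced to triples that share subject identifiers, the query can combine a location from one source with a reading from the other without knowing either original schema.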
Pages: 521-528
Number of pages: 8