HDSAnalytics: A Data Analytics Framework for Heterogeneous Data Sources

Cited by: 4
Authors
Jaybal, Yogalakshmi [1]
Ramanathan, Chandrashekar [1]
Rajagopalan, S. [1]
Affiliations
[1] Int Inst Informat Technol, Bangalore, Karnataka, India
Keywords
heterogeneous data sources; analytics; query processing; integration
DOI
10.1145/3152494.3152516
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Subject classification codes
081202; 0835
Abstract
This paper presents HDSAnalytics, a data analytics framework for heterogeneous data sources. The framework uses data from a variety of sources that differ in format and volume and may hold structured, semi-structured, or unstructured data. Integrating data from these sources into a single unified data source may lose information because of the semantic, syntactic, and schematic differences among them. Semantic heterogeneity arises when similar data appears in different forms in different data sources. Schematic heterogeneity arises from differences in the formats or schemas in which the data is stored, and syntactic heterogeneity from differences in the way the data is accessed and retrieved. Hence, accessing, retrieving, and using information from different data sources poses challenges such as (1) mapping similar attributes among different data sources, (2) retrieving the specific attributes from different data sources that are relevant to a user's query, and (3) retrieving data from different data sources in the different formats requested by different components of the system. The proposed HDSAnalytics framework design aids analytic models in using heterogeneous data sources "as is", without integrating them into a single data source, thereby overcoming all of the challenges listed above. Our prototype of the framework, tested with data from the Bangalore Metropolitan Transport Corporation (BMTC), India, demonstrates how bus fleet operations can be analyzed, diagnosed, and explored to improve bus fleet schedules and reduce operating costs. It provides detailed insight into bus fleet operations. Our prototype scales and works efficiently with an increasing number of heterogeneous data sources.
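The three challenges named in the abstract (attribute mapping, query-relevant attribute retrieval, and output in a requested format) can be illustrated with a minimal Python sketch. This is not the HDSAnalytics implementation, which the abstract does not describe in detail. The source names, field names, the `ATTRIBUTE_MAP` table, the `query` and `as_format` helpers, and the sample rows are all hypothetical and made up for illustration; they are not BMTC data.

```python
import csv
import io
import json

# Hypothetical sample sources, left in their native formats ("as is").
SCHEDULE_CSV = "route_no,km\nR1,24.5\nR2,41.0\n"
TRIPS_JSON = '[{"routeId": "R1", "distanceKm": 24.7}, {"routeId": "R3", "distanceKm": 12.3}]'

# Challenge 1: map each unified attribute to the field name used by each source.
ATTRIBUTE_MAP = {
    "route_id": {"schedule_csv": "route_no", "trips_json": "routeId"},
    "distance_km": {"schedule_csv": "km", "trips_json": "distanceKm"},
}

# Target type per unified attribute, since CSV yields strings and JSON yields numbers.
CASTS = {"route_id": str, "distance_km": float}

# Each source has its own reader; nothing is merged into one store.
SOURCES = {
    "schedule_csv": lambda: list(csv.DictReader(io.StringIO(SCHEDULE_CSV))),
    "trips_json": lambda: json.loads(TRIPS_JSON),
}


def query(attributes):
    """Challenge 2: return only the requested unified attributes from every source."""
    rows = []
    for source_name, reader in SOURCES.items():
        for record in reader():
            row = {"source": source_name}
            for attr in attributes:
                field = ATTRIBUTE_MAP[attr].get(source_name)
                value = record.get(field) if field else None
                row[attr] = CASTS[attr](value) if value is not None else None
            rows.append(row)
    return rows


def as_format(rows, fmt):
    """Challenge 3: serialize the result in the format a component asks for."""
    if fmt == "json":
        return json.dumps(rows)
    if fmt == "csv":
        out = io.StringIO()
        writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
        return out.getvalue()
    raise ValueError(f"unsupported format: {fmt}")


if __name__ == "__main__":
    result = query(["route_id", "distance_km"])
    print(as_format(result, "json"))
    print(as_format(result, "csv"))
```

In this sketch the mapping table is the only place that knows about source-specific field names, so adding another source would mean adding one reader and one column of mappings rather than restructuring the existing data.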
Pages: 11-19
Number of pages: 9