Content-based Analytics: Moving Beyond Data Size

Cited by: 0
Authors
Tsoumakos, Dimitrios [1]
Giannakopoulos, Ioannis [2]
Affiliations
[1] Ionian Univ, Dept Informat, Corfu, Greece
[2] NTUA, Sch Elect & Comp Engn, Comp Syst Lab, Athens, Greece
Funding
EU Horizon 2020;
Keywords
DOI
10.1109/BigDataService49289.2020.00013
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Efforts on Big Data technologies have been highly directed towards the amount of data a task can access or crunch. Yet, for content-driven decision making, it is not (only) about the size, but about the "right" data: The number of available datasets (a different type of volume) can reach astronomical sizes, making a thorough evaluation of each input prohibitively expensive. The problem is exacerbated as data sources regularly exhibit varying levels of uncertainty and velocity/churn. To date, there exists no efficient method to quantify the impact of numerous available datasets over different analytics tasks and workflows. This visionary work puts the spotlight on data content rather than size. It proposes a novel modeling, planning and processing research bundle that assesses data quality in terms of analytics performance. The main expected outcome is to provide efficient, continuous and intelligent management and execution of content-driven data analytics. Intelligent dataset selection can achieve massive gains on both accuracy and time required to reach a desired level of performance. This work introduces the notion of utilizing dataset similarity to infer operator behavior and, consequently, be able to build scalable, operator-agnostic performance models for Big Data tasks over different domains. We present an overview of the promising results from our initial work with numerical and graph data and respective operators. We then describe a reference architecture with specific areas of research that need to be tackled in order to provide a data-centric analytics ecosystem.
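The abstract's idea of using dataset similarity to infer operator behavior can be illustrated with a minimal sketch. This is not the authors' method: the per-dataset summary vectors, the `similarity` function, and `predict_operator_output` are all hypothetical names chosen for illustration, and a similarity-weighted nearest-neighbor average is only one simple way such an inference could work. The assumption is that an operator has been run on a few "profiled" datasets, and its output on an unseen dataset is estimated from the most similar profiled ones instead of executing the operator again.

```python
import math

def similarity(a, b):
    """Hypothetical similarity in (0, 1], derived from Euclidean distance
    between two dataset summary vectors (assumed to be precomputed)."""
    return 1.0 / (1.0 + math.dist(a, b))

def predict_operator_output(target, profiled, k=2):
    """Estimate an operator's output on `target` without running the operator.

    profiled: list of (summary_vector, measured_output) pairs for datasets
              on which the operator was actually executed.
    Returns the similarity-weighted mean output of the k most similar datasets.
    """
    ranked = sorted(profiled, key=lambda p: similarity(target, p[0]), reverse=True)[:k]
    weights = [similarity(target, summary) for summary, _ in ranked]
    return sum(w * out for w, (_, out) in zip(weights, ranked)) / sum(weights)

# Illustrative data only: three profiled datasets with made-up summaries and outputs.
profiled = [((0.0, 0.0), 10.0), ((1.0, 0.0), 20.0), ((10.0, 10.0), 90.0)]
estimate = predict_operator_output((0.5, 0.0), profiled, k=2)
```

Because the estimate depends only on dataset summaries and previously measured outputs, the same routine applies to any operator, which is the sense in which such a model would be operator-agnostic.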
Pages: 33-40
Number of pages: 8