Content-based Analytics: Moving Beyond Data Size

Cited by: 0
Authors
Tsoumakos, Dimitrios [1]
Giannakopoulos, Ioannis [2]
Affiliations
[1] Ionian Univ, Dept Informat, Corfu, Greece
[2] NTUA, Sch Elect & Comp Engn, Comp Syst Lab, Athens, Greece
Funding
EU Horizon 2020
DOI
10.1109/BigDataService49289.2020.00013
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Efforts on Big Data technologies have been highly directed towards the amount of data a task can access or crunch. Yet, for content-driven decision making, it is not (only) about the size, but about the "right" data: The number of available datasets (a different type of volume) can reach astronomical sizes, making a thorough evaluation of each input prohibitively expensive. The problem is exacerbated as data sources regularly exhibit varying levels of uncertainty and velocity/churn. To date, there exists no efficient method to quantify the impact of numerous available datasets over different analytics tasks and workflows. This visionary work puts the spotlight on data content rather than size. It proposes a novel modeling, planning and processing research bundle that assesses data quality in terms of analytics performance. The main expected outcome is to provide efficient, continuous and intelligent management and execution of content-driven data analytics. Intelligent dataset selection can achieve massive gains on both accuracy and time required to reach a desired level of performance. This work introduces the notion of utilizing dataset similarity to infer operator behavior and, consequently, be able to build scalable, operator-agnostic performance models for Big Data tasks over different domains. We present an overview of the promising results from our initial work with numerical and graph data and respective operators. We then describe a reference architecture with specific areas of research that need to be tackled in order to provide a data-centric analytics ecosystem.
Pages: 33-40
Page count: 8