A Content-Based Approach for Modeling Analytics Operators

Cited by: 1
Authors
Giannakopoulos, Ioannis [1]
Tsoumakos, Dimitrios [2]
Koziris, Nectarios [1]
Affiliations
[1] Natl Tech Univ Athens, Comp Syst Lab, Sch ECE, Athens, Greece
[2] Ionian Univ, Dept Informat, Corfu, Greece
DOI
10.1145/3269206.3271731
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The plethora of publicly available data sources has given birth to a wealth of new needs and opportunities. The ever-increasing amount of data has shifted analysts' attention from optimizing operators for specific business cases to the datasets per se: selecting those most suitable for specific operators, i.e., those that make an operator produce a specific output. Yet predicting the output of a given operator executed on different input datasets is not an easy task: it entails executing the operator on all of them, which requires excessive computational power and time. To tackle this challenge, we propose a novel dataset profiling methodology that infers an operator's outcome by examining the similarity of the available input datasets in specific attributes. Our methodology quantifies dataset similarities and projects them into a low-dimensional space. The operator is then executed on a mere subset of the available datasets, and its output for the rest is approximated using Neural Networks trained with this space as input. Our experimental evaluation thoroughly examines the performance of our scheme using both synthetic and real-world datasets, indicating that the suggested approach is capable of predicting an operator's output with high accuracy. Moreover, it massively accelerates operator profiling in comparison to approaches that require exhaustive operator execution, rendering our work ideal for cases where a multitude of operators need to be executed on a set of given datasets.
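
The pipeline summarized in the abstract (quantify dataset similarities, project them into a low-dimensional space, run the operator on a subset, approximate the rest with a neural network) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not name the similarity measure, the projection, or the network design, so the column-statistics distance `dataset_distance`, the use of classical multidimensional scaling, the tiny network in `train_mlp`, the stand-in `operator`, and all sizes and hyperparameters are assumptions made here.

```python
import numpy as np

def dataset_distance(a, b):
    """Assumed dissimilarity: distance between per-column mean/std summaries."""
    pa = np.concatenate([a.mean(axis=0), a.std(axis=0)])
    pb = np.concatenate([b.mean(axis=0), b.std(axis=0)])
    return float(np.linalg.norm(pa - pb))

def classical_mds(dist, dims=2):
    """Project a pairwise distance matrix into `dims` dimensions (classical MDS)."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:dims]          # largest eigenvalues first
    return eigvecs[:, top] * np.sqrt(np.clip(eigvals[top], 0.0, None))

def train_mlp(x, y, hidden=16, epochs=2000, lr=0.05, seed=0):
    """Tiny one-hidden-layer regression network, full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    w1 = rng.normal(0.0, 0.5, (x.shape[1], hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(0.0, 0.5, (hidden, 1))
    b2 = np.zeros(1)
    y = y.reshape(-1, 1)
    for _ in range(epochs):
        h = np.tanh(x @ w1 + b1)                    # hidden activations
        err = (h @ w2 + b2) - y                     # prediction error
        grad_h = (err @ w2.T) * (1.0 - h ** 2)      # backprop through tanh
        w2 -= lr * h.T @ err / len(x)
        b2 -= lr * err.mean(axis=0)
        w1 -= lr * x.T @ grad_h / len(x)
        b1 -= lr * grad_h.mean(axis=0)
    return lambda q: (np.tanh(q @ w1 + b1) @ w2 + b2).ravel()

def operator(dataset):
    """Stand-in for an expensive analytics operator (assumption)."""
    return float(dataset.mean())

rng = np.random.default_rng(42)
# 40 synthetic datasets of 200 rows x 3 columns, each with its own column shifts.
datasets = [rng.normal(loc=rng.uniform(-2, 2, 3), scale=1.0, size=(200, 3))
            for _ in range(40)]

# 1) Quantify pairwise dataset (dis)similarity.
dist = np.array([[dataset_distance(a, b) for b in datasets] for a in datasets])
# 2) Project the datasets into a low-dimensional space and rescale to [-1, 1].
coords = classical_mds(dist, dims=2)
coords = coords / np.abs(coords).max()
# 3) Execute the operator only on a sampled subset of the datasets.
sample = rng.choice(len(datasets), size=15, replace=False)
outputs = np.array([operator(datasets[i]) for i in sample])
# 4) Train on the sampled points and approximate the output for all datasets.
predict = train_mlp(coords[sample], outputs)
estimates = predict(coords)
```

In this sketch the operator runs on 15 of the 40 datasets; the remaining outputs come from `estimates`. How well such estimates track the true outputs depends on whether the chosen similarity measure reflects what the operator is sensitive to, which is the design question the paper's methodology addresses.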
Pages: 227-236
Number of pages: 10