Comparative Evaluation of Big-Data Systems on Scientific Image Analytics Workloads

Cited by: 29
Authors
Mehta, Parmita [1]
Dorkenwald, Sven [1]
Zhao, Dongfang [1]
Kaftan, Tomer [1]
Cheung, Alvin [1]
Balazinska, Magdalena [1]
Rokem, Ariel [1]
Connolly, Andrew [1]
Vanderplas, Jacob [1]
AlSayyad, Yusra [1]
Affiliation
[1] Univ Washington, Seattle, WA 98195 USA
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2017 / Vol. 10 / No. 11
Funding
U.S. National Science Foundation;
DOI
10.14778/3137628.3137634
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Scientific discoveries are increasingly driven by analyzing large volumes of image data. Many new libraries and specialized database management systems (DBMSs) have emerged to support such tasks. It is unclear how well these systems support real-world image analysis use cases, and how performant the image analytics tasks implemented on top of such systems are. In this paper, we present the first comprehensive evaluation of large-scale image analysis systems using two real-world scientific image data processing use cases. We evaluate five representative systems (SciDB, Myria, Spark, Dask, and TensorFlow) and find that each of them has shortcomings that complicate implementation or hurt performance. Such shortcomings lead to new research opportunities in making large-scale image analysis both efficient and easy to use.
Pages: 1226 - 1237
Page count: 12
Related papers
50 in total
  • [1] Evolutionary Scheduling of Dynamic Multitasking Workloads for Big-Data Analytics in Elastic Cloud
    Zhang, Fan
    Cao, Junwei
    Tan, Wei
    Khan, Samee U.
    Li, Keqin
    Zomaya, Albert Y.
    [J]. IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTING, 2014, 2 (03) : 338 - 351
  • [2] Proxy Benchmarks for Emerging Big-data Workloads
    Panda, Reena
    John, Lizy Kurian
    [J]. 2017 26TH INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES (PACT), 2017, : 105 - 116
  • [3] Proxy Benchmarks for Emerging Big-data Workloads
    Panda, Reena
    John, Lizy Kurian
    [J]. 2017 IEEE INTERNATIONAL SYMPOSIUM ON PERFORMANCE ANALYSIS OF SYSTEMS AND SOFTWARE (ISPASS), 2017, : 139 - 140
  • [4] Enabling Scientific Data Storage and Processing on Big-data Systems
    Biookaghazadeh, Saman
    Xu, Yiqi
    Zhou, Shujia
    Zhao, Ming
    [J]. PROCEEDINGS 2015 IEEE INTERNATIONAL CONFERENCE ON BIG DATA, 2015, : 1978 - 1984
  • [5] Advancing manufacturing systems with big-data analytics: A conceptual framework
    Kozjek, Dominik
    Vrabic, Rok
    Rihtarsic, Borut
    Lavrac, Nada
    Butala, Peter
    [J]. INTERNATIONAL JOURNAL OF COMPUTER INTEGRATED MANUFACTURING, 2020, 33 (02) : 169 - 188
  • [6] "I-Care" - Big-data Analytics for Intelligent Systems
    Singh, Paras Nath
    [J]. 2021 8TH INTERNATIONAL CONFERENCE ON SMART COMPUTING AND COMMUNICATIONS (ICSCC), 2021, : 225 - 229
  • [7] Sports analytics and the big-data era
    Morgulev, E.
    Azar, O. H.
    Lidor, R.
    [J]. International Journal of Data Science and Analytics, 2018, 5 (4) : 213 - 222
  • [8] Kaleido: Enabling Efficient Scientific Data Processing on Big-Data Systems
    Biookaghazadeh, Saman
    Zhou, Shujia
    Zhao, Ming
    [J]. 2017 INTERNATIONAL CONFERENCE ON NETWORKING, ARCHITECTURE, AND STORAGE (NAS), 2017, : 121 - 130
  • [9] Leveraging big-data for business process analytics
    Vera-Baquero, Alejandro
    Palacios, Ricardo Colomo
    Stantchev, Vladimir
    Molloy, Owen
    [J]. LEARNING ORGANIZATION, 2015, 22 (04) : 215 - 228
  • [10] BigCache for Big-data Systems
    Roger, Michel Angelo
    Xu, Yiqi
    Zhao, Ming
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2014, : 189 - 194