Data placement for scientific applications in distributed environments

Cited by: 0
Authors
Chervenak, Ann [1]
Deelman, Ewa [1]
Livny, Miron [2]
Su, Mei-Hui [1]
Schuler, Rob [1]
Bharathi, Shishir [1]
Mehta, Gaurang [1]
Vahi, Karan [1]
Affiliations
[1] University of Southern California, Information Sciences Institute, Marina del Rey, CA 90292 USA
[2] University of Wisconsin, Department of Computer Science, Madison, WI 53706 USA
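Funding
National Science Foundation (USA)
Keywords
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract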
Scientific applications often perform complex computational analyses that consume and produce large data sets. We are concerned with data placement policies that distribute data in ways that are advantageous for application execution, for example, by placing data sets so that they may be staged into or out of computations efficiently or by replicating them for improved performance and reliability. In particular, we propose to study the relationship between data placement services and workflow management systems. In this paper, we explore the interactions between two services used in large-scale science today. We evaluate the benefits of prestaging data using the Data Replication Service versus using the native data stage-in mechanisms of the Pegasus workflow management system. We use the astronomy application, Montage, for our experiments and modify it to study the effect of input data size on the benefits of data prestaging. As the size of input data sets increases, prestaging using a data placement service can significantly improve the performance of the overall analysis.
Pages: 146 / +
Page count: 2