Understanding Data Access Patterns for dCache System

Cited by: 0
Authors
Bellavita, Julian [1]
Sim, Caitlin [1]
Wu, Kesheng [2]
Sim, Alex [2]
Yoo, Shinjae [3]
Ito, Hiro [3]
Garonne, Vincent [3]
Lancon, Eric
Affiliations
[1] Univ Calif Berkeley, Berkeley, CA 94720 USA
[2] Lawrence Berkeley Natl Lab, Berkeley, CA USA
[3] Brookhaven Natl Lab, Upton, NY USA
Source
26TH INTERNATIONAL CONFERENCE ON COMPUTING IN HIGH ENERGY AND NUCLEAR PHYSICS, CHEP 2023 | 2024 / Vol. 295
DOI
10.1051/epjconf/202429501053
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
The storage management system dCache acts as a disk cache for high-energy physics (HEP) data from the US ATLAS community. Because its disk capacity is considerably smaller than the total volume of ATLAS data, a heuristic is needed to determine which data should be kept on disk. An effective heuristic would be to keep the data files that are expected to be heavily accessed in the near future. Through a careful study of access statistics, we find that a few of the most popular datasets are accessed much more frequently than the others, even though which datasets are popular changes over time. If we could predict the near-term popularity of datasets, we could pin the most popular ones in the disk cache to prevent their accidental removal and guarantee their availability. To predict a dataset's popularity, we present several methods for forecasting the number of times a dataset will be accessed in the next day. Test results show that these methods can reliably predict the next-day access counts of popular datasets. This observation is confirmed with dCache logs from two separate time ranges.
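The idea of forecasting next-day access counts and pinning the top datasets can be illustrated with a minimal Python sketch. The abstract does not say which forecasting methods the authors use, so the trailing moving average here is only an assumed stand-in baseline, and the log format, function names, and sample data are all hypothetical.

```python
from collections import defaultdict


def daily_counts(access_log):
    """Count accesses per dataset per day.

    access_log: iterable of (day_index, dataset_name) pairs (hypothetical format).
    """
    counts = defaultdict(lambda: defaultdict(int))
    for day, dataset in access_log:
        counts[dataset][day] += 1
    return counts


def forecast_next_day(counts, today, window=3):
    """Forecast each dataset's next-day count as the mean over the last `window` days.

    This moving average is an assumed baseline, not the paper's method.
    """
    forecast = {}
    for dataset, per_day in counts.items():
        days = range(today - window + 1, today + 1)
        recent = [per_day.get(d, 0) for d in days]  # days with no access count as 0
        forecast[dataset] = sum(recent) / window
    return forecast


def datasets_to_pin(forecast, k):
    """Return the k datasets with the highest forecast (ties broken by name)."""
    ranked = sorted(forecast.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:k]]


# Made-up sample log: dataset "A" is accessed far more often than "B" or "C".
log = [(0, "A"), (0, "A"), (1, "A"), (2, "A"), (2, "A"), (2, "A"),
       (1, "B"), (2, "B"), (0, "C")]
counts = daily_counts(log)
forecast = forecast_next_day(counts, today=2, window=3)
print(datasets_to_pin(forecast, k=2))
```

In a real deployment the forecast would be recomputed daily, since the abstract notes that the set of popular datasets changes over time.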
Pages: 9