TimeCloth: Fast Point-in-Time Database Recovery in The Cloud

Authors
Deng, Jianjun [1]
Lu, Jianan [2]
Fan, Hua [1]
Liu, Chaoyang [1]
Cheng, Shi [1]
Fu, Cuiyun [1]
Zhou, Wenchao [1]
Affiliations
[1] Alibaba Grp, Hangzhou, Peoples R China
[2] Princeton Univ, Princeton, NJ USA
Keywords
database recovery; point-in-time recovery; user-triggered recovery; transparent lazy loading;
DOI
10.1145/3626246.3653382
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
In recent years, we have noted the frequent occurrence of user-triggered database recoveries in the cloud. In contrast to traditional failure-triggered recovery scenarios, they come with unique consumer-centric challenges. Unfortunately, current solutions prove inefficient, falling short either in meeting our customers' requirements or due to their close integration with the native recovery support of the underlying database engines. In this work, we present TimeCloth, a generic cloud-native recovery mechanism that achieves sublinear recovery time while meeting the specific needs of our customers. It comprises a recovery module optimized for fine-grained point-in-time recoveries and an import module enabling nearly instantaneous access to remote tables. The recovery module performs fast log filtering, parallelizes replay of non-conflicting log records, and coalesces log records to reduce the volume of replay work. The import module implements a transparent FUSE-based lazy loading mechanism as well as a smart prefetcher to achieve good access performance for remote tables. Collectively, they significantly accelerate user-triggered recoveries in the cloud. TimeCloth has been in production at Alibaba Cloud for about 15 months. We have witnessed a reduction in RTO among our customers by 44% on average and sometimes up to 92%.
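The abstract names log filtering and log coalescing without giving details. The following is only a minimal illustrative sketch of the general idea, not TimeCloth's actual algorithm: the `LogRecord` fields, the function names, and the "last write per key wins" rule are all assumptions made for this example. Records outside the requested tables or after the target time are filtered out, and for each key only the newest surviving record is kept. After coalescing, every key has at most one record, so the remaining records do not conflict and could in principle be replayed in any order or in parallel.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Set, Tuple


@dataclass(frozen=True)
class LogRecord:
    # All fields are hypothetical; real engine logs are far richer.
    lsn: int               # position in the log
    timestamp: int         # commit time of the change
    table: str
    key: str
    value: Optional[str]   # None stands for a delete


def filter_and_coalesce(log: Iterable[LogRecord],
                        tables: Set[str],
                        target_ts: int) -> List[LogRecord]:
    """Keep only records for the wanted tables up to target_ts,
    then keep just the newest record per (table, key)."""
    latest: Dict[Tuple[str, str], LogRecord] = {}
    for rec in sorted(log, key=lambda r: r.lsn):
        if rec.table not in tables or rec.timestamp > target_ts:
            continue  # filtering: irrelevant table or past the target time
        latest[(rec.table, rec.key)] = rec  # coalescing: newer overwrites older
    return sorted(latest.values(), key=lambda r: r.lsn)


def replay(records: Iterable[LogRecord]) -> Dict[Tuple[str, str], str]:
    """Apply records to an empty in-memory state (assumed starting point)."""
    state: Dict[Tuple[str, str], str] = {}
    for rec in records:
        k = (rec.table, rec.key)
        if rec.value is None:
            state.pop(k, None)
        else:
            state[k] = rec.value
    return state


log = [
    LogRecord(1, 10, "orders", "a", "1"),
    LogRecord(2, 20, "orders", "a", "2"),
    LogRecord(3, 30, "users", "u", "x"),
    LogRecord(4, 40, "orders", "b", "9"),
    LogRecord(5, 50, "orders", "a", "3"),
    LogRecord(6, 60, "orders", "b", None),
]
todo = filter_and_coalesce(log, {"orders"}, target_ts=45)
restored = replay(todo)
```

In this toy log, six records shrink to two replay operations for a recovery of the `orders` table to time 45, which illustrates why filtering and coalescing reduce replay work.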
Pages: 214-226
Number of pages: 13
Related papers
50 in total
  • [1] S-TRAP: Optimization and Evaluation of Timely Recovery to Any Point-in-time (TRAP)
    Wang, Chao
    Li, Zhanhuai
    Hu, Na
    Nie, Yanming
    COMPUTER SCIENCE AND INFORMATION SYSTEMS, 2012, 9 (01) : 431 - 454
  • [2] Cyclical adjustment of point-in-time PD
    Ingolfsson, S.
    Elvarsson, B. T.
    JOURNAL OF THE OPERATIONAL RESEARCH SOCIETY, 2010, 61 (03) : 374 - 380
  • [3] TRAP-Array: A disk array architecture providing timely recovery to any point-in-time
    Yang, Qing
    Mao, Weijun
    Ren, Jin
    33RD INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE, PROCEEDINGS, 2006, : 289 - 300
  • [4] Homeless People in Nicaragua: A Point-in-Time Count in Leon
    Suarez, Alexia
    Berrios, Alberto
    Bonilla, Enrique
    Juan Vazquez, Jose
    JOURNAL OF INTERNATIONAL DEVELOPMENT, 2018, 30 (01) : 155 - 158
  • [5] Learning More From Homeless Point-in-Time Counts
    Shinn, Marybeth
    Yu, Hanxuan
    Zoltowski, Alisa R.
    Wu, Hao
    HOUSING POLICY DEBATE, 2024,
  • [6] A point-in-time perspective on through-the-cycle ratings
    Altman, EI
    Rijken, HA
    FINANCIAL ANALYSTS JOURNAL, 2006, 62 (01) : 54 - 70
  • [7] PREVALENCE OF HOMELESSNESS AMONG HOSPITALIZED PATIENTS: A POINT-IN-TIME SURVEY
    Mistry, Neelam
    Knoeckel, Julie
    Johnson, Amanda V.
    Bredenberg, Erin
    Raffel, Katie
    Cunningham, John
    Sarcone, Ellen
    Misky, Gregory
    McBeth, Lauren
    Stella, Sarah A.
    JOURNAL OF GENERAL INTERNAL MEDICINE, 2023, 38 : S197 - S198
  • [8] Improving Homeless Point-In-Time Counts: Uncovering the Marginally Housed
    Smith, Curtis
    Castaneda-Tinoco, Ernesto
    SOCIAL CURRENTS, 2019, 6 (02) : 91 - 104
  • [9] Olive: distributed point-in-time branching storage for real systems
    Aguilera, Marcos K.
    Spence, Susan
    Veitch, Alistair
    USENIX ASSOCIATION PROCEEDINGS OF THE 3RD SYMPOSIUM ON NETWORKED SYSTEMS DESIGN & IMPLEMENTATION (NSDI 06), 2006, : 367 - +
  • [10] Live loads in office buildings: point-in-time load intensity
    Kumar, S
    BUILDING AND ENVIRONMENT, 2002, 37 (01) : 79 - 89