Spacemake: processing and analysis of large-scale spatial transcriptomics data

Cited by: 10
Authors
Sztanka-Toth, Tamas Ryszard [1, 2]
Jens, Marvin [1]
Karaiskos, Nikos [1]
Rajewsky, Nikolaus [1, 2, 3, 4]
Affiliations
[1] Max Delbruck Ctr Mol Med, Helmholtz Assoc MDC, Berlin Inst Med Syst Biol BIMSB, Syst Biol Gene Regulatory Elements, D-10115 Berlin, Germany
[2] Humboldt Univ, Inst Biol, D-10099 Berlin, Germany
[3] DZHK German Ctr Cardiovasc Res, Partner Site Berlin, D-10117 Berlin, Germany
[4] Univ Med Charite, Dept Pediat Oncol, D-13353 Berlin, Germany
Source
GIGASCIENCE | 2022 / Vol. 11
Funding
European Union Horizon 2020;
Keywords
bioinformatics; computational biology; computational pipeline; sequence analysis; spatial transcriptomics; single-cell transcriptomics; reproducibility; modularity; scalability; workflow; SEQ; ARCHITECTURE; EXPRESSION;
DOI
10.1093/gigascience/giac064
Chinese Library Classification number
Q [Biological Sciences];
Discipline classification code
07; 0710; 09;
Abstract
Background: Spatial sequencing methods are gaining popularity in RNA biology studies. State-of-the-art techniques quantify messenger RNA expression levels from tissue sections and at the same time register information about the original locations of the molecules in the tissue. The resulting data sets are processed and analyzed by accompanying software that, however, is incompatible across inputs from different technologies.
Findings: Here, we present spacemake, a modular, robust, and scalable spatial transcriptomics pipeline built in Snakemake and Python. Spacemake is designed to handle all major spatial transcriptomics data sets and can be readily configured for other technologies. It can process and analyze several samples in parallel, even if they stem from different experimental methods. Spacemake's unified framework enables reproducible data processing from raw sequencing data to automatically generated downstream analysis reports. Spacemake is built with a modular design and offers additional functionality such as sample merging, saturation analysis, and analysis of long reads as separate modules. Moreover, spacemake employs novoSpaRc to integrate spatial and single-cell transcriptomics data, resulting in increased gene counts for the spatial data set. Spacemake is open source and extendable, and it can be seamlessly integrated with existing computational workflows.
Pages: 14