HaTSPiL: A modular pipeline for high-throughput sequencing data analysis

Cited by: 1
Authors
Morandi, Edoardo [1, 2]
Cereda, Matteo [2]
Incarnato, Danny [1, 2]
Parlato, Caterina [2]
Basile, Giulia [2]
Anselmi, Francesca [1, 2]
Lauria, Andrea [1, 2]
Simon, Lisa Marie [1, 2]
Polignano, Isabelle Laurence [1]
Arruga, Francesca [2]
Deaglio, Silvia [2, 3]
Tirtei, Elisa [4]
Fagioli, Franca [4]
Oliviero, Salvatore [1, 2]
Affiliations
[1] Univ Turin, Dept Life Sci & Syst Biol, Turin, Italy
[2] IIGM, Turin, Italy
[3] Univ Turin, Dept Med Sci, Turin, Italy
[4] City Sci & Hlth Turin, Regina Margherita Childrens Hosp, Stem Cell Transplantat & Cellular Therapy Div, Paediat Oncohaematol, Turin, Italy
Source
PLOS ONE | 2019 | Volume 14 | Issue 10
Keywords
CANCER; TOOL
DOI
10.1371/journal.pone.0222512
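Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Discipline classification code
07; 0710; 09
Abstract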
Background: Next generation sequencing methods are widely adopted for a large amount of scientific purposes, from pure research to health-related studies. The decreasing costs per analysis led to big amounts of generated data and to the subsequent improvement of software for the respective analyses. As a consequence, many approaches have been developed to chain different software in order to obtain reliable and reproducible workflows. However, the large range of applications for NGS approaches entails the challenge to manage many different workflows without losing reliability.
Methods: We here present a high-throughput sequencing pipeline (HaTSPiL), a Python-powered CLI tool designed to handle different approaches for data analysis with a high level of reliability. The software relies on the barcoding of filenames using a human readable naming convention that contains any information regarding the sample needed by the software to automatically choose different workflows and parameters. HaTSPiL is highly modular and customisable, allowing the users to extend its features for any specific need.
Conclusions: HaTSPiL is licensed as Free Software under the MIT license and it is available at https://github.com/dodomorandi/hatspil.
Pages: 9
Related papers
50 in total
  • [1] SangeR: the high-throughput Sanger sequencing analysis pipeline
    Schmid, Kai
    Dohmen, Hildegard
    Ritschel, Nadja
    Selignow, Carmen
    Zohner, Jochen
    Sehring, Jannik
    Acker, Till
    Amsel, Daniel
    Stamatakis, Alexandros
    [J]. BIOINFORMATICS ADVANCES, 2022, 2 (01):
  • [2] A novel multi-alignment pipeline for high-throughput sequencing data
    Huang, Shunping
    Holt, James
    Kao, Chia-Yu
    McMillan, Leonard
    Wang, Wei
    [J]. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION, 2014
  • [3] Need for speed in high-throughput sequencing data analysis
    Pluss, M.
    Caspar, S. M.
    Meienberg, J.
    Kopps, A. M.
    Keller, I.
    Bruggmann, R.
    Vogel, M.
    Matyas, G.
    [J]. EUROPEAN JOURNAL OF HUMAN GENETICS, 2018, 26 : 721 - 722
  • [4] A Conceptual Model for Transcriptome High-Throughput Sequencing Pipeline
    Huacarpuma, Ruben Cruz
    Holanda, Maristela
    Walter, Maria Emilia
    [J]. ADVANCES IN BIOINFORMATICS AND COMPUTATIONAL BIOLOGY, 2011, 6832 : 71 - 74
  • [5] Strategy for Modular Tagged High-Throughput Amplicon Sequencing
    de Carcer, Daniel Aguirre
    Denman, Stuart E.
    McSweeney, Chris
    Morrison, Mark
    [J]. APPLIED AND ENVIRONMENTAL MICROBIOLOGY, 2011, 77 (17) : 6310 - 6312
  • [6] iSVP: an integrated structural variant calling pipeline from high-throughput sequencing data
    Mimori, Takahiro
    Nariai, Naoki
    Kojima, Kaname
    Takahashi, Mamoru
    Ono, Akira
    Sato, Yukuto
    Yamaguchi-Kabata, Yumi
    Nagasaki, Masao
    [J]. BMC SYSTEMS BIOLOGY, 2013, 7
  • [7] fluff: exploratory analysis and visualization of high-throughput sequencing data
    Georgiou, Georgios
    van Heeringen, Simon J.
    [J]. PEERJ, 2016, 4
  • [8] QIIME allows analysis of high-throughput community sequencing data
    J Gregory Caporaso
    Justin Kuczynski
    Jesse Stombaugh
    Kyle Bittinger
    Frederic D Bushman
    Elizabeth K Costello
    Noah Fierer
    Antonio Gonzalez Peña
    Julia K Goodrich
    Jeffrey I Gordon
    Gavin A Huttley
    Scott T Kelley
    Dan Knights
    Jeremy E Koenig
    Ruth E Ley
    Catherine A Lozupone
    Daniel McDonald
    Brian D Muegge
    Meg Pirrung
    Jens Reeder
    Joel R Sevinsky
    Peter J Turnbaugh
    William A Walters
    Jeremy Widmann
    Tanya Yatsunenko
    Jesse Zaneveld
    Rob Knight
    [J]. Nature Methods, 2010, 7 : 335 - 336
  • [9] Pyicos: a versatile toolkit for the analysis of high-throughput sequencing data
    Althammer, Sonja
    Gonzalez-Vallinas, Juan
    Ballare, Cecilia
    Beato, Miguel
    Eyras, Eduardo
    [J]. BIOINFORMATICS, 2011, 27 (24) : 3333 - 3340
  • [10] QIIME allows analysis of high-throughput community sequencing data
    Caporaso, J. Gregory
    Kuczynski, Justin
    Stombaugh, Jesse
    Bittinger, Kyle
    Bushman, Frederic D.
    Costello, Elizabeth K.
    Fierer, Noah
    Pena, Antonio Gonzalez
    Goodrich, Julia K.
    Gordon, Jeffrey I.
    Huttley, Gavin A.
    Kelley, Scott T.
    Knights, Dan
    Koenig, Jeremy E.
    Ley, Ruth E.
    Lozupone, Catherine A.
    McDonald, Daniel
    Muegge, Brian D.
    Pirrung, Meg
    Reeder, Jens
    Sevinsky, Joel R.
    Turnbaugh, Peter J.
    Walters, William A.
    Widmann, Jeremy
    Yatsunenko, Tanya
    Zaneveld, Jesse
    Knight, Rob
    [J]. NATURE METHODS, 2010, 7 (05) : 335 - 336