An Automated Infrastructure to Support High-Throughput Bioinformatics

Authors
Cuccuru, Gianmauro [1]
Leo, Simone [1]
Lianas, Luca [1]
Muggiri, Michele [1]
Pinna, Andrea [1]
Pireddu, Luca [1]
Uva, Paolo [1]
Angius, Andrea [1]
Fotia, Giorgio [1]
Zanetti, Gianluigi [1]
Affiliation
[1] CRS4, Pula, CA, Italy
Keywords
Bioinformatics; NGS; MapReduce; data management; variants; Galaxy
DOI
Not available
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The number of domains affected by the big data phenomenon is constantly increasing, both in science and industry, with high-throughput DNA sequencers being among the most massive data producers. Building analysis frameworks that can keep up with such a high production rate, however, is only part of the problem: current challenges include dealing with articulated data repositories where objects are connected by multiple relationships, managing complex processing pipelines where each step depends on a large number of configuration parameters, and ensuring reproducibility, error control, and usability by nontechnical staff. Here we describe an automated infrastructure built to address the above issues in the context of analyzing the data produced by the CRS4 next-generation sequencing facility. The system integrates open source tools, either written by us or publicly available, into a framework that can handle the whole data transformation process, from raw sequencer output to primary analysis results.
Pages: 600-607
Page count: 8