PipeVal: light-weight extensible tool for file validation

Cited by: 1
Authors
Patel, Yash [1, 2]
Beshlikyan, Arpi [1, 2]
Jordan, Madison [1, 2]
Kim, Gina [1, 2]
Holmes, Aaron [1, 2]
Yamaguchi, Takafumi N. [1, 2, 3]
Boutros, Paul C. [1, 2, 3, 4, 5, 6]
Affiliations
[1] Univ Calif Los Angeles, Jonsson Comprehens Canc Ctr, Los Angeles, CA 90095 USA
[2] Univ Calif Los Angeles, Inst Precis Hlth, Los Angeles, CA 90095 USA
[3] Univ Calif Los Angeles, Dept Human Genet, Los Angeles, CA 90095 USA
[4] Univ Calif Los Angeles, Dept Urol, Los Angeles, CA 90095 USA
[5] Univ Calif Los Angeles, Broad Stem Cell Res Ctr, Los Angeles, CA 90095 USA
[6] Univ Calif Los Angeles, Dept Human Genet, 10833 Le Conte Ave, Los Angeles, CA 90095 USA
Funding
National Institutes of Health (NIH), USA
Keywords
FORMAT;
DOI
10.1093/bioinformatics/btae079
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: The volume of biomedical data generated each year is growing exponentially as high-throughput molecular, imaging and mHealth technologies expand. This rise in data volume has contributed to an increasing reliance on and demand for computational methods, and consequently to increased attention to software quality and data integrity.
Results: To simplify data verification in diverse data-processing pipelines, we created PipeVal, a light-weight, easy-to-use, extensible tool for file validation. It is open-source, easy to integrate with complex workflows, and modularized for extensibility for new file formats. PipeVal can be rapidly inserted into existing methods and pipelines to automatically validate and verify inputs and outputs. This can reduce wasted compute time attributed to file corruption or invalid file paths, and significantly improve the quality of data-intensive software.
Availability and implementation: PipeVal is an open-source Python package under the GPLv2 license and it is freely available at https://github.com/uclahs-cds/package-PipeVal. The docker image is available at: https://github.com/uclahs-cds/package-PipeVal/pkgs/container/pipeval.
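Illustrative sketch (not part of the published abstract): the kind of pre-flight check the abstract describes, catching invalid paths and corrupted files before a pipeline step runs, might look roughly like the following. This is not PipeVal's API or implementation; the function names `sha256_of` and `validate_file`, the choice of SHA-256, and the error messages are all hypothetical and chosen only for illustration.

```python
import hashlib
from pathlib import Path
from typing import List, Optional


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def validate_file(path_str: str, expected_sha256: Optional[str] = None) -> List[str]:
    """Hypothetical validator: return a list of problems (empty means the file passed)."""
    path = Path(path_str)
    # An invalid path is reported immediately; no further checks are possible.
    if not path.is_file():
        return [f"{path}: path does not exist or is not a regular file"]
    problems = []
    # An empty file often signals a failed upstream step.
    if path.stat().st_size == 0:
        problems.append(f"{path}: file is empty")
    # A checksum mismatch suggests corruption or an unexpected file version.
    if expected_sha256 is not None and sha256_of(path) != expected_sha256.lower():
        problems.append(f"{path}: checksum mismatch")
    return problems
```

Under these assumptions, a pipeline step would call `validate_file` on each input and stop early if the returned list is non-empty, rather than spending compute time on a file that cannot be processed.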
Pages: 4