Tersect: a set theoretical utility for exploring sequence variant data

Cited by: 4
Authors
Kurowski, Tomasz J. [1]
Mohareb, Fady [1]
Affiliations
[1] Cranfield Univ, Sch Water Energy & Environm, Bioinformat Grp, Bedford MK43 0AL, England
Funding
UK Biotechnology and Biological Sciences Research Council (BBSRC)
DOI
10.1093/bioinformatics/btz634
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Summary: Comparing genomic features across a large panel of individuals of the same species is now a core part of bioinformatics analysis. This typically involves a series of complex set theoretical expressions to compare, intersect, or extract symmetric differences between individuals within a large set of genotypes. Several publicly available tools can perform such tasks; however, because of the sheer number of variants being queried, these tasks can be computationally expensive, with runtimes ranging from a few minutes to several hours depending on dataset size. This makes existing tools unsuitable for interactive data queries or for use within genomic data visualization platforms such as genome browsers. Tersect is a lightweight, high-performance command-line utility that interprets and applies flexible set theoretical expressions to sets of sequence variant data. Thanks to its highly optimized storage and indexing algorithms for variant data, it can be used both for interactive data exploration and as part of a larger pipeline.
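To make the set operations named in the abstract concrete, here is a minimal sketch using plain Python sets. It is not Tersect's implementation, storage format, or query syntax; the sample names and the (chromosome, position, ref, alt) tuple representation are assumptions made purely for illustration.

```python
# Illustrative only: plain Python sets, NOT Tersect's storage or query language.
# Each variant is a hypothetical (chromosome, position, ref, alt) tuple.
sample_a = {("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"), ("chr2", 40, "G", "A")}
sample_b = {("chr1", 100, "A", "G"), ("chr2", 40, "G", "C"), ("chr2", 90, "T", "C")}
sample_c = {("chr1", 100, "A", "G"), ("chr1", 250, "C", "T")}

# Intersection: variants present in both A and B.
shared_ab = sample_a & sample_b

# Symmetric difference: variants present in exactly one of A and B.
sym_diff_ab = sample_a ^ sample_b

# Compound expression: variants in A that appear in neither B nor C.
unique_to_a = sample_a - (sample_b | sample_c)

if __name__ == "__main__":
    for label, result in [
        ("A & B", shared_ab),
        ("A ^ B", sym_diff_ab),
        ("A - (B | C)", unique_to_a),
    ]:
        # Sort for a stable, position-ordered listing.
        print(label, sorted(result))
```

On real panels with millions of variants per sample, naive in-memory sets like these become slow and memory-hungry, which is the performance problem the abstract says Tersect's optimized storage and indexing are meant to address.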
Pages: 934-935
Number of pages: 2