AnaCoDa: analyzing codon data with Bayesian mixture models

Cited by: 7
Authors
Landerer, Cedric [1, 2]
Cope, Alexander [3, 4]
Zaretzki, Russell [2, 5]
Gilchrist, Michael A. [1, 2]
Affiliations
[1] Univ Tennessee, Dept Ecol & Evolutionary Biol, Knoxville, TN 37996 USA
[2] Univ Tennessee, Natl Inst Math & Biol Synth, Knoxville, TN 37996 USA
[3] Univ Tennessee, Genome Sci & Technol, Knoxville, TN USA
[4] Oak Ridge Natl Lab, Oak Ridge, TN USA
[5] Univ Tennessee, Dept Stat Operat & Management Sci, Knoxville, TN USA
Funding
U.S. National Science Foundation
Keywords
SELECTION; USAGE; BIAS;
DOI
10.1093/bioinformatics/bty138
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
AnaCoDa is an R package for estimating biologically relevant parameters of mixture models, such as selection against translation inefficiency, nonsense errors and ribosome pausing time, from genomic and high-throughput datasets. AnaCoDa provides an adaptive Bayesian MCMC algorithm, fully implemented in C++ for high performance, with an ergonomic R interface to improve usability. AnaCoDa employs a generic object-oriented design that allows users to extend the framework and implement their own models. The models currently implemented in AnaCoDa can accurately estimate biologically relevant parameters given either protein-coding sequences or ribosome footprinting data. Optionally, AnaCoDa can use additional data sources, such as gene expression measurements, to aid model fitting and parameter estimation. Because AnaCoDa uses a hierarchical object structure, some parameters can vary between sets of genes while others are shared. Genes may be assigned to clusters, or cluster membership may be estimated by AnaCoDa. This flexibility allows users to estimate the same model parameter under different biological conditions and to categorize genes into sets based on shared model properties embedded within the data. AnaCoDa also allows users to generate simulated data, which can be used to aid model development and analysis and to evaluate model adequacy. Finally, AnaCoDa contains a set of visualization routines and the ability to revisit or re-initiate previous model fits, providing researchers with a well-rounded, easy-to-use framework for analyzing genome-scale data.
Pages: 2496-2498
Page count: 3