Comparison of codon usage measures and their applicability in prediction of microbial gene expressivity

Cited by: 95
Authors
Supek, F
Vlahovicek, K
Affiliations
[1] Univ Zagreb, Fac Sci, Div Biol, Dept Mol Biol, Zagreb 10000, Croatia
[2] Int Ctr Genet Engn & Biotechnol, I-34012 Trieste, Italy
DOI
10.1186/1471-2105-6-182
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Background: There are a number of methods (also called: measures) currently in use that quantify codon usage in genes. These measures are often influenced by other sequence properties, such as length. This can introduce strong methodological bias into measurements; therefore we attempted to develop a method free from such dependencies. One of the common applications of codon usage analyses is to quantitatively predict gene expressivity.
Results: We compared the performance of several commonly used measures and a novel method we introduce in this paper - Measure Independent of Length and Composition (MILC). Large, randomly generated sequence sets were used to test for dependence on (i) sequence length, (ii) overall amount of codon bias and (iii) codon bias discrepancy in the sequences. A derivative of the method, named MELP (MILC-based Expression Level Predictor) can be used to quantitatively predict gene expression levels from genomic data. It was compared to other similar predictors by examining their correlation with actual, experimentally obtained mRNA or protein abundances.
Conclusion: We have established that MILC is a generally applicable measure, being resistant to changes in gene length and overall nucleotide composition, and introducing little noise into measurements. Other methods, however, may also be appropriate in certain applications. Our efforts to quantitatively predict gene expression levels in several prokaryotes and unicellular eukaryotes met with varying levels of success, depending on the experimental dataset and predictor used. Out of all methods, MELP and Rainer Merkl's GCB method had the most consistent behaviour. A 'reference set' containing known ribosomal protein genes appears to be a valid starting point for a codon usage-based expressivity prediction.
Pages: 15