Structural representations of DNA regulatory substrates can enhance sequence-based algorithms by associating functional sequence variants

Cited by: 1
Authors
Zrimec, Jan [1]
Affiliations
[1] Chalmers Univ Technol, Gothenburg, Sweden
Keywords
Regulatory genomics; DNA structural properties; Bio-algorithms; PREDICTION; SHAPE
DOI
10.1145/3388440.3412482
Chinese Library Classification
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
The nucleotide sequence representation of DNA can be inadequate for resolving protein-DNA binding sites and regulatory substrates, such as those involved in gene expression and horizontal gene transfer. Considering that sequence-like representations are algorithmically very useful, here we fused over 60 currently available DNA physicochemical and conformational variables into compact structural representations that can encode anything from single DNA binding sites to whole regulatory regions. We find that the main structural components reflect key properties of protein-DNA interactions and can be condensed to the amount of information found in a single nucleotide position. The most accurate structural representations compress functional DNA sequence variants by 30% to 50%, as each instance encodes from tens to thousands of sequences. We show that a structural distance function discriminates among groups of DNA substrates more accurately than nucleotide sequence-based metrics. As this opens up a variety of implementation possibilities, we develop and test a distance-based alignment algorithm, demonstrating the potential of using the structural representations to enhance sequence-based algorithms. Because most current bioinformatic methods are biased toward nucleotide sequence representations, considerable performance increases may still be achievable with such solutions.
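The idea of a structural distance can be illustrated with a minimal sketch. It is not the paper's method: instead of the 60+ variables mentioned in the abstract, it assumes a single toy property (the number of Watson-Crick hydrogen bonds per base pair), and the function names `encode`, `structural_distance`, and `hamming_distance` are hypothetical. It shows only how several nucleotide sequences can collapse onto one structural code, so that a structural distance and a sequence distance can disagree.

```python
import math

# Assumption: one simple per-position property stands in for the many
# physicochemical and conformational variables described in the abstract.
# Watson-Crick hydrogen bonds per base pair: A-T has 2, G-C has 3.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def encode(seq):
    """Map a DNA sequence to a per-position profile of the toy property."""
    return [H_BONDS[base] for base in seq.upper()]


def structural_distance(a, b):
    """Euclidean distance between the toy profiles of two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(encode(a), encode(b))))


def hamming_distance(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a.upper(), b.upper()))


# "ACGT" and "TCGA" differ at two nucleotide positions but share the same
# toy structural profile [2, 3, 3, 2], so their structural distance is zero.
print(hamming_distance("ACGT", "TCGA"))     # 2
print(structural_distance("ACGT", "TCGA"))  # 0.0
```

With a property this coarse, many distinct sequences share one profile. A realistic representation would combine many variables, typically over neighbouring positions, which this sketch does not attempt.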
Pages: 6