UFold: fast and accurate RNA secondary structure prediction with deep learning

Times cited: 90
Authors
Fu, Laiyi [1, 2]
Cao, Yingxin [2, 5, 6]
Wu, Jie [3]
Peng, Qinke [1]
Nie, Qing [4, 5, 6]
Xie, Xiaohui [2]
Affiliations
[1] Xi An Jiao Tong Univ, Sch Elect & Informat Engn, Syst Engn Inst, Xian 710049, Shaanxi, Peoples R China
[2] Univ Calif Irvine, Dept Comp Sci, Irvine, CA 92697 USA
[3] Univ Calif Irvine, Dept Biol Chem, Irvine, CA 92697 USA
[4] Univ Calif Irvine, Dept Math, Irvine, CA 92697 USA
[5] Univ Calif Irvine, Ctr Complex Biol Syst, Irvine, CA 92697 USA
[6] Univ Calif Irvine, NSF Simons Ctr Multiscale Cell Fate Res, Irvine, CA 92697 USA
Keywords
WEB SERVER; PROTEIN; DESIGN;
DOI
10.1093/nar/gkab1074
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
For many RNA molecules, the secondary structure is essential for the correct function of the RNA. Predicting RNA secondary structure from nucleotide sequences is a long-standing problem in genomics, but the prediction performance has reached a plateau over time. Traditional RNA secondary structure prediction algorithms are primarily based on thermodynamic models through free energy minimization, which imposes strong prior assumptions and is slow to run. Here, we propose a deep learning-based method, called UFold, for RNA secondary structure prediction, trained directly on annotated data and base-pairing rules. UFold proposes a novel image-like representation of RNA sequences, which can be efficiently processed by Fully Convolutional Networks (FCNs). We benchmark the performance of UFold on both within- and cross-family RNA datasets. It significantly outperforms previous methods on within-family datasets, while achieving a similar performance as the traditional methods when trained and tested on distinct RNA families. UFold is also able to predict pseudoknots accurately. Its prediction is fast with an inference time of about 160 ms per sequence up to 1500 bp in length. An online web server running UFold is available at . Code is available at .
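The abstract describes an image-like representation of RNA sequences but does not spell out how it is built. The sketch below illustrates one way such a representation can be formed, assuming (this is not stated in the abstract) that each of the 16 ordered base combinations gets its own L x L channel, obtained by multiplying one-hot encodings of positions i and j. The names `BASES`, `one_hot`, and `pair_image` are hypothetical and chosen for illustration only; this is not the authors' code.

```python
from itertools import product

# Assumed base ordering; any fixed ordering works for the illustration.
BASES = "AUCG"


def one_hot(seq):
    """Return an L x 4 one-hot matrix (list of lists) for an RNA sequence.

    Characters outside BASES (e.g. 'N') map to an all-zero row.
    """
    return [[1 if base == b else 0 for b in BASES] for base in seq.upper()]


def pair_image(seq):
    """Build a 16 x L x L 'image' from a sequence of length L.

    Channel a * 4 + b holds 1 at (i, j) exactly when position i carries
    base BASES[a] and position j carries base BASES[b], and 0 otherwise.
    """
    x = one_hot(seq)
    n = len(x)
    channels = []
    for a, b in product(range(4), repeat=2):
        channels.append([[x[i][a] * x[j][b] for j in range(n)] for i in range(n)])
    return channels


image = pair_image("GCAU")
print(len(image), len(image[0]), len(image[0][0]))  # 16 4 4
```

A tensor of this shape can be fed to a 2D convolutional network in the same way as a multi-channel image, which is why sequence length does not need to be fixed in advance.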
Pages: 12
Related papers
50 in total
  • [21] Deep learning models for RNA secondary structure prediction (probably) do not generalize across families
    Szikszai, Marcell
    Wise, Michael
    Datta, Amitava
    Ward, Max
    Mathews, David H.
    BIOINFORMATICS, 2022, 38 (16) : 3892 - 3899
  • [22] A Deep Learning Approach for Prediction of Protein Secondary Structure
    Zubair, Muhammad
    Hanif, Muhammad Kashif
    Alabdulkreem, Eatedal
    Ghadi, Yazeed
    Khan, Muhammad Irfan
    Sarwar, Muhammad Umer
    Hanif, Ayesha
    CMC-COMPUTERS MATERIALS & CONTINUA, 2022, 72 (02) : 3705 - 3718
  • [23] Protein Secondary Structure Prediction Based on Deep Learning
    Zheng, Lin
    Li, Hong-ling
    Wu, Nan
    Ao, Li
    3RD INTERNATIONAL SYMPOSIUM ON MECHATRONICS AND INDUSTRIAL INFORMATICS, (ISMII 2017), 2017, : 171 - 177
  • [24] IPknot: fast and accurate prediction of RNA secondary structures with pseudoknots using integer programming
    Sato, Kengo
    Kato, Yuki
    Hamada, Michiaki
    Akutsu, Tatsuya
    Asai, Kiyoshi
    BIOINFORMATICS, 2011, 27 (13) : I85 - I93
  • [25] Review of machine learning methods for RNA secondary structure prediction
    Zhao, Qi
    Zhao, Zheng
    Fan, Xiaoya
    Yuan, Zhengwei
    Mao, Qian
    Yao, Yudong
    PLOS COMPUTATIONAL BIOLOGY, 2021, 17 (08)
  • [26] Sequence similarity governs generalizability of de novo deep learning models for RNA secondary structure prediction
    Qiu, Xiangyun
    PLOS COMPUTATIONAL BIOLOGY, 2023, 19 (04)
  • [27] Deep Learning Method for RNA Secondary Structure Prediction with Pseudoknots Based on Large-Scale Data
    Shen, Bowen
    Zhang, Hao
    Li, Cong
    Zhao, Tianheng
    Liu, Yuanning
    JOURNAL OF HEALTHCARE ENGINEERING, 2021, 2021
  • [28] A combined approach to RNA secondary structure prediction based on deep learning and minimum free energy model
    Hu, Xiaoling
    Ou, Xiujuan
    Yao, Hong
    Wang, Jun
    Xiao, Yi
    COMMUNICATIONS IN INFORMATION AND SYSTEMS, 2022, 22 (03) : 363 - 382
  • [29] Fast learning optimized prediction methodology (FLOPRED) for protein secondary structure prediction
    Saraswathi, S.
    Fernandez-Martinez, J. L.
    Kolinski, A.
    Jernigan, R. L.
    Kloczkowski, A.
    JOURNAL OF MOLECULAR MODELING, 2012, 18 (09) : 4275 - 4289