PlasmidHunter: accurate and fast prediction of plasmid sequences using gene content profile and machine learning

Cited by: 2
Authors
Tian, Renmao [1]
Zhou, Jizhong [2]
Imanian, Behzad [1,3]
Affiliations
[1] IIT, Inst Food Safety & Hlth, 6502 S Archer Rd, Bedford Pk, IL 60501 USA
[2] Univ Oklahoma, Inst Environm Genom, Dept Microbiol & Plant Biol, 101 David Boren Blvd, Norman, OK 73019 USA
[3] IIT, Food Sci & Nutr Dept, 10 West 35th St, Chicago, IL 60616 USA
Keywords
artificial intelligence (AI); machine learning (ML); plasmid prediction; genomic sequencing; RESISTANCE
DOI
10.1093/bib/bbae322
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Plasmids are extrachromosomal DNA found in microorganisms. They often carry beneficial genes that help bacteria adapt to harsh conditions. Plasmids are also important tools in genetic engineering, gene therapy, and drug production. However, it can be difficult to distinguish plasmid sequences from chromosomal sequences in genomic and metagenomic data. Here, we have developed a new tool called PlasmidHunter, which uses machine learning to predict plasmid sequences based on gene content profiles. PlasmidHunter achieves high accuracy (up to 97.6%) and high speed in benchmark tests on both simulated contigs and real metagenomic plasmidome data, outperforming other existing tools.
Pages: 9