An In-Depth Assessment of Sequence Clustering Software in Bioinformatics

Times cited: 0
Authors
Ju, Zhen [1, 2]
Wang, Mingyu [3]
Li, Xuelei [1]
Meng, Jintao [1]
Xi, Wenhui [1]
Wei, Yanjie [1, 4, 5]
Affiliations
[1] Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen 518005, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
[3] Shanxi Med Univ, Hosp 1, Dept Neurosurg, Taiyuan 030600, Peoples R China
[4] Shenzhen Inst Adv Technol, Shenzhen Key Lab Intelligent Bioinformat, Shenzhen 518055, Peoples R China
[5] Shenzhen Univ Adv Technol, Fac Comp Sci & Control Engn, Shenzhen 518055, Peoples R China
Funding
National Science Foundation (USA);
Keywords
Sequence clustering; Precision; Speed; Scalability; Memory consumption;
DOI
10.1007/978-981-97-5128-0_29
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Sequence clustering software is essential in bioinformatics, yet choosing the most suitable tool is difficult because the tools differ in algorithm design and in the bioinformatics applications they target. This paper reviews the development of the most representative sequence clustering software and evaluates eight representative tools on precision, speed, scalability, and memory consumption. It assesses the software against four thresholds: a normalized mutual information (NMI) score greater than 0.95, a running time of less than 1 min/h, a speedup exceeding 30 times on 64 cores, and memory consumption less than 3 times the dataset size, and it summarizes the results in a table that users can consult. Finally, taking OTU, tree of life building, and metagenomic analysis as examples, the paper shows how to analyze a scenario's requirements for clustering software and recommends the most suitable tool based on the evaluation results.
Pages: 359-370
Page count: 12