EBIC: an open source software for high-dimensional and big data analyses

Cited by: 8
Authors
Orzechowski, Patryk [1, 2]
Moore, Jason H. [1 ]
Affiliations
[1] Univ Penn, Inst Biomed Informat, Philadelphia, PA 19104 USA
[2] AGH Univ Sci & Technol, Dept Automat & Robot, PL-30059 Krakow, Poland
Funding
US National Institutes of Health
Keywords
DOI
10.1093/bioinformatics/btz027
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: In this paper, we present an open source package with the latest release of Evolutionary-based BIClustering (EBIC), a next-generation biclustering algorithm for mining genetic data. The major contribution of this paper is full support for multiple graphics processing units (GPUs), which makes it possible to run large genomic data mining analyses efficiently. Further enhancements over the first release of the algorithm include integration with R and Bioconductor and an option to exclude missing values from the analysis. Results: EBIC was applied to datasets of different sizes, including a large DNA methylation dataset with 436 444 rows. For the largest dataset, we observed an over 6.6-fold speedup in computation time on a cluster of eight GPUs compared to running the method on a single GPU. This demonstrates the high scalability of the method.
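To put the reported scaling in context, the speedup can be converted into a parallel efficiency (speedup divided by the number of devices). The snippet below is only an illustrative sketch: the function name `parallel_efficiency` is hypothetical and not part of EBIC, and the only inputs are the two figures quoted in the abstract (a 6.6-fold speedup on eight GPUs). Because the abstract reports "over 6.6-fold", the result should be read as a lower bound.

```python
def parallel_efficiency(speedup: float, n_devices: int) -> float:
    """Return speedup divided by device count (1.0 means ideal linear scaling)."""
    if n_devices < 1:
        raise ValueError("n_devices must be at least 1")
    return speedup / n_devices


# Figures taken from the abstract: 6.6-fold speedup on eight GPUs
# relative to a single GPU. Nothing else about the run is assumed.
efficiency = parallel_efficiency(6.6, 8)
print(f"parallel efficiency: {efficiency:.1%}")  # prints "parallel efficiency: 82.5%"
```

An efficiency of roughly 82.5% on eight devices means each additional GPU contributed most, though not all, of its ideal share of the work.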
Pages: 3181-3183
Number of pages: 3