EBIC: an open source software for high-dimensional and big data analyses

Cited by: 8

Authors:

Orzechowski, Patryk [1, 2]

Moore, Jason H. [1]

Affiliations:

[1] Univ Penn, Inst Biomed Informat, Philadelphia, PA 19104 USA

[2] AGH Univ Sci & Technol, Dept Automat & Robot, PL-30059 Krakow, Poland

Source:

BIOINFORMATICS | 2019 / Volume 35 / Issue 17

Funding:

US National Institutes of Health (NIH)

Keywords:

DOI:

10.1093/bioinformatics/btz027

Chinese Library Classification (CLC) number:

Q5 [Biochemistry]

Subject classification codes:

071010; 081704

Abstract:

Motivation: In this paper, we present an open source package with the latest release of Evolutionary-based BIClustering (EBIC), a next-generation biclustering algorithm for mining genetic data. The major contribution of this paper is full support for multiple graphics processing units (GPUs), which makes it possible to run large genomic data mining analyses efficiently. Enhancements over the first release of the algorithm include integration with R and Bioconductor and an option to exclude missing values from the analysis.

Results: EBIC was applied to datasets of different sizes, including a large DNA methylation dataset with 436 444 rows. For the largest dataset, we observed an over 6.6-fold speedup in computation time on a cluster of eight GPUs compared with running the method on a single GPU. This demonstrates the high scalability of the method.
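The reported speedup can be put in context with a little arithmetic. The sketch below is not part of the paper or of the EBIC package; the function names are hypothetical. It assumes the single-GPU run is the baseline and treats the "over 6.6-fold" figure as exactly 6.6, so the results are rough bounds rather than reported measurements. It computes the parallel efficiency (speedup divided by the number of GPUs) and the Karp-Flatt estimate of the fraction of work that did not parallelize.

```python
# Illustrative helpers only; these names are not from the EBIC package.

def parallel_efficiency(speedup: float, n_units: int) -> float:
    """Fraction of ideal linear speedup achieved on n_units devices."""
    return speedup / n_units


def karp_flatt(speedup: float, n_units: int) -> float:
    """Karp-Flatt metric: experimentally determined non-parallel fraction."""
    return (1.0 / speedup - 1.0 / n_units) / (1.0 - 1.0 / n_units)


if __name__ == "__main__":
    # Values from the abstract: over 6.6-fold speedup on eight GPUs,
    # relative to a single GPU (assumed here to be the baseline).
    speedup, gpus = 6.6, 8
    print(f"parallel efficiency: {parallel_efficiency(speedup, gpus):.3f}")
    print(f"Karp-Flatt non-parallel fraction: {karp_flatt(speedup, gpus):.3f}")
```

Under these assumptions, eight GPUs deliver at least about 82.5% of ideal linear scaling, with roughly 3% or less of the single-GPU runtime behaving as non-parallel overhead.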


Pages: 3181-3183

Page count: 3

