Analysing large biological data sets with an improved algorithm for MIC

Cited by: 10

Authors:

Wang, Shuliang [1]

Zhao, Yiping [1]

Affiliations:

[1] Beijing Inst Technol, Sch Software, Beijing 100081, Peoples R China

Source:

INTERNATIONAL JOURNAL OF DATA MINING AND BIOINFORMATICS | 2015 / Vol. 13 / No. 02

Keywords:

maximal information coefficient; improved algorithm for MIC; biological annotations; big data; NETWORKS; DISTANCES;

DOI:

10.1504/IJDMB.2015.071548

Chinese Library Classification (CLC) number:

Q [Biological Sciences];

Discipline classification codes:

07 ; 0710 ; 09 ;

Abstract:

The computational framework uses traditional similarity measures to find significant relationships among biological annotations. However, its prerequisite that the biological annotations do not co-occur with each other is restrictive. To overcome this limitation, this paper proposes a new method, the Improved Algorithm for Maximal Information Coefficient (IAMIC), to discover hidden regularities between biological annotations. IAMIC approximates a novel similarity coefficient based on the maximal information coefficient, retaining its generality and equitability, by improving the axis partition through quadratic optimisation instead of brute-force search. The experimental results show that IAMIC is better suited than other similarity measures to identifying associations between biological annotations and to extracting novel associations hidden in the collected data sets.
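To make the idea of a maximal information coefficient more concrete, the following is a minimal Python sketch of a simplified, brute-force grid search. It is not IAMIC and not the reference MIC algorithm: it assumes equal-frequency bins on each axis rather than optimised axis partitions, and the names `mic_grid_sketch`, `equal_frequency_bins`, `mutual_information` and the exponent `alpha=0.6` are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def mutual_information(x_bins, y_bins, a, b):
    """Mutual information (in nats) of two integer-labelled variables."""
    joint = np.zeros((a, b))
    np.add.at(joint, (x_bins, y_bins), 1.0)  # count points per grid cell
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)    # marginal over rows, shape (a, 1)
    py = joint.sum(axis=0, keepdims=True)    # marginal over columns, shape (1, b)
    nz = joint > 0                           # skip empty cells (0 * log 0 = 0)
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))


def equal_frequency_bins(v, k):
    """Assign each value to one of k roughly equal-sized rank bins."""
    ranks = np.argsort(np.argsort(v))
    return (ranks * k) // len(v)


def mic_grid_sketch(x, y, alpha=0.6):
    """Simplified MIC-style score: the largest normalised mutual
    information over a-by-b equal-frequency grids with a * b <= n**alpha.
    Assumption: fixed equal-frequency bins stand in for the optimised
    axis partitions that MIC and IAMIC search for."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    budget = max(4, int(n ** alpha))         # maximum number of grid cells
    best = 0.0
    for a in range(2, budget // 2 + 1):
        xb = equal_frequency_bins(x, a)
        for b in range(2, budget // a + 1):
            yb = equal_frequency_bins(y, b)
            mi = mutual_information(xb, yb, a, b)
            best = max(best, mi / np.log(min(a, b)))
    return best
```

Under these assumptions, a noiseless monotone relationship scores close to 1 and independent noise scores much lower. The exhaustive loop over grid sizes is the kind of brute-force search that the abstract says IAMIC replaces with quadratic optimisation of the axis partition.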

Pages: 158-170

Number of pages: 13
