Analysing large biological data sets with an improved algorithm for MIC

Cited by: 10
Authors
Wang, Shuliang [1]
Zhao, Yiping [1]
Affiliation
[1] Beijing Inst Technol, Sch Software, Beijing 100081, Peoples R China
Keywords
maximal information coefficient; improved algorithm for MIC; biological annotations; big data; NETWORKS; DISTANCES;
DOI
10.1504/IJDMB.2015.071548
Chinese Library Classification
Q [Biological Sciences];
Discipline classification codes
07; 0710; 09;
Abstract
The computational framework uses traditional similarity measures to find significant relationships among biological annotations. However, its prerequisite, that the biological annotations do not co-occur with each other, is restrictive. To overcome this limitation, this paper proposes a new method, the Improved Algorithm for Maximal Information Coefficient (IAMIC), to discover hidden regularities between biological annotations. IAMIC approximates a novel similarity coefficient based on the maximal information coefficient, with generality and equitability, by improving the axis partition through quadratic optimisation instead of brute-force search. The experimental results show that IAMIC is more appropriate than other similarity measures for identifying associations between biological annotations and for extracting novel associations hidden in the collected data sets.
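For orientation, the exhaustive grid search that the abstract contrasts IAMIC against can be sketched as follows. This is a minimal illustration of the general maximal information coefficient idea (mutual information on a grid, normalised by the logarithm of the smaller grid dimension, maximised over grid sizes), not the authors' IAMIC and not a reference MIC implementation. The function names, the equal-frequency binning, and the grid budget exponent `alpha=0.6` are assumptions made for this sketch.

```python
import math
from collections import Counter


def equal_frequency_bins(values, n_bins):
    """Assign each value a bin index so that bins hold roughly equal counts.

    Assumption for this sketch: ties are broken by sort order, so equal
    values may land in different bins.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(values)
    return bins


def mutual_information(xb, yb):
    """Mutual information, in bits, of two discrete label sequences."""
    n = len(xb)
    joint = Counter(zip(xb, yb))
    px = Counter(xb)
    py = Counter(yb)
    mi = 0.0
    for (a, b), c in joint.items():
        mi += (c / n) * math.log2(c * n / (px[a] * py[b]))
    return mi


def naive_mic(x, y, alpha=0.6):
    """Brute-force MIC-style score over all grids with nx * ny <= n ** alpha.

    Each grid uses equal-frequency partitions on both axes; a real MIC
    approximation would optimise the partition instead.
    """
    budget = len(x) ** alpha
    best = 0.0
    nx = 2
    while nx * 2 <= budget:
        ny = 2
        while nx * ny <= budget:
            mi = mutual_information(
                equal_frequency_bins(x, nx), equal_frequency_bins(y, ny)
            )
            # Normalise so the score lies in [0, 1].
            best = max(best, mi / math.log2(min(nx, ny)))
            ny += 1
        nx += 1
    return best
```

Calling `naive_mic(x, y)` on two equal-length numeric lists returns a score between 0 and 1. The cost grows with the number of candidate grids, which is the motivation for replacing exhaustive search with an optimisation step over the axis partition.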
Pages: 158-170
Number of pages: 13
Related papers
50 in total
  • [21] NON-LINEAR MAPPING ALGORITHM FOR LARGE DATA SETS
    SCHACHTER, B
    COMPUTER GRAPHICS AND IMAGE PROCESSING, 1978, 8 (02): : 271 - 276
  • [22] An Efficient Algorithm for Discovering Motifs in Large DNA Data Sets
    Yu, Qiang
    Huo, Hongwei
    Chen, Xiaoyang
    Guo, Haitao
    Vitter, Jeffrey Scott
    Huan, Jun
    IEEE TRANSACTIONS ON NANOBIOSCIENCE, 2015, 14 (05) : 535 - 544
  • [23] A FAST ALGORITHM FOR TRANSPOSING LARGE MULTIDIMENSIONAL IMAGE DATA SETS
    VANHEEL, M
    ULTRAMICROSCOPY, 1991, 38 (01) : 75 - 83
  • [24] An Efficient Motif Finding Algorithm for Large DNA Data Sets
    Yu, Qiang
    Huo, Hongwei
    Chen, Xiaoyang
    Guo, Haitao
    Vitter, Jeffrey Scott
    Huan, Jun
    2014 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2014,
  • [25] Fuzzy granular principal curves algorithm for large data sets
    Zhang, Hongyun
    Miao, Duoqian
    Pedrycz, Witold
    PROCEEDINGS OF THE 2013 JOINT IFSA WORLD CONGRESS AND NAFIPS ANNUAL MEETING (IFSA/NAFIPS), 2013, : 956 - 961
  • [26] An Approximate Median Polish Algorithm for Large Multidimensional Data Sets
    Daniel Barbará
    Xintao Wu
    Knowledge and Information Systems, 2003, 5 (4) : 416 - 438
  • [27] Preprocessing Large Data Sets by the Use of Quick Sort Algorithm
    Wozniak, Marcin
    Marszalek, Zbigniew
    Gabryel, Marcin
    Nowicki, Robert K.
    KNOWLEDGE, INFORMATION AND CREATIVITY SUPPORT SYSTEMS: RECENT TRENDS, ADVANCES AND SOLUTIONS, KICSS 2013, 2016, 364 : 111 - 121
  • [28] Parallel Clustering Algorithm for Large Data Sets with Applications in Bioinformatics
    Olman, Victor
    Mao, Fenglou
    Wu, Hongwei
    Xu, Ying
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2009, 6 (02) : 344 - 352
  • [29] Modified Merge Sort Algorithm for Large Scale Data Sets
    Wozniak, Marcin
    Marszalek, Zbigniew
    Gabryel, Marcin
    Nowicki, Robert K.
    ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING, PT II, 2013, 7895 : 612 - +
  • [30] An improved association rule mining algorithm for large data
    Zhao, Zhenyi
    Jian, Zhou
    Gaba, Gurjot Singh
    Alroobaea, Roobaea
    Masud, Mehedi
    Rubaiee, Saeed
    JOURNAL OF INTELLIGENT SYSTEMS, 2021, 30 (01) : 750 - 762