Analysing large biological data sets with an improved algorithm for MIC

Cited by: 10
Authors
Wang, Shuliang [1]
Zhao, Yiping [1]
Affiliations
[1] Beijing Inst Technol, Sch Software, Beijing 100081, Peoples R China
Keywords
maximal information coefficient; improved algorithm for MIC; biological annotations; big data; NETWORKS; DISTANCES;
DOI
10.1504/IJDMB.2015.071548
Chinese Library Classification
Q [Biological Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
The computational framework uses traditional similarity measures to find significant relationships among biological annotations. However, its prerequisite that the biological annotations do not co-occur with each other is restrictive. To overcome this limitation, this paper proposes a new method, the Improved Algorithm for Maximal Information Coefficient (IAMIC), to discover hidden regularities between biological annotations. IAMIC approximates a novel similarity coefficient based on the maximal information coefficient, retaining its generality and equitability, by improving the axis partition through quadratic optimisation instead of brute-force search. The experimental results show that IAMIC is more appropriate than other similarity measures for identifying associations between biological annotations and for extracting novel associations hidden in the collected data sets.
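For orientation, the sketch below illustrates the brute-force style of grid search that the abstract contrasts with quadratic optimisation. It is not IAMIC and not the authors' code. It assumes the commonly used MIC definition (the maximum, over grids with at most n^0.6 cells, of mutual information divided by the log of the smaller grid dimension) and, as a simplifying assumption, only tries equal-frequency partitions on each axis. The names `equal_frequency_bins`, `mutual_information`, and `naive_mic` are hypothetical.

```python
import math
from collections import Counter


def equal_frequency_bins(values, n_bins):
    """Assign each value a bin index so bins hold roughly equal counts.

    Simplification: tied values may land in different bins.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, idx in enumerate(order):
        bins[idx] = rank * n_bins // len(values)
    return bins


def mutual_information(bx, by):
    """Mutual information (in nats) of two discrete label sequences."""
    n = len(bx)
    joint = Counter(zip(bx, by))
    px = Counter(bx)
    py = Counter(by)
    mi = 0.0
    for (i, j), c in joint.items():
        # p(i,j) * log( p(i,j) / (p(i) * p(j)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (px[i] * py[j]))
    return mi


def naive_mic(x, y, alpha=0.6):
    """Brute-force MIC-style score restricted to equal-frequency grids.

    Assumption: the grid-size cap is B(n) = n**alpha, with alpha = 0.6.
    """
    n = len(x)
    budget = max(4, int(n ** alpha))
    best = 0.0
    for nx in range(2, budget // 2 + 1):
        bx = equal_frequency_bins(x, nx)
        for ny in range(2, budget // nx + 1):
            by = equal_frequency_bins(y, ny)
            # Normalise so that a noiseless functional relation scores 1
            score = mutual_information(bx, by) / math.log(min(nx, ny))
            best = max(best, score)
    return best


if __name__ == "__main__":
    xs = list(range(100))
    ys = [2 * v + 1 for v in xs]  # noiseless linear relationship
    print(naive_mic(xs, ys))
```

Because every admissible grid shape is enumerated, the cost grows quickly with the number of samples, which is the kind of exhaustive search the abstract says IAMIC avoids by optimising the axis partition instead.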
Pages: 158-170
Page count: 13