Big data analytics in bioinformatics: architectures, techniques, tools and issues

Cited by: 23
Authors
Kashyap H. [1]
Ahmed H.A. [2]
Hoque N. [2]
Roy S. [3]
Bhattacharyya D.K. [2]
Affiliations
[1] Department of Computer Science, Donald Bren School of Information and Computer Sciences, University of California Irvine, 3019 Donald Bren Hall, Irvine, CA 92697-3435
[2] Department of Computer Science and Engineering, Tezpur University, Tezpur
[3] Department of Information Technology, North Eastern Hill University, Shillong
Keywords
Big data; Bioinformatics; Clustering; Gene regulatory network; Machine learning; MapReduce
DOI
10.1007/s13721-016-0135-4
Abstract
Bioinformatics research is characterized by voluminous and incremental datasets and complex data analytics methods. The machine learning methods used in bioinformatics are iterative and parallel. These methods can be scaled to handle big data using distributed and parallel computing technologies. Big data tools usually perform computation in batch mode and are not optimized for iterative processing or for high data dependency among operations. In recent years, parallel, incremental, and multi-view machine learning algorithms have been proposed. Similarly, graph-based architectures and in-memory big data tools have been developed to minimize I/O cost and optimize iterative processing. However, standard big data architectures are still lacking. Appropriate tools are also unavailable for many important bioinformatics problems, such as fast construction of co-expression and regulatory networks and salient module identification, detection of complexes over growing protein-protein interaction data, fast analysis of massive DNA, RNA, and protein sequence data, and fast querying on incremental and heterogeneous disease networks. This paper addresses the issues and challenges posed by several big data problems in bioinformatics, and gives an overview of the state of the art and of future research opportunities. © 2016, Springer-Verlag Wien.
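The abstract's point about batch-mode tools and iterative methods can be illustrated with a small sketch. The example below is not taken from the paper: it is a single-process Python stand-in for a MapReduce job, using k-means clustering as a representative iterative method. The function names (`map_phase`, `reduce_phase`, `kmeans_mapreduce`) and the toy data are hypothetical. Each loop iteration corresponds to one full map-and-reduce pass over the data, which in a batch framework would typically mean rereading the whole dataset every time.

```python
from collections import defaultdict


def map_phase(points, centroids):
    """Map step: emit (index of nearest centroid, point) for every point."""
    for p in points:
        idx = min(
            range(len(centroids)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
        )
        yield idx, p


def reduce_phase(pairs, centroids):
    """Reduce step: average the points assigned to each centroid."""
    groups = defaultdict(list)
    for idx, p in pairs:
        groups[idx].append(p)
    # A centroid that received no points keeps its previous position.
    new_centroids = list(centroids)
    for idx, pts in groups.items():
        new_centroids[idx] = tuple(sum(c) / len(pts) for c in zip(*pts))
    return new_centroids


def kmeans_mapreduce(points, centroids, max_iter=20):
    """Run k-means as repeated map/reduce passes; return (centroids, passes)."""
    passes = 0
    for _ in range(max_iter):
        # One iteration = one complete pass over all points (one batch job).
        new_centroids = reduce_phase(map_phase(points, centroids), centroids)
        passes += 1
        if new_centroids == centroids:  # converged: assignments are stable
            break
        centroids = new_centroids
    return centroids, passes


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
    start = [(0.0, 0.0), (10.0, 10.0)]
    print(kmeans_mapreduce(data, start))
```

Even on this four-point toy input the loop needs a second full pass just to confirm convergence. That repeated-scan cost is what in-memory and graph-based tools aim to reduce, since they can keep the working set resident between iterations.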