Bwasw-Cloud: Efficient Sequence Alignment Algorithm for Two Big Data with MapReduce

Cited by: 0
Authors
Sun, Mingming [1 ]
Zhou, Xuehai [1 ]
Yang, Feng [1 ]
Lu, Kun [1 ]
Dai, Dong [2 ]
Affiliations
[1] Univ Sci & Technol China, Comp Sci, Hefei 230026, Peoples R China
[2] Texas Tech Univ, Comp Sci, Lubbock, TX 79409 USA
Funding
China Postdoctoral Science Foundation; US National Science Foundation;
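Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code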
0812 ;
Abstract
Recent next-generation sequencing machines generate sequences at an unprecedented rate, and the sequences they produce, called reads, are no longer short. The reference sequences against which reads are aligned are also increasingly large. Efficiently mapping a large number of long reads to big reference sequences poses a new challenge for sequence alignment: the algorithm must now match two big data sets against each other. To address this problem, we propose a new parallel sequence alignment algorithm called Bwasw-Cloud, optimized for aligning long reads against large sequence data (e.g., the human genome). It is modeled after the widely used BWA-SW algorithm and uses Hadoop, the open-source implementation of MapReduce. The results show that Bwasw-Cloud can match two big data sets effectively and quickly on an ordinary cluster.
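The abstract does not say how Bwasw-Cloud divides work between map and reduce tasks, so the following is only a generic sketch of the map/reduce pattern applied to read alignment, not the authors' method. It assumes the reference is cut into overlapping chunks, each map call searches one chunk, and a reduce step merges hits per read. Exact substring matching stands in for BWA-SW, plain Python stands in for Hadoop, and every function and parameter name here is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def split_reference(reference: str, chunk_size: int, overlap: int) -> List[Tuple[int, str]]:
    """Cut the reference into overlapping chunks tagged with their start offset."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [(start, reference[start:start + chunk_size])
            for start in range(0, len(reference), step)]


def map_align(offset: int, chunk: str, reads: Dict[str, str]) -> Iterator[Tuple[str, int]]:
    """Map step: emit (read_id, global_position) for every exact hit in one chunk.

    Assumption: exact matching is a placeholder for a real aligner such as BWA-SW.
    """
    for read_id, read in reads.items():
        pos = chunk.find(read)
        while pos != -1:
            yield read_id, offset + pos
            pos = chunk.find(read, pos + 1)


def reduce_hits(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Reduce step: group hits by read and drop duplicates from overlapping chunks."""
    grouped = defaultdict(set)
    for read_id, position in pairs:
        grouped[read_id].add(position)
    return {read_id: sorted(positions) for read_id, positions in grouped.items()}


def align_reads(reference: str, reads: Dict[str, str],
                chunk_size: int, overlap: int) -> Dict[str, List[int]]:
    """Run the map step over every chunk, then reduce; sequential here for clarity."""
    pairs: List[Tuple[str, int]] = []
    for offset, chunk in split_reference(reference, chunk_size, overlap):
        pairs.extend(map_align(offset, chunk, reads))
    return reduce_hits(pairs)


if __name__ == "__main__":
    toy_reads = {"r1": "ACGT", "r2": "TTGA", "r3": "GGGG"}
    print(align_reads("ACGTACGTTTGACCA", toy_reads, chunk_size=8, overlap=4))
```

In this sketch the overlap must be at least the longest read length minus one, otherwise a hit that straddles a chunk boundary would be missed. Reads with no hit (such as `r3` in the toy input) are simply absent from the result.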
Pages: 213 - 218
Number of pages: 6
Related papers
50 items in total
  • [31] Efficient Querying Distributed Big-XML Data using MapReduce
    Song Kunfang
    Hongwei Lu
    INTERNATIONAL JOURNAL OF GRID AND HIGH PERFORMANCE COMPUTING, 2016, 8 (03) : 70 - 79
  • [32] Strategic alignment of Cloud-based Architectures for Big Data
    Schmidt, Rainer
    Moehring, Michael
    17TH IEEE INTERNATIONAL ENTERPRISE DISTRIBUTED OBJECT COMPUTING CONFERENCE WORKSHOPS (EDOCW 2013), 2013, : 136 - 143
  • [33] AMPO: Algorithm for MapReduce Performance Optimization for Enhancing Big Data Analytics
    Yambem, Nandita
    Nandakumar, A. N.
    2017 INTERNATIONAL CONFERENCE ON ELECTRICAL, ELECTRONICS, COMMUNICATION, COMPUTER, AND OPTIMIZATION TECHNIQUES (ICEECCOT), 2017, : 717 - 723
  • [34] Big Data Prediction Framework for Weather Temperature Based on MapReduce Algorithm
    Ismail, Khalid Adam
    Majid, Mazlina Abdul
    Zain, Jasni Mohamed
    Abu Bakar, Noor Akma
    2016 IEEE CONFERENCE ON OPEN SYSTEMS, 2016, : 13 - 17
  • [35] Efficient MapReduce Kernel k-Means for Big Data Clustering
    Tsapanos, Nikolaos
    Tefas, Anastasios
    Nikolaidis, Nikolaos
    Pitas, Ioannis
    9TH HELLENIC CONFERENCE ON ARTIFICIAL INTELLIGENCE (SETN 2016), 2016,
  • [36] A Similar Duplicate Record Detection Algorithm for Big Data Based on MapReduce
    Song R.
    Yu T.
    Chen Y.
    Chen Y.
    Xia B.
    Shanghai Jiaotong Daxue Xuebao/Journal of Shanghai Jiaotong University, 2018, 52 (02): : 214 - 221
  • [37] An Enhanced Memetic Algorithm for Feature Selection in Big Data Analytics with MapReduce
    Ramakrishnan, Umanesan
    Nachimuthu, Nandhagopal
    INTELLIGENT AUTOMATION AND SOFT COMPUTING, 2022, 31 (03): : 1547 - 1559
  • [38] A Top-k Query Algorithm for Big Data Based on MapReduce
    Lin, Xueyan
    PROCEEDINGS OF 2015 6TH IEEE INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING AND SERVICE SCIENCE, 2015, : 982 - 985
  • [39] Parallel Clustering Optimization Algorithm Based on MapReduce in Big Data Mining
    Zhang, Huajie
    Song, Lei
    Zhang, Sen
    IAENG International Journal of Applied Mathematics, 2023, 53 (01):
  • [40] Efficient finer-grained incremental processing with MapReduce for big data
    Zhang, Liang
    Feng, Yuanyuan
    Shen, Peiyi
    Zhu, Guangming
    Wei, Wei
    Song, Juan
    Shah, Syed Afaq Ali
    Bennamoun, Mohammed
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2018, 80 : 102 - 111