Bioinformatics features based DNA Sequence data compression algorithm

Cited by: 0
Authors
Ji, Zhen [1]
Zhou, Jia-Rui [2]
Zhu, Ze-Xuan [1]
Wu, Q.H. [3]
Affiliations
[1] College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, Guangdong 518060, China
[2] College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, Zhejiang 310027, China
[3] Department of Electrical Engineering and Electronics, The University of Liverpool, Liverpool, L69 3GJ, United Kingdom
Source
Keywords
DNA sequences - Data compression - Markov processes - Clustering algorithms - DNA - Benchmarking
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
This paper proposes BioLZMA, a novel DNA sequence data compression algorithm based on bioinformatics features. In BioLZMA, the DNA sequence data are sliced and regrouped into four clusters according to their biological meaning: the coding sequence cluster, the intron cluster, the RNA cluster, and the residual cluster. Each cluster is pre-processed with a compression strategy targeted at it, and the clusters are then compressed separately with the LZMA algorithm. Experimental results on benchmark sequences show that BioLZMA outperforms existing DNA compression algorithms. In particular, on long DNA sequences with significant bioinformatics features, BioLZMA achieves a higher compression ratio with little computation time.
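The cluster-then-compress idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the annotation format `(start, end, label)`, the label names, and the helper functions are all hypothetical, and the cluster-specific pre-processing strategies are omitted because the abstract does not describe them. Only the standard-library `lzma` module is used.

```python
import lzma

# Hypothetical labels for the four clusters named in the abstract.
CLUSTERS = ("cds", "intron", "rna", "residual")


def split_into_clusters(sequence, annotations):
    """Group annotated regions by label; unannotated bases go to 'residual'.

    annotations: list of (start, end, label) tuples with half-open,
    non-overlapping ranges (an assumed format, not taken from the paper).
    """
    parts = {name: [] for name in CLUSTERS}
    pos = 0
    for start, end, label in sorted(annotations):
        if start > pos:
            # Bases between annotated regions belong to the residual cluster.
            parts["residual"].append(sequence[pos:start])
        parts[label].append(sequence[start:end])
        pos = end
    if pos < len(sequence):
        parts["residual"].append(sequence[pos:])
    return {name: "".join(chunks) for name, chunks in parts.items()}


def compress_clusters(clusters):
    """Compress each cluster separately with LZMA."""
    return {name: lzma.compress(text.encode("ascii"))
            for name, text in clusters.items()}


def decompress_clusters(blobs):
    """Invert compress_clusters, recovering each cluster's text."""
    return {name: lzma.decompress(blob).decode("ascii")
            for name, blob in blobs.items()}
```

In this sketch, rebuilding the original sequence would also require storing the annotations alongside the compressed clusters; how BioLZMA records that layout is not stated in the abstract.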
Pages: 991-995