Two Novel Techniques for Space Compaction on Biological Sequences

Cited by: 1
Authors
Volis, George [1]
Makris, Christos [1]
Kanavos, Andreas [1]
Affiliation
[1] Univ Patras, Dept Comp Engn & Informat, Patras 26504, Greece
Keywords
Searching and Browsing; Web Information Filtering and Retrieval; Text Mining; Indexing Structures; Inverted Files; Index Compression; n-Gram Indexing; Sequence Analysis and Assembly; COMPUTATION
DOI
10.5220/0005801101050112
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
The number and size of genomic databases have grown rapidly in recent years, and so has the number of Internet-accessible databases. There is therefore a need for satisfactory methods of managing this growing body of information, and considerable effort has been devoted to the problem. As a contribution to this effort, this paper presents two algorithms that reduce the amount of space needed to store genomic information. Our first algorithm builds on the classic n-grams/2L technique for indexing a DNA sequence and converts the Inverted Index of that algorithm to a more compressed format. Researchers have revealed the existence of repeated and palindromic patterns in the DNA of living organisms. This observation is the main motivation for the technique, which proposes an alternative data structure for handling these sequences. Our experimental results show that our algorithm achieves a more efficient index than the n-grams/2L algorithm and can be adopted by any algorithm that is based on n-grams/2L. The second algorithm is based on the n-grams technique. Treating the four symbols of the DNA alphabet as the vertices of a square, it represents a DNA sequence as a relation between the vertices, sides and diagonals of that square. The experimental results show that this second approach compresses our index structure even further.
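As a rough illustration of the kind of structure the abstract refers to, the sketch below builds a plain n-gram inverted index over a DNA string and then gap-encodes each posting list. It is not the n-grams/2L structure or either of the paper's two algorithms: the function names `build_ngram_index` and `delta_encode` are hypothetical, and the gap encoding is a generic index-compression step assumed here only to show where space savings can come from.

```python
from collections import defaultdict


def build_ngram_index(sequence: str, n: int) -> dict[str, list[int]]:
    """Map every length-n substring of `sequence` to its start offsets."""
    index = defaultdict(list)
    for start in range(len(sequence) - n + 1):
        index[sequence[start:start + n]].append(start)
    return dict(index)


def delta_encode(positions: list[int]) -> list[int]:
    """Keep the first offset, then the gaps between consecutive offsets.

    Generic gap encoding (an assumption for illustration), not the
    compressed format proposed in the paper.
    """
    if not positions:
        return []
    return [positions[0]] + [b - a for a, b in zip(positions, positions[1:])]


# Toy input with an obvious repeat; real genomic sequences are far longer.
index = build_ngram_index("ACGTACGTAC", 3)
compressed = {gram: delta_encode(offsets) for gram, offsets in index.items()}
```

Repeated patterns show up as long posting lists with regular gaps, which is the kind of redundancy a compressed index can exploit.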
Pages: 105-112
Page count: 8