Genie: the first open-source ISO/IEC encoder for genomic data

Cited by: 0
Authors
Muentefering, Fabian [1]
Adhisantoso, Yeremia Gunawan [1]
Chandak, Shubham [2]
Ostermann, Joern [1]
Hernaez, Mikel [3]
Voges, Jan [1]
Affiliations
[1] Leibniz Univ Hannover, Inst Informationsverarbeitung TNT, Appelstr 9A, D-30167 Hannover, Germany
[2] Stanford Univ, Dept Elect Engn, 350 Jane Stanford Way, Stanford, CA 94305 USA
[3] Univ Navarra, Ctr Appl Med Res CIMA, Ave Pio XII 55, Pamplona 31008, Navarra, Spain
Keywords
COMPRESSION; FORMAT
DOI
10.1038/s42003-024-06249-8
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification code
07; 0710; 09
Abstract
For the last two decades, the amount of genomic data produced by scientific and medical applications has been growing at a rapid pace. To enable software solutions that analyze, process, and transmit these data in an efficient and interoperable way, ISO and IEC released the first version of the compression standard MPEG-G in 2019. However, non-proprietary implementations of the standard have not been openly available so far, limiting fair scientific assessment of the standard and, therefore, hindering its broad adoption. In this paper, we present Genie, to the best of our knowledge the first open-source encoder that compresses genomic data according to the MPEG-G standard. We demonstrate that Genie reaches state-of-the-art compression ratios while offering interoperability with any other standard-compliant decoder, independent of its manufacturer. Finally, the ISO/IEC ecosystem ensures the long-term sustainability and decodability of the compressed data through the ISO/IEC-supported reference decoder.

Summary: Genie, an open-source encoder compliant with ISO/IEC's MPEG-G standard, delivers state-of-the-art genomic data compression ratios and guarantees long-term data sustainability and decodability through interoperability.
Pages: 10