A self-supervised deep learning method for data-efficient training in genomics

Authors
Hüseyin Anil Gündüz
Martin Binder
Xiao-Yin To
René Mreches
Bernd Bischl
Alice C. McHardy
Philipp C. Münch
Mina Rezaei
Affiliations
[1] Department of Statistics, LMU Munich
[2] Munich Center for Machine Learning
[3] Department for Computational Biology of Infection Research, Helmholtz Center for Infection Research
[4] Braunschweig Integrated Centre of Systems Biology (BRICS), Technische Universität Braunschweig
[5] German Center for Infection Research (DZIF), partner site Hannover Braunschweig
[6] Department of Biostatistics, Harvard School of Public Health
Abstract
Deep learning in bioinformatics is often limited to problems where large amounts of labeled data are available for supervised classification. By exploiting unlabeled data, self-supervised learning techniques can improve the performance of machine learning models when labeled data are limited. Although many self-supervised learning methods have been proposed, they do not exploit the unique characteristics of genomic data. Therefore, we introduce Self-GenomeNet, a self-supervised learning technique that is custom-tailored for genomic data. Self-GenomeNet leverages reverse-complement sequences and effectively learns short- and long-term dependencies by predicting targets of different lengths. Self-GenomeNet performs better than other self-supervised methods in data-scarce genomic tasks and outperforms standard supervised training while using roughly ten times less labeled training data. Furthermore, the learned representations generalize well to new datasets and tasks. These findings suggest that Self-GenomeNet is well suited for large-scale, unlabeled genomic datasets and could substantially improve the performance of genomic models.
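The two ideas named in the abstract, reverse-complement sequences and targets of different lengths, can be illustrated with a small sketch. This is not the authors' implementation: the function names `reverse_complement` and `split_context_targets`, the example sequence, and the choice to cut targets from the end of the sequence are all assumptions made here for illustration only.

```python
# Illustrative sketch only; names and splitting scheme are hypothetical,
# not taken from the Self-GenomeNet paper or code.

# Translation table mapping each base to its complement (N stays N).
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A<->T, C<->G)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def split_context_targets(seq, target_lengths):
    """Split seq into (context, target) pairs, one per target length.

    The target is the last `length` bases and the context is everything
    before it. Lengths that leave no context or no target are skipped.
    """
    pairs = []
    for length in target_lengths:
        if 0 < length < len(seq):
            pairs.append((seq[:-length], seq[-length:]))
    return pairs


if __name__ == "__main__":
    seq = "ATGCGTAC"
    print(reverse_complement(seq))
    print(split_context_targets(seq, [2, 4]))
```

In a self-supervised setup of this kind, a model would be asked to predict (a representation of) each target from its context, so shorter targets emphasize local patterns and longer targets emphasize longer-range dependencies; how Self-GenomeNet actually builds and scores these pairs is described in the paper itself.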