Predicting transcriptional responses to heat and drought stress from genomic features using a machine learning approach in rice

Cited by: 3
Authors
Smet, Dajo [1, 2]
Opdebeeck, Helder [1, 2]
Vandepoele, Klaas [1, 2, 3]
Affiliations
[1] Univ Ghent, Dept Plant Biotechnol & Bioinformat, Ghent, Belgium
[2] VIB, Ctr Plant Syst Biol, Ghent, Belgium
[3] Univ Ghent, Bioinformat Inst Ghent, Ghent, Belgium
Keywords
rice; regulatory elements; regulation of heat stress; regulation of drought stress; machine learning interpretation; GENE-EXPRESSION; ARABIDOPSIS; NETWORKS; E2F
DOI
10.3389/fpls.2023.1212073
Chinese Library Classification number
Q94 [Botany]
Discipline classification code
071001
Abstract
Plants have evolved various mechanisms to adapt to adverse environmental stresses, such as the modulation of gene expression. Expression of stress-responsive genes is controlled by specific regulators, including transcription factors (TFs), that bind to sequence-specific binding sites, which are key components of cis-regulatory elements and regulatory networks. Our understanding of the underlying regulatory code remains, however, incomplete. Recent studies have shown that, by training machine learning (ML) algorithms on genomic sequence features, it is possible to predict which genes will transcriptionally respond to a specific stress. By identifying the features most important for gene expression prediction, these trained ML models can, in theory, help to further elucidate the regulatory code underlying the transcriptional response to abiotic stress. Here, we trained random forest ML models to predict gene expression in rice (Oryza sativa) in response to heat or drought stress. In addition to thoroughly assessing model performance and robustness across various input training data, we evaluated the importance of promoter and gene body sequence features for training ML models. The use of enriched promoter oligomers, complementing known TF binding sites, allowed us to gain novel insights into DNA motifs contributing to the stress regulatory code. By comparing genomic feature importance scores for drought and heat stress over time, we identified general and stress-specific genomic features contributing to the performance of the learned models, as well as their temporal variation. This study provides a solid foundation for building and interpreting ML models that accurately predict transcriptional responses, and enables novel insights into biological sequence features that are important for abiotic stress responses.
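As a rough illustration of what "enriched promoter oligomers" could mean in practice, the sketch below scores short oligomers (k-mers) by how much more often they occur in promoters of stress-responsive genes than in background promoters. It is not the authors' pipeline: the function names, the smoothed log2 ratio, the choice of k, and the toy sequences are all assumptions made for illustration. In a workflow like the one the abstract describes, per-gene presence of such oligomers could serve as input features for a random forest classifier.

```python
from collections import Counter
from math import log2


def kmer_presence(seq, k):
    """Return the set of distinct k-mers occurring in one sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_enrichment(responsive, background, k=4, pseudocount=1.0):
    """Score each k-mer by a smoothed log2 ratio of promoter fractions.

    Hypothetical scoring scheme: the fraction of responsive promoters
    containing the k-mer divided by the fraction of background promoters
    containing it, with a pseudocount to avoid division by zero.
    """
    pos = Counter()
    for seq in responsive:
        pos.update(kmer_presence(seq, k))
    neg = Counter()
    for seq in background:
        neg.update(kmer_presence(seq, k))

    scores = {}
    for kmer in set(pos) | set(neg):
        p = (pos[kmer] + pseudocount) / (len(responsive) + 2 * pseudocount)
        q = (neg[kmer] + pseudocount) / (len(background) + 2 * pseudocount)
        scores[kmer] = log2(p / q)
    return scores


# Toy promoter sequences, invented for this example (not real rice data).
responsive_promoters = ["ACGTGGCA", "TTACGTGA", "CACGTGTT"]
background_promoters = ["TTTTAAAA", "GGGGCCCC", "ATATATAT"]

scores = kmer_enrichment(responsive_promoters, background_promoters, k=4)

# List the oligomers most over-represented in the responsive set.
for kmer, score in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:5]:
    print(kmer, round(score, 2))
```

Positive scores mark oligomers over-represented in the responsive set and negative scores mark depleted ones. A real analysis would use a proper statistical test with multiple-testing correction rather than a raw ratio.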
Pages: 18