A Novel Machine-Learning Approach to Predict Stress-Responsive Genes in Arabidopsis

Cited by: 2
Authors
Nazari, Leyla [1]
Ghotbi, Vida [2]
Nadimi, Mohammad [3]
Paliwal, Jitendra [3]
Affiliations
[1] Agr Res Educ & Extens Org AREEO, Fars Agr & Nat Resources Res & Educ Ctr, Crop & Hort Sci Res Dept, Shiraz 7155863511, Iran
[2] Agr Res Educ & Extens Org AREEO, Seed & Plant Improvement Inst, Karaj 3135933151, Iran
[3] Univ Manitoba, Dept Biosyst Engn, Winnipeg, MB R3T 5V6, Canada
Keywords
LASSO; information gain; ReliefF; classifiers; random forest; SELECTION; TRANSCRIPTOMICS; EXPRESSION; TIME
DOI
10.3390/a16090407
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This study proposes a hybrid gene selection method to identify and predict key genes in Arabidopsis associated with various stresses (including salt, heat, cold, high-light, and flagellin), aiming to enhance crop tolerance. An open-source microarray dataset (GSE41935) comprising 207 samples and 30,380 genes was analyzed using several machine learning tools, including the synthetic minority oversampling technique (SMOTE), information gain (IG), ReliefF, and the least absolute shrinkage and selection operator (LASSO), along with various classifiers (BayesNet, logistic, multilayer perceptron, sequential minimal optimization (SMO), and random forest). We identified 439 differentially expressed genes (DEGs), of which only three were down-regulated (AT3G20810, AT1G31680, and AT1G30250). The performance of the top 20 genes selected by IG and by ReliefF was evaluated using the classifiers mentioned above to classify stressed versus non-stressed samples. The random forest algorithm outperformed the other algorithms, with accuracies of 97.91% and 98.51% for IG and ReliefF, respectively. Additionally, 42 genes were identified from all 30,380 genes using LASSO regression. The top 20 genes from each feature selection method were compared to determine three common genes (AT5G44050, AT2G47180, and AT1G70700), which formed a three-gene signature. The efficiency of these three genes was evaluated using the random forest and XGBoost algorithms. Further validation was performed using an independent RNA-seq dataset and random forest. These gene signatures can be exploited in plant breeding to improve stress tolerance in a variety of crops.
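The step of finding genes shared by the top-ranked lists of several feature selectors can be illustrated with a minimal Python sketch. It is not the authors' implementation. The median-split discretization used for information gain, the function names (`information_gain`, `top_k`, `common_signature`), and the gene labels `G1` to `G3` with their scores are all hypothetical assumptions made for illustration. ReliefF and LASSO scores are taken as given inputs here rather than computed.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Information gain of one gene about the class labels.

    Assumption: expression values are discretized by a single split
    at the upper median; other discretizations are equally plausible.
    """
    median = sorted(values)[len(values) // 2]
    low = [y for v, y in zip(values, labels) if v < median]
    high = [y for v, y in zip(values, labels) if v >= median]
    n = len(labels)
    conditional = sum(len(part) / n * entropy(part) for part in (low, high) if part)
    return entropy(labels) - conditional


def top_k(scores, k):
    """Return the k highest-scoring gene names from a {gene: score} dict."""
    return set(sorted(scores, key=scores.get, reverse=True)[:k])


def common_signature(*rankings, k=20):
    """Genes present in the top-k list of every ranking."""
    return set.intersection(*(top_k(r, k) for r in rankings))


# Hypothetical toy scores for three placeholder genes (not data from the study).
ig_scores = {"G1": 0.9, "G2": 0.8, "G3": 0.1}
relieff_scores = {"G1": 0.5, "G3": 0.4, "G2": 0.1}
lasso_scores = {"G1": 1.2, "G2": 0.0, "G3": 0.7}

signature = common_signature(ig_scores, relieff_scores, lasso_scores, k=2)
```

With `k=20` and real scores from the three selectors, the same intersection would yield a shared signature of the kind the abstract describes, assuming the rankings are computed on the same gene set.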
Pages: 14