Identification of small open reading frames in plant lncRNA using class-imbalance learning

Cited by: 6
Authors
Zhao, Siyuan [1]
Meng, Jun [1]
Wekesa, Jael Sanyanda [2]
Luan, Yushi [3]
Affiliations
[1] Dalian Univ Technol, Sch Comp Sci & Technol, Dalian 116024, Liaoning, Peoples R China
[2] Jomo Kenyatta Univ Agr & Technol, Dept Informat Technol, Nairobi 6200000200, Kenya
[3] Dalian Univ Technol, Sch Bioengn, Dalian 116024, Liaoning, Peoples R China
Keywords
Class-imbalance learning; Feature selection; Hybrid resampling; Ensemble learning; sORFs; lncRNA; SMOTE
DOI
10.1016/j.compbiomed.2023.106773
Chinese Library Classification (CLC) number
Q [Biological Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Small open reading frames (sORFs) in long noncoding RNA (lncRNA) have recently been shown to encode small peptides, which can help in studying the mechanisms of growth and development in organisms. Because machine learning-based computational methods cost less than biological experiments, they can be used to identify sORFs and provide a basis for experimental work. However, few computational methods and data resources have been exploited for identifying sORFs in plant lncRNA. Moreover, machine learning models yield underperforming classifiers when faced with class imbalance. In this study, a SMOTE variant based on weighted cosine distance (WCDSMOTE), which interacts with feature selection, is proposed to synthesize minority-class samples, and weighted edited nearest neighbor (WENN) is applied to clean up majority-class samples. Together these form the hybrid sampling method WCDSMOTE-ENN, which is proposed to deal with imbalanced datasets with multi-angle features. A heterogeneous classifier ensemble is introduced to complete the classification task. The result is sORFplnc, a novel computational method based on class-imbalance learning that identifies sORFs with coding potential in plant lncRNA. Experimental results show that sORFplnc outperforms existing computational methods in identifying sORFs with coding potential. We anticipate that the proposed work can serve as a reference for related research and contribute to agriculture and biomedicine.
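The hybrid sampling idea in the abstract (synthesize minority-class samples, then clean majority-class samples) can be illustrated with a minimal sketch. This is not the authors' WCDSMOTE or WENN implementation: it assumes a plain SMOTE-style interpolation and a plain edited-nearest-neighbor rule, both using a weighted cosine distance. The function names, the feature weights `w` (standing in for scores that feature selection might supply), and the toy data are all hypothetical.

```python
import math
import random

def weighted_cosine_distance(a, b, w):
    """1 - cosine similarity, with non-negative per-feature weights w (assumed)."""
    dot = sum(wi * ai * bi for wi, ai, bi in zip(w, a, b))
    na = math.sqrt(sum(wi * ai * ai for wi, ai in zip(w, a)))
    nb = math.sqrt(sum(wi * bi * bi for wi, bi in zip(w, b)))
    if na == 0.0 or nb == 0.0:
        return 1.0  # treat a zero vector as maximally dissimilar
    return 1.0 - dot / (na * nb)

def nearest(idx, X, w, k, candidates):
    """Indices of the k candidates closest to X[idx], excluding idx itself."""
    others = [j for j in candidates if j != idx]
    others.sort(key=lambda j: weighted_cosine_distance(X[idx], X[j], w))
    return others[:k]

def oversample_minority(X, y, w, minority=1, k=3, n_new=None, seed=0):
    """SMOTE-style step: interpolate between a minority sample and a neighbour."""
    rng = random.Random(seed)
    min_idx = [i for i, label in enumerate(y) if label == minority]
    if len(min_idx) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    if n_new is None:
        n_new = len(y) - 2 * len(min_idx)  # balances a binary dataset
    synthetic = []
    for _ in range(max(n_new, 0)):
        i = rng.choice(min_idx)
        j = rng.choice(nearest(i, X, w, k, min_idx))
        gap = rng.random()  # position along the segment from X[i] to X[j]
        synthetic.append([a + gap * (b - a) for a, b in zip(X[i], X[j])])
    return X + synthetic, y + [minority] * len(synthetic)

def clean_majority(X, y, w, majority=0, k=3):
    """ENN-style step: drop majority samples that most of their k neighbours contradict."""
    all_idx = range(len(X))
    keep = []
    for i in all_idx:
        if y[i] == majority:
            neigh = nearest(i, X, w, k, all_idx)
            agree = sum(1 for j in neigh if y[j] == majority)
            if 2 * agree < len(neigh):
                continue  # neighbourhood is dominated by the other class
        keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]

# Toy data: six majority samples (label 0) and two minority samples (label 1).
X = [[1.0, 0.1], [0.9, 0.2], [1.0, 0.3], [0.8, 0.1], [0.9, 0.0], [0.7, 0.2],
     [0.1, 1.0], [0.2, 0.9]]
y = [0, 0, 0, 0, 0, 0, 1, 1]
w = [0.7, 0.3]  # hypothetical feature weights

X_res, y_res = oversample_minority(X, y, w, k=1)
X_res, y_res = clean_majority(X_res, y_res, w)
```

Oversampling before cleaning means the edited-nearest-neighbor step sees the synthetic minority samples as well, so majority samples that sit inside the enlarged minority region are the ones removed.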
Pages: 10