Prediction of Amyloid Proteins Using Embedded Evolutionary & Ensemble Feature Selection Based Descriptors With eXtreme Gradient Boosting Model

Cited by: 19
Authors
Akbar, Shahid [1]
Ali, Hashim [1]
Ahmad, Ashfaq [2]
Sarker, Mahidur R. R. [3]
Saeed, Aamir [4]
Salwana, Ely [3]
Gul, Sarah [5]
Khan, Ahmad [1]
Ali, Farman [6]
Affiliations
[1] Abdul Wali Khan Univ Mardan, Dept Comp Sci, Mardan 23200, Khyber Pakhtunk, Pakistan
[2] MY Univ, Dept Comp Sci, Islamabad 44000, Pakistan
[3] Univ Kebangsaan Malaysia, Inst IR4 0, Bangi 43600, Malaysia
[4] Univ Engn & Technol Peshawar, Dept Comp Sci & IT, Peshawar 25000, Pakistan
[5] Int Islamic Univ Islamabad, Dept Biol Sci, FBAS, Islamabad 44000, Pakistan
[6] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing 210094, Peoples R China
Keywords
Amyloid proteins; K-separated bigrams; eXtreme gradient boosting; filter-position specific scoring matrix; ensemble feature selection; classification; OVERSAMPLING TECHNIQUE; DIPEPTIDE COMPOSITION; IDENTIFICATION; SERVER; SMOTE;
DOI
10.1109/ACCESS.2023.3268523
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Amyloid proteins (AMYs) are usually aggregates of insoluble fibrous proteins that have major pathogenic effects on various tissues. Their abnormal deposition may lead to several diseases, e.g., Parkinson's, Alzheimer's, and type 2 diabetes. In addition, AMYs form amyloid aggregates when they are in a misfolded state. Therefore, it is crucial to accurately predict AMYs and their pathogenic characteristics. Various computational predictors have been presented for the prediction of AMYs. However, the effectiveness of these predictors is unsatisfactory due to their low generalization ability and high training cost. In this work, we propose an intelligent computational predictor for the accurate prediction of AMYs. Novel embedded evolutionary features are obtained by applying K-separated bigrams and a filter method to the evolutionary descriptors. Moreover, DDE-based enhanced frequency coupling information is extracted from the amyloid sequences. Additionally, a multi-model vector is obtained by combining the features of the applied formulation techniques. To reduce the computational cost of the proposed model, high-ranked features are selected from the heterogeneous vector using eXtreme Gradient Boosting-Recursive Feature Elimination (XGB-RFE). The optimal features are then evaluated with several learners, namely XGBoost (XGB), Light Gradient Boosted Machine (LGBM), Support Vector Machine (SVM), AdaBoost (Ada), and Extra Trees Classifier (ETC). The proposed model achieved a prediction accuracy of 93.10% on the training sequences and 89.67% on the independent sequences, which is approximately 4% higher training accuracy than existing predictors. It is anticipated that our predictive approach will be useful for scientists and might play a key role in drug development and academic research.
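The abstract does not spell out how the K-separated bigram features are computed. As a minimal sketch only, the following assumes the commonly used formulation over a position-specific scoring matrix (PSSM), in which entry (m, n) of the feature matrix is the sum over positions i of pssm[i][m] * pssm[i + k][n]. The function name k_separated_bigrams, the toy 3-column matrix, and the absence of any PSSM normalization or filtering step are illustrative assumptions and are not taken from the paper.

```python
from typing import List


def k_separated_bigrams(pssm: List[List[float]], k: int = 1) -> List[float]:
    """Flattened C x C bigram matrix for a PSSM with C columns.

    Entry (m, n) is the sum over positions i of pssm[i][m] * pssm[i + k][n],
    i.e. the coupling between column m and column n at a gap of k residues.
    (Hypothetical helper; the paper's exact preprocessing is not known here.)
    """
    if k < 1:
        raise ValueError("k must be >= 1")
    length = len(pssm)
    n_cols = len(pssm[0]) if pssm else 0
    features = [[0.0] * n_cols for _ in range(n_cols)]
    # Only positions i with a partner at i + k contribute.
    for i in range(length - k):
        row_a, row_b = pssm[i], pssm[i + k]
        for m in range(n_cols):
            for n in range(n_cols):
                features[m][n] += row_a[m] * row_b[n]
    # Flatten row by row so the result can be used as a feature vector.
    return [value for row in features for value in row]


# Toy example: 4 sequence positions, 3 columns (a real PSSM has 20 columns).
toy_pssm = [
    [1.0, 0.0, 2.0],
    [0.0, 1.0, 1.0],
    [3.0, 1.0, 0.0],
    [1.0, 2.0, 1.0],
]
features_k1 = k_separated_bigrams(toy_pssm, k=1)
features_k2 = k_separated_bigrams(toy_pssm, k=2)
```

With a real 20-column PSSM each value of k yields a 400-dimensional vector; vectors for several values of k could be concatenated before a feature selection step such as the XGB-RFE stage mentioned in the abstract.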
Pages: 39024-39036
Number of pages: 13