Comparative analysis and prediction of nucleosome positioning using integrative feature representation and machine learning algorithms

Cited by: 6
Authors
Han, Guo-Sheng [1, 2, 3]
Li, Qi [1, 2, 3]
Li, Ying [1, 2, 3]
Affiliations
[1] Xiangtan Univ, Dept Math & Computat Sci, Xiangtan 411105, Hunan, Peoples R China
[2] Xiangtan Univ, Key Lab Intelligent Comp & Informat Proc, Minist Educ, Xiangtan 411105, Hunan, Peoples R China
[3] Xiangtan Univ, Hunan Key Lab Computat & Simulat Sci & Engn, Xiangtan 411105, Hunan, Peoples R China
Keywords
Nucleosome classification; Frequency chaos game representation; Support vector machine; Extreme learning machine; Extreme gradient boosting; Convolutional neural networks; CHAOS GAME REPRESENTATION; K-TUPLE; HIGH-RESOLUTION; IDENTIFICATION; OCCUPANCY; SEQUENCES; PSEKNC
DOI
10.1186/s12859-021-04006-w
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: Nucleosomes play an important role in genome expression, DNA replication, DNA repair and transcription, so research on nucleosome positioning has long received extensive attention. Given the diversity of DNA sequence representation methods, we integrated multiple features and analyzed their effect on nucleosome positioning analysis. This process can also deepen our understanding of the theoretical analysis of nucleosome positioning.
Results: We used frequency chaos game representation (FCGR) to construct DNA sequence features, integrated it with other features, and applied the principal component analysis (PCA) algorithm. Support vector machine (SVM), extreme learning machine (ELM), extreme gradient boosting (XGBoost), multilayer perceptron (MLP) and convolutional neural networks (CNN) were each used as predictors of nucleosome positioning. The prediction quality of the integrated feature vector was significantly superior to that of a single feature. After PCA was used to reduce the feature dimension, the prediction quality on the H. sapiens dataset improved significantly.
Conclusions: Comparative analysis and prediction on the H. sapiens, C. elegans, D. melanogaster and S. cerevisiae datasets demonstrate that applying FCGR to nucleosome positioning is feasible, and that an integrative feature representation performs better.
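The abstract names frequency chaos game representation (FCGR) as the core sequence feature without describing it. As a rough illustration only, and not the authors' implementation, the sketch below counts the k-mers of a DNA sequence and places each count in a 2^k x 2^k grid addressed by chaos-game coordinates, then normalizes to frequencies. The function names `fcgr` and `fcgr_vector`, the corner assignment (A, C, G, T), and the choice to skip k-mers containing ambiguous bases are all assumptions made for this example; the paper may use different conventions.

```python
from typing import List

# Assumed corner layout of the unit square; conventions differ between papers.
CORNERS = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}


def fcgr(seq: str, k: int) -> List[List[float]]:
    """Return a 2^k x 2^k grid of k-mer frequencies (rows are y, columns are x)."""
    size = 2 ** k
    grid = [[0.0] * size for _ in range(size)]
    total = 0
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Skip k-mers with ambiguous bases such as N (an assumption of this sketch).
        if any(ch not in CORNERS for ch in kmer):
            continue
        x = y = 0
        # In the chaos game the most recent base halves the distance to its
        # corner, so the last base of the k-mer sets the most significant bit.
        for j, ch in enumerate(kmer):
            cx, cy = CORNERS[ch]
            x |= cx << j
            y |= cy << j
        grid[y][x] += 1.0
        total += 1
    if total:
        grid = [[count / total for count in row] for row in grid]
    return grid


def fcgr_vector(seq: str, k: int) -> List[float]:
    """Flatten the FCGR grid row by row into a feature vector of length 4^k."""
    return [value for row in fcgr(seq, k) for value in row]
```

A vector produced this way could then be concatenated with other sequence features, reduced with PCA, and passed to a classifier such as an SVM, which is the general pipeline the abstract outlines.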
Pages: 23