Comparative analysis and prediction of nucleosome positioning using integrative feature representation and machine learning algorithms

Cited by: 6
Authors
Han, Guo-Sheng [1, 2, 3]
Li, Qi [1, 2, 3]
Li, Ying [1, 2, 3]
Affiliations
[1] Xiangtan Univ, Dept Math & Computat Sci, Xiangtan 411105, Hunan, Peoples R China
[2] Xiangtan Univ, Key Lab Intelligent Comp & Informat Proc, Minist Educ, Xiangtan 411105, Hunan, Peoples R China
[3] Xiangtan Univ, Hunan Key Lab Computat & Simulat Sci & Engn, Xiangtan 411105, Hunan, Peoples R China
Keywords
Nucleosome classification; Frequency chaos game representation; Support vector machine; Extreme learning machine; Extreme gradient boosting; Convolutional neural networks; CHAOS GAME REPRESENTATION; K-TUPLE; HIGH-RESOLUTION; IDENTIFICATION; OCCUPANCY; SEQUENCES; PSEKNC;
DOI
10.1186/s12859-021-04006-w
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Background: Nucleosomes play an important role in genome expression, DNA replication, DNA repair and transcription, so research on nucleosome positioning has long received extensive attention. Given the diversity of DNA sequence representation methods, we integrated multiple features and analyzed their effect on nucleosome positioning analysis. This process can also deepen the theoretical understanding of nucleosome positioning.
Results: We used frequency chaos game representation (FCGR) to construct DNA sequence features, integrated it with other features, and applied the principal component analysis (PCA) algorithm. Support vector machine (SVM), extreme learning machine (ELM), extreme gradient boosting (XGBoost), multilayer perceptron (MLP) and convolutional neural network (CNN) models were each used as predictors for nucleosome positioning. The prediction quality of the integrated feature vector is significantly superior to that of a single feature. After PCA was used to reduce the feature dimension, the prediction quality on the H. sapiens dataset improved significantly.
Conclusions: Comparative analysis and prediction on the H. sapiens, C. elegans, D. melanogaster and S. cerevisiae datasets demonstrate that applying FCGR to nucleosome positioning is feasible, and that integrative feature representation performs better.
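The feature construction named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the corner assignment for A, C, G and T, the choice of k, the function names `fcgr` and `pca_reduce`, and the use of a plain SVD for PCA are all assumptions made for illustration. The sketch counts k-mers into a 2^k x 2^k chaos game grid, normalises the counts to frequencies, and projects a matrix of such vectors onto its leading principal components.

```python
import numpy as np

# Assumed corner assignment; conventions differ between publications.
CORNERS = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}


def fcgr(seq, k=3):
    """Return a flattened, normalised 2^k x 2^k k-mer frequency grid."""
    size = 1 << k
    grid = np.zeros((size, size), dtype=float)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if any(base not in CORNERS for base in kmer):
            continue  # skip windows containing N or other ambiguity codes
        x = y = 0
        for j, base in enumerate(kmer):
            bx, by = CORNERS[base]
            # Later bases select coarser quadrants, mirroring the
            # midpoint iteration of the chaos game.
            x |= bx << j
            y |= by << j
        grid[y, x] += 1
    total = grid.sum()
    if total > 0:
        grid /= total  # counts -> relative frequencies
    return grid.ravel()


def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components via SVD."""
    Xc = X - X.mean(axis=0)  # centre each feature column
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    seqs = ["ACGTACGTAC", "AAAAACCCCC", "GGGTTTGGGT", "ATATATATAT"]
    X = np.stack([fcgr(s, k=3) for s in seqs])
    Z = pca_reduce(X, n_components=2)
    print(X.shape, Z.shape)
```

In a setting like the one described, the reduced matrix would then be passed to a classifier such as an SVM; integrating FCGR with other features would amount to concatenating the feature vectors before the PCA step.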
Pages: 23