Similarity-Based Machine Learning Model for Predicting the Metabolic Pathways of Compounds

Cited by: 51
Authors
Jia, Yanjuan [1 ]
Zhao, Ran [1 ]
Chen, Lei [1 ]
Affiliations
[1] Shanghai Maritime Univ, Coll Informat Engn, Shanghai 201306, Peoples R China
Funding
Natural Science Foundation of Shanghai
Keywords
Compounds; Feature extraction; Biochemistry; Machine learning; Radio frequency; Classification algorithms; Predictive models; Metabolic pathway; chemical-chemical association; random forest; NETWORKS; STITCH;
DOI
10.1109/ACCESS.2020.3009439
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Metabolic pathways are the sequences of chemical reactions that make up metabolic processes in vivo. Compounds are the major participants in most metabolic pathways, so it is essential to determine which compounds can constitute a metabolic pathway. This problem can be converted to the identification of the metabolic pathways of compounds. Although traditional experiments can provide solid results, they are inefficient and costly. To date, several machine learning models have been proposed to address this problem. However, almost all of these models identified only the metabolic pathway types of compounds rather than the actual metabolic pathways. This study proposed a novel model for predicting the actual metabolic pathways of given compounds. Pairs of compounds and metabolic pathways were treated as samples, thereby casting the task as a binary classification problem. Using the concept of "similarity", each sample was represented by seven features, extracted from seven associations of compounds, which measure compound linkages from different aspects. The model adopted random forest as the classification algorithm. Two types of ten-fold cross-validation were adopted to evaluate the performance of the model, indicating its utility. A feature analysis was also performed to determine which compound association was most closely related to the identification of the metabolic pathways of compounds.
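The abstract does not say how a similarity feature is computed from a compound association, so the following is only a minimal sketch under stated assumptions. It represents a (compound, pathway) sample by one feature per association, taking the maximum association score between the query compound and the other compounds already known to belong to the pathway. The max aggregation, the `Association` type, the function names, and the compound IDs are all hypothetical and are not taken from the paper; two associations are used for brevity where the paper uses seven.

```python
from typing import Dict, FrozenSet, List, Set

# Hypothetical representation: one association maps an unordered
# compound pair to a score; missing pairs are treated as 0.0.
Association = Dict[FrozenSet[str], float]


def pair_score(assoc: Association, a: str, b: str) -> float:
    """Look up the association score of an unordered compound pair."""
    return assoc.get(frozenset((a, b)), 0.0)


def sample_features(
    compound: str,
    pathway_members: Set[str],
    associations: List[Association],
) -> List[float]:
    """Build one feature per association for a (compound, pathway) sample.

    Assumption: the feature is the highest score between the query
    compound and any *other* member of the pathway (0.0 if there is none).
    """
    others = pathway_members - {compound}
    return [
        max((pair_score(assoc, compound, m) for m in others), default=0.0)
        for assoc in associations
    ]


# Toy data with made-up compound IDs and scores.
assoc_structure: Association = {
    frozenset(("C1", "C2")): 0.8,
    frozenset(("C1", "C3")): 0.3,
}
assoc_text: Association = {frozenset(("C1", "C3")): 0.6}
pathway = {"C2", "C3"}

features_c1 = sample_features("C1", pathway, [assoc_structure, assoc_text])
features_c2 = sample_features("C2", pathway, [assoc_structure, assoc_text])
print(features_c1, features_c2)
```

Feature vectors of this kind, labeled 1 when the compound is known to belong to the pathway and 0 otherwise, could then be passed to a random forest implementation such as scikit-learn's `RandomForestClassifier` and evaluated with ten-fold cross-validation.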
Pages: 130687-130696
Number of pages: 10
Related papers
50 in total
  • [1] Similarity-Based Machine Learning for Small Data Sets: Predicting Biolubricant Base Oil Viscosities
    Kim, Jae Young
    Khan, Salman A.
    Vlachos, Dionisios G.
    Journal of Physical Chemistry B, 2024, 128 (48): : 11963 - 11970
  • [2] Similarity-based machine learning methods for predicting drug-target interactions: a brief review
    Ding, Hao
    Takigawa, Ichigaku
    Mamitsuka, Hiroshi
    Zhu, Shanfeng
    BRIEFINGS IN BIOINFORMATICS, 2014, 15 (05) : 734 - 747
  • [3] A phrase similarity-based model for statistical machine translation
    He, Zhongjun
    Liu, Qun
    Lin, Shouxun
    Gaojishu Tongxin/Chinese High Technology Letters, 2009, 19 (04): : 337 - 341
  • [4] Predicting metabolic pathways of plant enzymes without using sequence similarity: Models from machine learning
    Almeida, Rodrigo de Oliveira
    Valente, Guilherme Targino
    PLANT GENOME, 2020, 13 (03):
  • [5] SimLL: Similarity-Based Logic Locking Against Machine Learning Attacks
    Chowdhury, Subhajit Dutta
    Yang, Kaixin
    Nuzzo, Pierluigi
    2023 60TH ACM/IEEE DESIGN AUTOMATION CONFERENCE, DAC, 2023,
  • [6] Similarity-based model for transliteration
    Faculty of Engineering, University of Tokushima, 2-1 Minamijosanjima, Tokushima 770-8506, Japan
    Unknown
    Unknown
    Lect. Notes Electr. Eng., 2009, (195-206):
  • [7] Similarity-based active learning methods
    Sui Q.
    Ghosh S.K.
    Expert Systems with Applications, 2024, 251
  • [8] A similarity-based approach to relevance learning
    Cöster, R
    Asker, L
    ECAI 2000: 14TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2000, 54 : 276 - 280
  • [9] Prediction of peptide hormones using an ensemble of machine learning and similarity-based methods
    Kaur, Dashleen
    Arora, Akanksha
    Vigneshwar, Palani
    Raghava, Gajendra P. S.
    PROTEOMICS, 2024,
  • [10] Predicting Remaining Useful Life with Similarity-Based Priors
    Soons, Youri
    Dijkman, Remco
    Jilderda, Maurice
    Duivesteijn, Wouter
    ADVANCES IN INTELLIGENT DATA ANALYSIS XVIII, IDA 2020, 2020, 12080 : 483 - 495