Evaluating the chaos game representation of proteins for applications in machine learning models: prediction of antibody affinity and specificity as a case study

Authors
Arsiccio, Andrea [1]
Stratta, Lorenzo [2]
Menzen, Tim [1]
Affiliations
[1] Coriolis Pharm, Fraunhoferstr 18 B, D-82152 Martinsried, Germany
[2] Politecn Torino, Dept Appl Sci & Technol, Mol Engn Lab molE, 24 Corso Duca Abruzzi, IT-10129 Turin, Italy
Keywords
Chaos game; Artificial intelligence; Neural networks; Proteins; Antibodies
DOI
10.1007/s00894-023-05777-0
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Context: Machine learning techniques are becoming increasingly important in the selection and optimization of therapeutic molecules, as well as in the selection of formulation components and the prediction of long-term stability. Compared to first-principles models, machine learning techniques are easier to implement and can identify correlations that would be hard to describe at a mechanistic level, but they rely strongly on high-quality input training data. Here, we evaluate the potential of the "chaos game" representation to provide input data for machine learning models. The chaos game is an algorithm originally developed for the production of fractal structures, and later also applied to the representation of biological sequences, such as genes and proteins. Our results show that combining the chaos game representation with convolutional neural networks yields accuracy comparable to that of other machine learning approaches, indicating that chaos game representations could be a valid alternative to existing featurization strategies for machine learning models of biological sequences.

Methods: We implement the chaos game in Python 3.8.10, and use it to produce fractal as well as novel expanding representations of protein sequences. We then feed the resulting images to a convolutional neural network, built in Python 3.8.10 using the TensorFlow 2.9.1, Keras 2.9.0, and scikit-learn 1.1.1 packages. As a case study, we select a recently published dataset for the antibody emibetuzumab, with the objective of co-optimizing antibody variants for both high affinity and low non-specific binding.
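To make the idea of a chaos game representation concrete, the following is a minimal sketch of the classic fractal variant applied to a protein sequence. It is not the authors' implementation: the layout of the 20 amino acids on a regular polygon, the contraction ratio of 0.5, the 32x32 image size, and the function names `chaos_game_points` and `rasterize` are all assumptions chosen for illustration. The "expanding" representation mentioned in the abstract is not sketched here, since the abstract does not describe it.

```python
import math

# Assumption: the 20 standard amino acids, in alphabetical one-letter order,
# are placed on the vertices of a regular 20-gon inscribed in the unit circle.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def polygon_vertices(n):
    """Return n equally spaced points on the unit circle."""
    return [
        (math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
        for k in range(n)
    ]


VERTEX = dict(zip(AMINO_ACIDS, polygon_vertices(len(AMINO_ACIDS))))


def chaos_game_points(sequence, ratio=0.5):
    """Run the chaos game: start at the centre and, for each residue,
    move a fixed fraction (ratio, an assumed value) toward its vertex."""
    x, y = 0.0, 0.0
    points = []
    for residue in sequence:
        vx, vy = VERTEX[residue]  # raises KeyError for non-standard residues
        x += ratio * (vx - x)
        y += ratio * (vy - y)
        points.append((x, y))
    return points


def rasterize(points, size=32):
    """Bin the points into a size x size grid of counts (a grayscale image)."""
    image = [[0] * size for _ in range(size)]
    for x, y in points:
        # Map x, y from [-1, 1] to pixel indices; row 0 is the top of the image.
        col = min(int((x + 1) / 2 * size), size - 1)
        row = min(int((1 - y) / 2 * size), size - 1)
        image[row][col] += 1
    return image


if __name__ == "__main__":
    # Hypothetical short peptide, used only to exercise the functions.
    pts = chaos_game_points("ACDEFGHIK")
    img = rasterize(pts, size=16)
    print(len(pts), sum(map(sum, img)))
```

An image produced this way could then serve as the input to a convolutional neural network, which is the pairing the abstract evaluates. The resolution and contraction ratio would need to be tuned for real sequences.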
Pages: 15
Published in: Journal of Molecular Modeling, 2023, 29