Holistic in silico developability assessment of novel classes of small proteins using publicly available sequence-based predictors

Authors
Pais, Daniel A. M. [1]
Mayer, Jan-Peter A. [2]
Felderer, Karin [2]
Batalha, Maria B. [1]
Eichner, Timo [2]
Santos, Sofia T. [1]
Kumar, Raman [2]
Silva, Sandra D. [1]
Kaufmann, Hitto [2]
Affiliations
[1] Valgenesis Portugal Lda, R Castilho 50 4th Floor, P-1250071 Lisbon, Portugal
[2] Pieris Pharmaceut GmbH, Carl Zeiss Ring 15a, D-85737 Ismaning, Germany
Keywords
Developability; In silico prediction; Machine learning; Early-stage drug development; Small protein therapeutics; R PACKAGE; ANTIBODIES; SCAFFOLDS;
DOI
10.1007/s10822-024-00569-x
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
The development of novel therapeutic proteins is a lengthy and costly process, with an average attrition rate of 91% (Thomas et al. Clinical Development Success Rates and Contributing Factors 2011-2020, 2021). To increase the probability of success and ensure robust drug supply beyond approval, it is essential to assess the developability profile of new potential drug candidates as early and broadly as possible in development (Jain et al. MAbs, 2023. https://doi.org/10.1016/j.copbio.2011.06.002). Predicting these properties in silico is expected to be the next leap in innovation as it would enable significantly reduced development timelines combined with broader screens at lower costs. However, developing predictive algorithms typically requires substantial datasets generated under very defined conditions, a limiting factor especially for new classes of therapeutic proteins that hold immense clinical promise. Here we describe a strategy for assessing the developability of a novel class of small therapeutic Anticalin® proteins using machine learning in conjunction with a knowledge-driven approach. The knowledge-driven approach considers developability attributes such as aggregation propensity, charge variants, immunogenicity, specificity, thermal stability, hydrophobicity, and potential post-translational modifications, to calculate a holistic developability score. Based on sequence-derived descriptors as input parameters we established novel statistical models designed to predict the developability scores for Anticalin proteins. The best models yielded low root mean square errors across the entire dataset and were further validated by removing input data from individual screening campaigns and predicting developability scores for those drug candidates. The adoption of the described workflow will enable significantly streamlined preclinical development of Anticalin drug candidates and could potentially be applied to other therapeutic protein scaffolds.
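The validation step described in the abstract (withholding one screening campaign at a time and predicting scores for its candidates) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the single numeric descriptor, the one-variable least-squares model, the function names, and the toy records are hypothetical and are not the authors' descriptors, models, or data.

```python
import math


def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0  # fall back to a flat line if x is constant
    intercept = mean_y - slope * mean_x
    return intercept, slope


def rmse(predicted, observed):
    """Root mean square error between two equally long sequences."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def leave_one_campaign_out(records):
    """records: list of (campaign, descriptor, score) tuples.

    For each campaign, fit on all other campaigns and report the RMSE
    of the predictions for the held-out campaign.
    """
    errors = {}
    for held_out in sorted({campaign for campaign, _, _ in records}):
        train = [r for r in records if r[0] != held_out]
        test = [r for r in records if r[0] == held_out]
        intercept, slope = fit_line([r[1] for r in train], [r[2] for r in train])
        predictions = [intercept + slope * r[1] for r in test]
        errors[held_out] = rmse(predictions, [r[2] for r in test])
    return errors


if __name__ == "__main__":
    # Hypothetical toy data: (campaign, sequence-derived descriptor, developability score).
    toy_records = [
        ("campaign_A", 0.31, 0.62), ("campaign_A", 0.45, 0.48),
        ("campaign_B", 0.28, 0.70), ("campaign_B", 0.52, 0.41),
        ("campaign_C", 0.39, 0.55), ("campaign_C", 0.60, 0.30),
    ]
    for campaign, error in leave_one_campaign_out(toy_records).items():
        print(f"{campaign}: held-out RMSE = {error:.3f}")
```

Grouping the split by campaign rather than by individual candidate keeps closely related sequences from the same screen out of the training set, which is presumably why a campaign-wise hold-out gives a more conservative error estimate than a random split.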
Pages: 14