Compact Class-Conditional Attribute Category Clustering: Amino Acid Grouping for Enhanced HIV-1 Protease Cleavage Classification

Authors
Saez, Jose A. [1]
Vera, J. Fernando [1]
Affiliations
[1] Univ Granada, Dept Stat & Operat Res, Granada 18071, Spain
Keywords
Amino acids; Classification algorithms; Encoding; Prediction algorithms; Complexity theory; Proposals; Predictive models; HIV-1; protease; octamer cleavage; data representation; category grouping; classification; statistical tests; evolutionary; prediction
DOI
10.1109/TCBB.2024.3448617
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Categorical attributes are common in many classification tasks, presenting certain challenges as the number of categories grows. This situation can affect data handling, negatively impacting the building time of models, their complexity and, ultimately, their classification performance. To mitigate these issues, this research proposes a novel preprocessing technique for grouping attribute categories in classification datasets. This approach combines the exact representation of the association between categorical values in a Euclidean space, clustering methods and attribute quality metrics to group similar attribute categories based on their contribution to the classification task. To estimate its effectiveness, the proposal is evaluated within the context of HIV-1 protease cleavage site prediction, where each attribute represents an amino acid that can take multiple possible values. The results obtained on HIV-1 real-world datasets show a significant reduction in the number of categories per attribute, with an average reduction percentage ranging from 74% to 81%. This reduction leads to simplified data representations and improved classification performance compared with no preprocessing. Specifically, improvements of up to 0.07 in accuracy and 0.19 in geometric mean are observed across different datasets and classification algorithms. Additionally, extensive simulations on synthetic datasets with varied characteristics are carried out, providing consistent and reliable results that validate the robustness of the proposal. These findings highlight the capability of the developed method to enhance cleavage prediction, which could potentially contribute to understanding viral processes and developing targeted therapeutic strategies.
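The general idea of grouping attribute categories by their relation to the class can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not give the embedding, the clustering method, or the quality metric, so the sketch assumes a simple stand-in in which each category is placed at its vector of class proportions and the closest categories are merged greedily. The function names (`class_profiles`, `group_categories`, `geometric_mean`), the toy data, and the reading of "geometric mean" as the square root of sensitivity times specificity are all assumptions made for illustration.

```python
from collections import defaultdict
from math import sqrt


def class_profiles(values, labels):
    """Map each category to its vector of per-class proportions (assumed embedding)."""
    classes = sorted(set(labels))
    counts = defaultdict(lambda: defaultdict(int))
    for v, y in zip(values, labels):
        counts[v][y] += 1
    profiles = {}
    for v, by_class in counts.items():
        total = sum(by_class.values())
        profiles[v] = [by_class[c] / total for c in classes]
    return profiles


def group_categories(values, labels, n_groups):
    """Greedily merge the two closest categories until n_groups remain.

    Hypothetical stand-in for the clustering step; returns {category: group id}.
    """
    if n_groups < 1:
        raise ValueError("n_groups must be at least 1")
    profiles = class_profiles(values, labels)
    # Each cluster holds (members, centroid, number of merged categories).
    clusters = [([v], p, 1) for v, p in sorted(profiles.items())]
    while len(clusters) > n_groups:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = sqrt(sum((a - b) ** 2
                             for a, b in zip(clusters[i][1], clusters[j][1])))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        mi, ci, wi = clusters[i]
        mj, cj, wj = clusters[j]
        centroid = [(a * wi + b * wj) / (wi + wj) for a, b in zip(ci, cj)]
        merged = (mi + mj, centroid, wi + wj)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return {v: g for g, (members, _, _) in enumerate(clusters) for v in members}


def geometric_mean(y_true, y_pred, positive=1):
    """Square root of sensitivity times specificity (assumed definition)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    return sqrt(sens * spec)


# Made-up toy data: one octamer position (amino acid letters) and cleavage labels.
position = ["A", "A", "G", "G", "L", "L", "F", "F"]
cleaved = [1, 1, 1, 0, 0, 0, 0, 1]
groups = group_categories(position, cleaved, n_groups=3)
```

In this toy case, "G" and "F" have identical class proportions, so they end up in the same group, while "A" and "L" stay separate. The mapping in `groups` could then replace the original letters before a classifier is trained, which is the kind of reduced representation the abstract describes.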
Pages: 2167-2178
Number of pages: 12
Related papers
44 in total
  • [21] A Single Amino Acid Substitution at the HIV-1 Protease Termini Dimer Interface Significantly Reduces Viral Particles Processing Efficiency
    Chiang, Meichun
    Wang, Chintien
    JAPANESE JOURNAL OF INFECTIOUS DISEASES, 2021, 74 (04) : 299 - 306
  • [22] Amino acid insertions at position 35 of HIV-1 protease interfere with virus replication without modifying antiviral drug susceptibility
    Paolucci, S
    Baldanti, F
    Dossena, L
    Gerna, G
    ANTIVIRAL RESEARCH, 2006, 69 (03) : 181 - 185
  • [23] A bioinformatic approach to identify new potential resistance relevant amino acid substitutions (AAS) in HIV-1 protease (H1P)
    Casper M Frederiksen
    Jesper Kjær
    Alessandro Cozzi-Lepri
    Zoe Fox
    Jens D Lundgren
    Retrovirology, 6
  • [24] A bioinformatic approach to identify new potential resistance relevant amino acid substitutions (AAS) in HIV-1 protease (H1P)
    Frederiksen, Casper M.
    Kjaer, Jesper
    Cozzi-Lepri, Alessandro
    Fox, Zoe
    Lundgren, Jens D.
    RETROVIROLOGY, 2009, 6
  • [25] Natural variation in HIV-1 protease, gag p7 and p6, and protease cleavage sites within Gag/Pol polyproteins: Amino acid substitutions in the absence of protease inhibitors in mothers and children infected by human immunodeficiency virus type 1
    Barrie, KA
    Perez, E
    Lamers, SL
    Farmerie, WG
    Dunn, BM
    Sleasman, JW
    Goodenow, MM
    VIROLOGY, 1996, 219 (02) : 407 - 416
  • [26] Molecular tongs containing amino acid mimetic fragments: New inhibitors of wild-type and mutated HIV-1 protease dimerization
    Bannwarth, Ludovic
    Kessler, Albane
    Pethe, Stephanie
    Collinet, Bruno
    Merabet, Naima
    Boggetto, Nicole
    Sicsic, Sames
    Reboud-Ravaux, Michele
    Ongeri, Sandrine
    JOURNAL OF MEDICINAL CHEMISTRY, 2006, 49 (15) : 4657 - 4664
  • [27] Investigating the stereochemistry of binding to HIV-1 protease with inhibitors containing isomers of 4-amino-3-hydroxy-5-phenylpentanoic acid
    Raju, B
    Deshpande, MS
    BIOCHEMICAL AND BIOPHYSICAL RESEARCH COMMUNICATIONS, 1991, 180 (01) : 187 - 190
  • [28] Design, Synthesis, and Pharmacokinetic Evaluation of Phosphate and Amino Acid Ester Prodrugs for Improving the Oral Bioavailability of the HIV-1 Protease Inhibitor Atazanavir
    Subbaiah, Murugaiah A. M.
    Mandlekar, Sandhya
    Desikan, Sridhar
    Ramar, Thangeswaran
    Subramani, Lakshumanan
    Annadurai, Mathiazhagan
    Desai, Salil D.
    Sinha, Sarmistha
    Jenkins, Susan M.
    Krystal, Mark R.
    Subramanian, Murali
    Sridhar, Srikanth
    Padmanabhan, Shweta
    Bhutani, Priyadeep
    Arla, Rambabu
    Singh, Shashyendra
    Sinha, Jaydeep
    Thakur, Megha
    Kadow, John F.
    Meanwell, Nicholas A.
    JOURNAL OF MEDICINAL CHEMISTRY, 2019, 62 (07) : 3553 - 3574
  • [29] Distinct Geographical Clustering of HIV-1 and a Signature Amino Acid at Position 41 of the p24 Unveiled by gag Variability in India
    Acharya, Arpan
    Vaniawala, Salil
    Parekh, Harsh
    Nagee, Anju
    Misra, Rabindra N.
    Wani, Minal
    Mukhopadhyaya, Pratap N.
    CURRENT HIV RESEARCH, 2013, 11 (04) : 295 - 303
  • [30] HIV-1 Gag C-terminal amino acid substitutions emerging under selective pressure of protease inhibitors in patient populations infected with different HIV-1 subtypes
    Li, Guangdi
    Verheyen, Jens
    Theys, Kristof
    Piampongsant, Supinya
    Van Laethem, Kristel
    Vandamme, Anne-Mieke
    RETROVIROLOGY, 2014, 11