Compact Class-Conditional Attribute Category Clustering: Amino Acid Grouping for Enhanced HIV-1 Protease Cleavage Classification

Authors
Saez, Jose A. [1]
Vera, J. Fernando [1]
Affiliations
[1] Univ Granada, Dept Stat & Operat Res, Granada 18071, Spain
Keywords
Amino acids; Classification algorithms; Encoding; Prediction algorithms; Complexity theory; Proposals; Predictive models; HIV-1; protease; octamer cleavage; data representation; category grouping; classification; STATISTICAL TESTS; EVOLUTIONARY; PREDICTION
DOI
10.1109/TCBB.2024.3448617
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Categorical attributes are common in many classification tasks and become harder to handle as the number of categories grows. A large number of categories complicates data handling, lengthens model building time, increases model complexity and, ultimately, degrades classification performance. To mitigate these issues, this research proposes a novel preprocessing technique for grouping attribute categories in classification datasets. The approach combines an exact representation of the association between categorical values in a Euclidean space, clustering methods and attribute quality metrics to group similar attribute categories based on their contribution to the classification task. To estimate its effectiveness, the proposal is evaluated in the context of HIV-1 protease cleavage site prediction, where each attribute represents an amino acid that can take multiple possible values. The results obtained on real-world HIV-1 datasets show a significant reduction in the number of categories per attribute, with an average reduction ranging from 74% to 81%. This reduction leads to simpler data representations and improved classification performance compared with using the data without preprocessing. Specifically, improvements of up to 0.07 in accuracy and 0.19 in geometric mean are observed across different datasets and classification algorithms. Additionally, extensive simulations on synthetic datasets with varied characteristics give consistent and reliable results that support the robustness of the proposal. These findings highlight the capability of the developed method to enhance cleavage prediction, which could potentially contribute to understanding viral processes and developing targeted therapeutic strategies.
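As a rough illustration of the category-grouping idea described in the abstract, the sketch below merges the amino acids observed at one octamer position according to how often each one appears in cleaved examples. It is not the authors' method: the Euclidean representation, clustering algorithm and attribute quality metrics from the paper are replaced here by a one-dimensional cleavage rate and greedy merging of neighbouring categories. The function names, the toy octamers and their labels are all hypothetical.

```python
from collections import defaultdict


def class_profiles(values, labels):
    """Return {category: fraction of examples with label 1} for one attribute."""
    counts = defaultdict(lambda: [0, 0])  # category -> [n_total, n_positive]
    for value, label in zip(values, labels):
        counts[value][0] += 1
        counts[value][1] += int(label == 1)
    return {cat: pos / total for cat, (total, pos) in counts.items()}


def group_categories(values, labels, n_groups):
    """Greedily merge categories with the closest positive-class rates.

    Assumption: a single positive-class rate stands in for the richer
    association measure used in the paper.
    """
    profiles = class_profiles(values, labels)
    # Start from singleton clusters ordered by rate (ties broken by name).
    ordered = sorted(profiles.items(), key=lambda kv: (kv[1], kv[0]))
    clusters = [([cat], rate) for cat, rate in ordered]
    while len(clusters) > n_groups:
        # Find the adjacent pair of clusters with the smallest rate gap.
        i = min(range(len(clusters) - 1),
                key=lambda j: clusters[j + 1][1] - clusters[j][1])
        merged_cats = clusters[i][0] + clusters[i + 1][0]
        # Unweighted average of the two rates keeps the list sorted.
        merged_rate = (clusters[i][1] + clusters[i + 1][1]) / 2
        clusters[i:i + 2] = [(merged_cats, merged_rate)]
    return {cat: group for group, (cats, _) in enumerate(clusters) for cat in cats}


# Hypothetical toy data: octamers with a cleaved (1) / not cleaved (0) label.
octamers = ["AAAFLAAA", "AAAFPAAA", "AAALAAAA", "AAAGAAAA",
            "AAAYAAAA", "AAAKAAAA", "AAAFAAAA", "AAAGLAAA"]
labels = [1, 1, 1, 0, 1, 0, 1, 0]

position = 3  # zero-based index of the attribute (octamer position) to recode
column = [octamer[position] for octamer in octamers]
mapping = group_categories(column, labels, n_groups=2)
recoded = [mapping[residue] for residue in column]
print(mapping)
print(recoded)
```

In this toy setting the five residues seen at the chosen position collapse into two groups, which mirrors the kind of per-attribute category reduction the abstract reports. A real application would repeat the grouping for every octamer position and choose the number of groups from the data rather than fixing it by hand.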
Pages: 2167 - 2178
Page count: 12
Related papers
44 in total
  • [1] HIV-1 Protease Cleavage Site Prediction Based on Amino Acid Property
    Niu, Bing
    Lu, Lin
    Liu, Liang
    Gu, Tian Hong
    Feng, Kai-Yan
    Lu, Wen-Cong
    Cai, Yu-Dong
    JOURNAL OF COMPUTATIONAL CHEMISTRY, 2009, 30 (01) : 33 - 39
  • [2] Covariation of amino acid positions in HIV-1 protease
    Hoffman, NG
    Schiffer, CA
    Swanstrom, R
    VIROLOGY, 2003, 314 (02) : 536 - 548
  • [3] Deep recurrent neural networks in HIV-1 protease cleavage classification
    Shayanfar, Nima
    Derhami, Vali
    Rezaeian, Mehdi
    INTERNATIONAL JOURNAL OF DATA MINING AND BIOINFORMATICS, 2017, 19 (04) : 298 - 311
  • [4] CLEAVAGE OF THE AMINO-TERMINAL DOMAIN OF CD36 BY HIV-1 PROTEASE
    GREENWALT, DE
    PROTEIN AND PEPTIDE LETTERS, 1995, 2 (02) : 351 - 354
  • [5] Cognitive Framework for HIV-1 Protease Cleavage Site Classification Using Evolutionary Algorithm
    Deepak Singh
    Dilip Singh Sisodia
    Pradeep Singh
    Arabian Journal for Science and Engineering, 2019, 44 : 9007 - 9027
  • [6] Cognitive Framework for HIV-1 Protease Cleavage Site Classification Using Evolutionary Algorithm
    Singh, Deepak
    Sisodia, Dilip Singh
    Singh, Pradeep
    ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING, 2019, 44 (11) : 9007 - 9027
  • [7] HIV-1 subtype B protease and reverse transcriptase amino acid covariation
    Rhee, Soo-Yon
    Liu, Tommy F.
    Holmes, Susan P.
    Shafer, Robert W.
    PLOS COMPUTATIONAL BIOLOGY, 2007, 3 (05) : 836 - 843
  • [8] Effects of amino acid and synonymous polymorphisms in HIV-1 protease on viral fitness
    Frost, SDW
    Pond, SKK
    Grossman, Z
    Daar, E
    Condra, J
    Richman, DD
    Little, SJ
    Brown, AJL
    ANTIVIRAL THERAPY, 2003, 8 (03) : U77 - U78
  • [9] ISOPHTHALIC ACID-DERIVATIVES - AMINO-ACID SURROGATES FOR THE INHIBITION OF HIV-1 PROTEASE
    KALDOR, SW
    DRESSMAN, BA
    HAMMOND, M
    APPELT, K
    BURGESS, JA
    LUBBEHUSEN, PP
    MUESING, MA
    HATCH, SD
    WISKERCHEN, MA
    BAXTER, AJ
    BIOORGANIC & MEDICINAL CHEMISTRY LETTERS, 1995, 5 (07) : 721 - 726
  • [10] Amino Acid Prodrugs: An Approach to Improve the Absorption of HIV-1 Protease Inhibitor, Lopinavir
    Patel, Mitesh
    Mandava, Nanda
    Gokulgandhi, Mitan
    Pal, Dhananjay
    Mitra, Ashim K.
    PHARMACEUTICALS, 2014, 7 (04) : 433 - 452