Dive into machine learning algorithms for influenza virus host prediction with hemagglutinin sequences

Cited by: 7
Authors
Xu, Yanhua [1]
Wojtczak, Dominik [1]
Affiliations
[1] Univ Liverpool, Dept Comp Sci, Liverpool L69 3BX, England
Keywords
Influenza virus; Position-specific scoring matrix; Transformer; Convolutional neural network; Machine learning; AVIAN INFLUENZA; PSI-BLAST; EVOLUTION; SWINE; PERFORMANCE; EMERGENCE; INFECTION; ECOLOGY;
DOI
10.1016/j.biosystems.2022.104740
Chinese Library Classification (CLC)
Q [Biological sciences]
Discipline classification codes
07; 0710; 09
Abstract
Influenza viruses mutate rapidly and can pose a threat to public health, especially to those in vulnerable groups. Throughout history, influenza A viruses have caused pandemics between different species. It is important to identify the origin of a virus in order to prevent the spread of an outbreak. Recently, there has been increasing interest in using machine learning algorithms to provide fast and accurate predictions for viral sequences. In this study, real testing data sets and a variety of evaluation metrics were used to evaluate machine learning algorithms at different taxonomic levels. As hemagglutinin is the major protein in the immune response, only hemagglutinin sequences were used and represented by position-specific scoring matrix and word embedding. The results suggest that the 5-grams-transformer neural network is the most effective algorithm for predicting viral sequence origins, with approximately 99.54% AUCPR, 98.01% F1 score and 96.60% MCC at a higher classification level, and approximately 94.74% AUCPR, 87.41% F1 score and 80.79% MCC at a lower classification level.
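To make the terms in the abstract concrete, here is a minimal Python sketch of two of the ideas it mentions: splitting a protein sequence into 5-grams before embedding, and computing the F1 score and MCC from binary confusion-matrix counts. It is an illustration only, not the authors' pipeline. The function names, the overlapping stride-1 windowing, and the toy sequence are assumptions; the abstract does not state how the 5-grams were formed.

```python
from math import sqrt


def kgrams(sequence: str, k: int = 5) -> list[str]:
    """Split a sequence into overlapping k-grams.

    Assumption: a sliding window with stride 1. The abstract does not
    say whether the 5-grams overlap.
    """
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for binary labels.

    Returns 0.0 when any marginal total is zero (a common convention).
    """
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    toy_sequence = "MKTIIALSY"  # hypothetical fragment, not real study data
    print(kgrams(toy_sequence))       # 5 overlapping 5-grams
    print(f1_score(tp=8, fp=2, fn=2))
    print(mcc(tp=8, tn=8, fp=2, fn=2))
```

For a multi-class host label, these binary formulas would have to be applied per class and averaged, or replaced by a multi-class generalisation. Which variant the study used is not stated in the abstract.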
Pages: 11