Normalized Feature Vectors: A Novel Alignment-Free Sequence Comparison Method Based on the Numbers of Adjacent Amino Acids

Cited by: 133
Authors
Huang, De-Shuang [1]
Yu, Hong-Jie [2]
Affiliations
[1] Tongji Univ, Sch Elect & Informat Engn, Shanghai 201804, Peoples R China
[2] Anhui Sci & Technol Univ, Sch Sci, Dept Math, Fengyang 233100, Anhui, Peoples R China
Funding
China Postdoctoral Science Foundation; US National Science Foundation;
Keywords
Adjacent amino acids; normalized feature vector; singular value decomposition (SVD); alignment free; similarity analysis; 2D GRAPHICAL REPRESENTATION; PROTEIN SEQUENCES; PHYSICOCHEMICAL PROPERTIES; DNA-SEQUENCES; PHYLOGENY; EVOLUTION; DISTANCE;
DOI
10.1109/TCBB.2013.10
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Based on all 400 possible pairs of adjacent amino acids (AAA), we map each protein primary sequence of length L into a 400 × (L − 1) matrix M. From the singular value decomposition (SVD) of this matrix, we then derive a normalized 400-tuple mathematical descriptor D. The resulting 400-dimensional normalized feature vectors (NFVs) support quantitative analysis of protein sequences. Using this normalized representation, we analyze sequence similarity on two data sets: 1) ND5 sequences from nine species and 2) transferrin sequences of 24 vertebrates. We also compare the results of this study with those of related works. These two experiments illustrate that the proposed NFV-AAA approach performs well in sequence similarity analysis.
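The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the columns of M are encoded or how D is extracted from the SVD, so the sketch assumes a 0/1 indicator column per adjacent pair and takes D as the diagonal of U·S²·Uᵀ scaled to unit length. The function names, the handling of non-standard residues, and the Euclidean distance are all hypothetical choices.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Row index for each of the 400 ordered pairs of adjacent amino acids.
PAIR_INDEX = {a + b: 20 * i + j
              for i, a in enumerate(AMINO_ACIDS)
              for j, b in enumerate(AMINO_ACIDS)}


def adjacency_matrix(seq):
    """Build a 400 x (L - 1) matrix M; column j marks the pair seq[j:j+2].

    Assumption: each column is a 0/1 indicator. Pairs containing a
    non-standard residue leave their column at zero.
    """
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two residues")
    m = np.zeros((400, len(seq) - 1))
    for j in range(len(seq) - 1):
        row = PAIR_INDEX.get(seq[j:j + 2])
        if row is not None:
            m[row, j] = 1.0
    return m


def normalized_feature_vector(seq):
    """Return a unit-length 400-D descriptor derived from the SVD of M.

    Assumption: D is the diagonal of U * S^2 * U^T, which does not depend
    on how the SVD routine picks a basis for repeated singular values.
    """
    u, s, _ = np.linalg.svd(adjacency_matrix(seq), full_matrices=False)
    d = np.sum((u * s) ** 2, axis=1)
    norm = np.linalg.norm(d)
    return d / norm if norm > 0 else d


def nfv_distance(seq_a, seq_b):
    """Euclidean distance between two descriptors (an assumed metric)."""
    return float(np.linalg.norm(normalized_feature_vector(seq_a)
                                - normalized_feature_vector(seq_b)))
```

Under these assumptions, `nfv_distance(seq_a, seq_b)` gives a pairwise dissimilarity that could be collected into a distance matrix for the kind of similarity analysis the abstract mentions. With an indicator encoding, the descriptor reduces to normalized adjacent-pair counts, so the paper's actual construction may carry more information than this sketch does.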
Pages: 457-467
Page count: 11
Related papers
40 in total
  • [1] Protein map: An alignment-free sequence comparison method based on various properties of amino acids
    Yu, Chenglong
    Cheng, Shiu-Yuen
    He, Rong L.
    Yau, Stephen S. -T.
    [J]. GENE, 2011, 486 (1-2) : 110 - 118
  • [2] An alignment-free measure based on physicochemical properties of amino acids for protein sequence comparison
    Zhao, Yunxiu
    Xue, Xiaolong
    Xie, Xiaoli
    [J]. COMPUTATIONAL BIOLOGY AND CHEMISTRY, 2019, 80 : 10 - 15
  • [3] An accurate alignment-free protein sequence comparator based on physicochemical properties of amino acids
    Saeedeh Akbari Rokn Abadi
    Azam Sadat Abdosalehi
    Faezeh Pouyamehr
    Somayyeh Koohi
    [J]. Scientific Reports, 12
  • [4] An accurate alignment-free protein sequence comparator based on physicochemical properties of amino acids
    Abadi, Saeedeh Akbari Rokn
    Abdosalehi, Azam Sadat
    Pouyamehr, Faezeh
    Koohi, Somayyeh
    [J]. SCIENTIFIC REPORTS, 2022, 12 (01)
  • [5] Mapping sequence to feature vector using numerical representation of codons targeted to amino acids for alignment-free sequence analysis
    Das, Jayanta Kumar
    Sengupta, Antara
    Choudhury, Pabitra Pal
    Roy, Swarup
    [J]. GENE, 2021, 766
  • [6] A phylogenetic analysis of the Brassicales clade based on an alignment-free sequence comparison method
    Hatje, Klas
    Kollmar, Martin
    [J]. FRONTIERS IN PLANT SCIENCE, 2012, 3
  • [7] Statistical considerations underpinning an alignment-free sequence comparison method
    Junmei Jing
    Conrad J. Burden
    Sylvain Forêt
    Susan R. Wilson
    [J]. Journal of the Korean Statistical Society, 2010, 39 : 325 - 335
  • [8] Statistical considerations underpinning an alignment-free sequence comparison method
    Jing, Junmei
    Burden, Conrad J.
    Foret, Sylvain
    Wilson, Susan R.
    [J]. JOURNAL OF THE KOREAN STATISTICAL SOCIETY, 2010, 39 (03) : 325 - 335
  • [9] Weighted measures based on maximizing deviation for alignment-free sequence comparison
    Qian, Kun
    Luan, Yihui
    [J]. PHYSICA A-STATISTICAL MECHANICS AND ITS APPLICATIONS, 2017, 481 : 235 - 242
  • [10] SWSPM: A Novel Alignment-Free DNA Comparison Method Based on Signal Processing Approaches
    Farkas, Tomas
    Sitarcik, Jozef
    Brejova, Brona
    Lucka, Maria
    [J]. EVOLUTIONARY BIOINFORMATICS, 2019, 15