Measuring Similarity among Protein Sequences Using a New Descriptor

Cited by: 7
Authors
Abo-Elkhier, Mervat M. [1 ]
Abd Elwahaab, Marwa A. [1 ]
Abo El Maaty, Moheb I. [1 ]
Affiliation
[1] Mansoura Univ, Dept Engn Math & Phys, Fac Engn, Mansoura 35516, Egypt
Keywords
2-D GRAPHICAL REPRESENTATION; PHYSICOCHEMICAL PROPERTIES; DNA-SEQUENCES; ALIGNMENT; SEARCH; 2D;
DOI
10.1155/2019/2796971
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Comparing protein sequences by similarity is a fundamental task in today's biomedical research. With advances in sequencing technologies, the number of protein sequences in public databases is growing exponentially. The best-known sequence comparison methods are alignment based. They generally give excellent results when the sequences under study are closely related, but they are time consuming. Herein, a new alignment-free method is introduced. Our technique depends on a new graphical representation and a new descriptor. The graphical representation offers a simple way to visualize a protein sequence. The descriptor compresses the primary sequence into a single vector composed of only two values. Our approach gives good results for both short and long sequences with little computation time. It is applied to nine beta globin, nine ND5 (NADH dehydrogenase subunit 5), and 24 spike protein sequences. Correlation and significance analyses are also introduced to compare our similarity/dissimilarity results with the results of other approaches and with sequence homology.
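The abstract does not spell out the descriptor, so the following is only a minimal sketch of the general alignment-free idea it describes: map each sequence to a two-value vector, then compare the vectors directly. The descriptor used here (mean and standard deviation of Kyte-Doolittle hydropathy) is a stand-in assumption, not the authors' descriptor, and the function names and toy sequences are hypothetical.

```python
import math

# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def descriptor(seq):
    """Hypothetical two-value descriptor: (mean, std) of hydropathy.

    This is an illustrative assumption, NOT the descriptor from the paper.
    """
    values = [HYDROPATHY[aa] for aa in seq.upper()]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return (mean, math.sqrt(variance))


def distance(seq_a, seq_b):
    """Euclidean distance between the two-value descriptors."""
    return math.dist(descriptor(seq_a), descriptor(seq_b))


def distance_matrix(seqs):
    """Pairwise dissimilarity matrix; no alignment step is involved."""
    vectors = [descriptor(s) for s in seqs]
    return [[math.dist(u, v) for v in vectors] for u in vectors]


if __name__ == "__main__":
    # Toy strings for illustration only, not real protein sequences.
    toy = ["MVLSAAGK", "MVLSPADK", "KKDDEERR"]
    for row in distance_matrix(toy):
        print(["%.3f" % d for d in row])
```

Because each sequence is reduced to a fixed-length vector once, comparing n sequences costs one pass over each sequence plus n² cheap vector distances, which is the source of the speed advantage over pairwise alignment.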
Pages: 10