Measuring Similarity among Protein Sequences Using a New Descriptor

Cited by: 7
Authors
Abo-Elkhier, Mervat M. [1]
Abd Elwahaab, Marwa A. [1]
Abo El Maaty, Moheb I. [1]
Affiliations
[1] Mansoura Univ, Dept Engn Math & Phys, Fac Engn, Mansoura 35516, Egypt
Keywords
2-D GRAPHICAL REPRESENTATION; PHYSICOCHEMICAL PROPERTIES; DNA-SEQUENCES; ALIGNMENT; SEARCH; 2D;
DOI
10.1155/2019/2796971
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Subject classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Comparing protein sequences by similarity is a fundamental task in today's biomedical research. With advances in sequencing technologies, the number of protein sequences in public databases is growing exponentially. The best-known sequence comparison methods are alignment based. They generally give excellent results when the sequences under study are closely related, but they are time consuming. Herein, a new alignment-free method is introduced. Our technique depends on a new graphical representation and a new descriptor. The graphical representation provides a simple way to visualize a protein sequence. The descriptor compresses the primary sequence into a single vector composed of only two values. Our approach gives good results for both short and long sequences with little computation time. It is applied to nine beta globin, nine ND5 (NADH dehydrogenase subunit 5), and 24 spike protein sequences. Correlation and significance analyses are also introduced to compare our similarity/dissimilarity results with the results of other approaches and with sequence homology.
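The abstract does not define the descriptor, so the following is only a minimal sketch of the general workflow it describes: compress each sequence into a two-value vector, compare sequences by the distance between their vectors, and check agreement with another method through correlation. The descriptor used here (mean and standard deviation of an arbitrary numeric residue code) is a hypothetical placeholder and is not the authors' descriptor. The names `CODE`, `descriptor`, `distance`, `pairwise_distances`, and `pearson` are likewise assumptions introduced for illustration.

```python
import math
from itertools import combinations

# Arbitrary numeric code for the 20 standard amino acids.
# ASSUMPTION: a placeholder encoding, not the paper's graphical representation.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
CODE = {aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}


def descriptor(seq):
    """Compress a sequence into two values (hypothetical stand-in descriptor)."""
    values = [CODE[aa] for aa in seq.upper() if aa in CODE]
    if not values:
        raise ValueError("sequence contains no standard amino acids")
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return (mean, math.sqrt(variance))


def distance(seq_a, seq_b):
    """Euclidean distance between the two-value descriptors of two sequences."""
    return math.dist(descriptor(seq_a), descriptor(seq_b))


def pairwise_distances(seqs):
    """Map each unordered pair of sequence names to its descriptor distance."""
    return {
        (a, b): distance(seqs[a], seqs[b])
        for a, b in combinations(sorted(seqs), 2)
    }


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of distances."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


if __name__ == "__main__":
    # Toy sequences invented for illustration; they are not data from the paper.
    toy = {"s1": "ACDEFGHIK", "s2": "ACDEFGHIR", "s3": "WYWYVVTTS"}
    for pair, d in pairwise_distances(toy).items():
        print(pair, round(d, 4))
```

Because each sequence is reduced to a fixed-length vector, comparing n sequences needs only n descriptor computations plus cheap vector distances, with no pairwise alignment. The `pearson` helper shows one way to correlate the resulting distances with those from a reference method.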
Pages: 10