A context evaluation approach for structural comparison of proteins using cross entropy over n-gram modelling

Cited by: 0
Authors
Razmara, Jafar [1]
Deris, Safaai B. [1]
Parvizpour, Sepideh [2]
Affiliations
[1] Univ Teknol Malaysia, Fac Comp, Johor Baharu, Malaysia
[2] Univ Teknol Malaysia, Fac Biosci & Med Engn, Johor Baharu, Malaysia
Keywords
Protein structure comparison; Structure alignment; Sequence alignment; Text modelling; STRUCTURE ALIGNMENT; STRUCTURE DATABASE; SEARCH; SIMILARITY; ALPHABET; TOOL;
DOI
10.1016/j.compbiomed.2013.07.022
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
The structural comparison of proteins is a vital step in structural biology, used to predict and analyse the function of new, unknown proteins. Although a number of techniques have been explored, the development of alternative methods remains an active research area. The present paper introduces a text modelling-based technique for the structural comparison of proteins. The method encodes the secondary and tertiary structure of a protein as two linear sequences and then uses these sequences to compare two structures. The pairwise sequence comparison is adopted from computational linguistics and its well-known techniques for analysing and quantifying textual sequences: an n-gram modelling technique captures regularities between sequences, and the cross-entropy concept is then employed to measure their similarity. Several experiments are conducted to evaluate the performance of the method and compare it with other commonly used programs. The information retrieval assessments demonstrate that the technique runs at a speed similar to that of other linear encoding methods, such as 3D-BLAST, SARST, and TS-AMIR, whereas its accuracy is comparable to that of CE and TM-align, which are high-accuracy comparison tools. Accordingly, the results demonstrate that the algorithm is highly efficient compared with other state-of-the-art methods. (C) 2013 Elsevier Ltd. All rights reserved.
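The general idea of scoring one linear sequence against an n-gram model built from another can be illustrated with a minimal sketch. This is not the authors' implementation: the three-letter alphabet `HEC` (helix, strand, coil), the add-one smoothing, the default `n=3`, and the function names are all assumptions made for illustration, since the abstract does not specify the encoding or the smoothing scheme.

```python
import math
from collections import Counter


def ngrams(seq, n):
    """Return all overlapping n-grams of a string."""
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]


def train(seq, n):
    """Count n-grams and their (n-1)-symbol contexts in seq."""
    grams = Counter(ngrams(seq, n))
    contexts = Counter()
    for gram, count in grams.items():
        contexts[gram[:-1]] += count
    return grams, contexts


def cross_entropy(model_seq, test_seq, n=3, alphabet="HEC"):
    """Average bits per n-gram of test_seq under an n-gram model of model_seq.

    Add-one smoothing (an assumption) keeps unseen n-grams from
    receiving zero probability. Lower values mean test_seq looks
    more like model_seq.
    """
    grams, contexts = train(model_seq, n)
    test = ngrams(test_seq, n)
    if not test:
        raise ValueError("test_seq is shorter than n")
    v = len(alphabet)
    total = 0.0
    for gram in test:
        p = (grams[gram] + 1) / (contexts[gram[:-1]] + v)
        total -= math.log2(p)
    return total / len(test)


# Hypothetical secondary-structure strings, invented for this example.
a = "HHHHCCEEEECCHHHH"
c = "CECECECECECECECE"
print(cross_entropy(a, a), cross_entropy(a, c))
```

In this sketch a sequence scored against its own model yields a lower cross entropy than a sequence with a different local pattern, which is the property a similarity measure of this kind relies on. Note that cross entropy is asymmetric, so a practical comparison would need to decide how to combine the two directions.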
Pages: 1614-1621
Number of pages: 8