Protein Classification Using N-gram Technique and Association Rules

Cited by: 1
Authors
Kabli, Fatima [1]
Hamou, Reda Mohamed [1]
Amine, Abdelmalek [1]
Affiliation
[1] Tahar Moulay Univ Saida, Dept Comp Sci, GeCode Lab, Saida, Algeria
Keywords
Apriori; Classification; KDD Process; N-Gram; Protein Sequences; Relevant Association Rules
DOI
10.4018/IJSI.2018040106
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
Knowledge extraction from biological data is receiving increasing attention; it addresses general tasks such as grouping, classification, and association. Protein classification is an important activity that helps biologists respond to biological needs. For this reason, the authors present a global framework, inspired by the knowledge extraction process from biological data, that classifies proteins from their primary structure based on association rules. The framework has three main steps. The first, the pre-processing phase, extracts descriptors using the N-gram technique. The second extracts association rules by applying the Apriori algorithm. The third selects the relevant rules and applies the classifier. The experiments were performed on five classes of proteins extracted from the UniProt data bank, and the results were compared with five classification methods on the WEKA platform. The obtained results satisfied the authors' purpose of proposing an effective protein classifier supported by the N-gram technique and the Apriori algorithm.
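The three steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy sequences, the class labels "A" and "B", the n-gram length, and the support and confidence thresholds are all hypothetical choices made for illustration, and the rule-selection step is reduced to a plain confidence filter.

```python
from collections import Counter
from itertools import combinations


def ngrams(sequence, n=2):
    """Step 1 (assumed form): the set of overlapping n-grams in a sequence."""
    return {sequence[i:i + n] for i in range(len(sequence) - n + 1)}


def frequent_itemsets(transactions, min_support, max_size=2):
    """Step 2: Apriori-style level-wise search for frequent n-gram sets."""
    total = len(transactions)
    frequent = {}
    # Level 1 candidates: every single n-gram seen in any transaction.
    candidates = {frozenset([item]) for t in transactions for item in t}
    size = 1
    while candidates and size <= max_size:
        counts = Counter()
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        level = {c: counts[c] / total for c in candidates
                 if counts[c] / total >= min_support}
        frequent.update(level)
        # Join and prune: a larger candidate survives only if all of its
        # subsets one item smaller are frequent (the Apriori property).
        items = sorted(set().union(*level)) if level else []
        size += 1
        candidates = {
            frozenset(combo) for combo in combinations(items, size)
            if all(frozenset(sub) in level
                   for sub in combinations(combo, size - 1))
        }
    return frequent


def class_rules(labelled, min_support, min_confidence, n=2):
    """Step 3 (simplified): keep rules 'n-gram set -> class' by confidence."""
    transactions = [frozenset(ngrams(seq, n)) for seq, _ in labelled]
    rules = []
    for itemset, support in frequent_itemsets(transactions, min_support).items():
        covered = [label for t, (_, label) in zip(transactions, labelled)
                   if itemset <= t]
        label, hits = Counter(covered).most_common(1)[0]
        confidence = hits / len(covered)
        if confidence >= min_confidence:
            rules.append((itemset, label, support, confidence))
    return rules


# Toy data: made-up sequences and labels, not taken from UniProt.
labelled = [("MKVL", "A"), ("MKVA", "A"), ("GGSL", "B"), ("GGSA", "B")]
rules = class_rules(labelled, min_support=0.5, min_confidence=1.0)
```

In this sketch a new sequence could be classified by collecting the rules whose n-gram set it contains and voting on their class labels; how the paper selects and applies its relevant rules is not specified in the abstract.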
Pages: 77-89
Number of pages: 13