Protein Classification Using N-gram Technique and Association Rules

Cited by: 1
Authors
Kabli, Fatima [1]
Hamou, Reda Mohamed [1]
Amine, Abdelmalek [1]
Affiliation
[1] Tahar Moulay Univ Saida, Dept Comp Sci, GeCode Lab, Saida, Algeria
Keywords
Apriori; Classification; KDD Process; N-Gram; Protein Sequences; Relevant Association Rules;
DOI
10.4018/IJSI.2018040106
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202; 0835;
Abstract
Knowledge extraction from biological data is receiving increasing attention; it addresses general tasks such as grouping, classification, and association. Protein classification is an important activity that helps biologists meet biological needs. For this reason, the authors present a global framework, inspired by the knowledge extraction process from biological data, to classify proteins from their primary structure based on association rules. The framework has three main steps. The first, the pre-processing phase, consists of extracting descriptors with the N-gram technique. The second is the extraction of association rules by applying the Apriori algorithm. The third is selecting the relevant rules and applying the classifier. The experiments were performed on five classes of proteins extracted from the UniProt data bank, and the technique was compared with five classification methods in the WEKA platform. The obtained results satisfied the authors' aim of proposing an effective protein classifier supported by the N-gram technique and the Apriori algorithm.
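The three steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy sequences, the class labels "A" and "B", the thresholds `min_support` and `min_confidence`, the itemset length cap `max_len`, and the summed-confidence voting in `classify` are all assumptions made for illustration. The record does not give the paper's actual parameters or its rule-selection criterion.

```python
from collections import defaultdict
from itertools import combinations


def ngrams(sequence, n=2):
    """Step 1 (pre-processing): set of overlapping n-grams of a sequence."""
    return {sequence[i:i + n] for i in range(len(sequence) - n + 1)}


def frequent_itemsets(transactions, min_support, max_len=2):
    """Step 2a: level-wise (Apriori-style) search for frequent itemsets."""
    n_tx = len(transactions)
    items = {item for tx in transactions for item in tx}
    level = [frozenset([item]) for item in items]
    frequent = {}
    size = 1
    while level and size <= max_len:
        counts = {c: sum(1 for tx in transactions if c <= tx) for c in level}
        kept = {c: k / n_tx for c, k in counts.items() if k / n_tx >= min_support}
        frequent.update(kept)
        # Join step: merge frequent itemsets into candidates one item larger.
        size += 1
        candidates = {a | b for a, b in combinations(kept, 2) if len(a | b) == size}
        # Prune step: every (size - 1)-subset of a candidate must be frequent.
        level = [c for c in candidates
                 if all(frozenset(s) in kept for s in combinations(c, size - 1))]
    return frequent


def mine_class_rules(sequences, labels, n=2, min_support=0.3, min_confidence=0.8):
    """Steps 2b and 3a: build rules 'itemset -> class', keep confident ones."""
    transactions = [ngrams(s, n) for s in sequences]
    rules = []
    for itemset in frequent_itemsets(transactions, min_support):
        covered = [lab for tx, lab in zip(transactions, labels) if itemset <= tx]
        for cls in set(covered):
            confidence = covered.count(cls) / len(covered)
            if confidence >= min_confidence:
                rules.append((itemset, cls, confidence))
    return rules


def classify(sequence, rules, n=2):
    """Step 3b: vote by summed confidence of rules matching the sequence."""
    grams = ngrams(sequence, n)
    scores = defaultdict(float)
    for itemset, cls, confidence in rules:
        if itemset <= grams:
            scores[cls] += confidence
    return max(scores, key=scores.get) if scores else None


# Hypothetical toy data, not taken from UniProt.
train_sequences = ["MKVLA", "MKVLG", "MKVAA", "GGSPT", "GGSPA", "TGSPT"]
train_labels = ["A", "A", "A", "B", "B", "B"]
rules = mine_class_rules(train_sequences, train_labels)

print(classify("MKVLL", rules))
print(classify("AGSPT", rules))
```

A sequence that matches no selected rule yields `None`. A practical classifier would need a fallback such as a default class, which this sketch leaves out.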
Pages: 77-89
Number of pages: 13
Related papers
50 in total
  • [1] Classification of facemarks using N-gram
    Yamada, Thichi
    Tsuchiya, Seiji
    Kuroiwa, Shiongo
    Ren, Fuji
    [J]. PROCEEDINGS OF THE 2007 IEEE INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND KNOWLEDGE ENGINEERING (NLP-KE'07), 2007, : 322 - +
  • [2] Sentiment Analysis Using N-gram Technique
    Chidananda, Himadri Tanaya
    Das, Debashis
    Sagnika, Santwana
    [J]. PROGRESS IN COMPUTING, ANALYTICS AND NETWORKING, ICCAN 2017, 2018, 710 : 359 - 367
  • [3] Text authorship detection using decision trees and association rules over N-gram
    Course of Information and Computer Sciences, Graduate School of Kanagawa Institute of Technology, 1030 Shimo-ogino, Atsugi-shi, Kanagawa 243-0292, Japan
    [J]. Proc. IADIS Int. Conf. Intelligent Syst. Agents, Proc. IADIS Eur. Conf. Data Min., Part MCCSIS, (167-170):
  • [4] Protein Classification using Modified N-gram and Skip-gram Models Extended Abstract
    Islam, S. M. Ashiqul
    Kearney, Christopher Michel
    Choudhury, Ankan
    Baker, Erich J.
    [J]. ACM-BCB' 2017: PROCEEDINGS OF THE 8TH ACM INTERNATIONAL CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY,AND HEALTH INFORMATICS, 2017, : 586 - 586
  • [5] An ensemble text classification model combining strong rules and N-Gram
    Liu, Jinhong
    Lu, Yuliang
    [J]. ICNC 2007: THIRD INTERNATIONAL CONFERENCE ON NATURAL COMPUTATION, VOL 3, PROCEEDINGS, 2007, : 535 - +
  • [6] Intelligent Assessment Using Variable N-gram Technique
    Kar, Sadhu Prasad
    Chatterjee, Rajeev
    Mandal, Jyotsna Kumar
    [J]. IMPACT OF THE 4TH INDUSTRIAL REVOLUTION ON ENGINEERING EDUCATION, ICL2019, VOL 2, 2020, 1135 : 30 - 37
  • [7] Distributing N-Gram Graphs for Classification
    Kontopoulos, Ioannis
    Giannakopoulos, George
    Varlamis, Iraklis
    [J]. NEW TRENDS IN DATABASES AND INFORMATION SYSTEMS, ADBIS 2017, 2017, 767 : 3 - 11
  • [8] Chinese Personal Name Recognition Using N-gram Model and Rules
    Chen Lin
    Zhang Hui
    Li Zhen'an
    [J]. 2012 7TH INTERNATIONAL CONFERENCE ON COMPUTING AND CONVERGENCE TECHNOLOGY (ICCCT2012), 2012, : 450 - 453
  • [9] Automatic Chinese Text Classification Using N-Gram Model
    Yen, Show-Jane
    Lee, Yue-Shi
    Wu, Yu-Chieh
    Ying, Jia-Ching
    Tseng, Vincent S.
    [J]. COMPUTATIONAL SCIENCE AND ITS APPLICATIONS - ICCSA 2010, PT 3, PROCEEDINGS, 2010, 6018 : 458 - +
  • [10] Document classification using n-gram and word semantic similarity
    Ren, Mei-Ying
    Kang, Sinjae
    [J]. International Journal of Future Generation Communication and Networking, 2015, 8 (08): : 111 - 118