Biological Data Mining for Genomic Clustering Using Unsupervised Neural Learning

Authors
Sen, Shreyas [1]
Narasimhan, Seetharam [1]
Konar, Amit [1]
Affiliation
[1] Jadavpur Univ, Elect & Telecommun Engn Dept, Kolkata 700032, W Bengal, India
Keywords
DNA-descriptors; Feature Descriptors; Principal Component Analysis (PCA); Self-Organizing Feature Map (SOFM)
DOI
Not available
Chinese Library Classification (CLC) number
T [Industrial Technology]
Discipline classification code
08
Abstract
This paper presents a scheme for automatically identifying a species from its genome sequence. A set of 64 three-letter keywords is first generated from the four bases A, T, C and G. Each keyword is searched for in N randomly sampled genome sequences of fixed length (10,000 bases), and the frequency of each of the 4^3 = 64 keywords is counted to obtain a DNA-descriptor for each sample. Principal component analysis (PCA) is then applied to the DNA-descriptors of the N samples, yielding a unique feature descriptor that identifies the species from its genome sequence. Because the variance of the descriptors for a given genome sequence is negligible, the scheme is well suited to automatic species identification. The paper also discusses an alternative approach to species classification and identification based on the Self-Organizing Feature Map (SOFM). The map is trained with DNA-descriptors from different species as inputs, and maps of different dimensions are constructed and compared to find the best-performing configuration. This gives a novel method for identifying a species from its genome sequence using a two-dimensional map of neuronal clusters, in which each cluster represents a particular species. The map is shown to make it easier to recognize and classify a species from its genomic data.
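The descriptor and SOFM steps described in the abstract can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the authors' implementation. Several details are assumptions that the abstract does not state: keywords are counted over overlapping windows, counts are normalized to relative frequencies, and the map uses a Gaussian neighbourhood with linearly decaying learning rate and radius. All function and variable names (`dna_descriptor`, `sample_descriptors`, `train_som`, `best_unit`) are hypothetical, and the PCA step is left out.

```python
import itertools
import math
import random

BASES = "ATCG"
# All 4^3 = 64 three-letter keywords, in a fixed order.
KEYWORDS = ["".join(p) for p in itertools.product(BASES, repeat=3)]
INDEX = {kw: i for i, kw in enumerate(KEYWORDS)}


def dna_descriptor(fragment):
    """Return 64 relative keyword frequencies (assumption: overlapping windows)."""
    counts = [0] * len(KEYWORDS)
    for i in range(len(fragment) - 2):
        j = INDEX.get(fragment[i:i + 3])
        if j is not None:  # skip windows containing symbols other than A, T, C, G
            counts[j] += 1
    total = sum(counts) or 1  # avoid division by zero on very short input
    return [c / total for c in counts]


def sample_descriptors(genome, n_samples, length, rng):
    """Draw n_samples random fragments of the given length and describe each one."""
    descriptors = []
    for _ in range(n_samples):
        start = rng.randrange(len(genome) - length + 1)
        descriptors.append(dna_descriptor(genome[start:start + length]))
    return descriptors


def best_unit(weights, x):
    """Return the (row, col) of the map unit whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda rc: sum((w - v) ** 2 for w, v in zip(weights[rc], x)),
    )


def train_som(data, rows, cols, epochs, rng):
    """Train a rows x cols map (assumed schedule: linear decay of rate and radius)."""
    weights = {
        (r, c): list(rng.choice(data)) for r in range(rows) for c in range(cols)
    }
    steps = epochs * len(data)
    t = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # visit samples in random order
            frac = t / steps
            lr = 0.5 * (1.0 - frac)
            sigma = max(0.5, (max(rows, cols) / 2.0) * (1.0 - frac))
            br, bc = best_unit(weights, x)
            for (r, c), w in weights.items():
                # Gaussian neighbourhood centred on the best-matching unit.
                h = math.exp(-((r - br) ** 2 + (c - bc) ** 2) / (2.0 * sigma ** 2))
                for k in range(len(w)):
                    w[k] += lr * h * (x[k] - w[k])
            t += 1
    return weights


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stand-ins for two genomes with different base compositions.
    genome_a = "".join(rng.choices(BASES, weights=[4, 4, 1, 1], k=20000))
    genome_b = "".join(rng.choices(BASES, weights=[1, 1, 4, 4], k=20000))
    data_a = sample_descriptors(genome_a, 10, 1000, rng)
    data_b = sample_descriptors(genome_b, 10, 1000, rng)
    som = train_som(data_a + data_b, 3, 3, 10, rng)
    print("unit for a sample of A:", best_unit(som, data_a[0]))
    print("unit for a sample of B:", best_unit(som, data_b[0]))
```

The two synthetic sequences here only stand in for real genomes. With real data, each species would contribute its own set of sampled descriptors, and the unit returned by `best_unit` for a new sample would indicate which cluster of the map it falls in.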
Pages: 11