Study of transductive learning and unsupervised feature construction methods for biological sequence classification

Cited by: 0
Authors
Stanescu, Ana [1]
Tangirala, Karthik [1]
Caragea, Doina [1]
Affiliation
[1] Kansas State Univ, Comp & Informat Sci, Manhattan, KS 66506 USA
Keywords
SUBCELLULAR-LOCALIZATION; PREDICTION;
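DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract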
Next Generation Sequencing (NGS) technologies have led to fast and inexpensive production of large amounts of biological sequence data, including nucleotide sequences and derived protein sequences. These fast-increasing volumes of data pose challenges to computational methods for annotation. Machine learning approaches, primarily supervised algorithms, have been widely used to assist with classification tasks in bioinformatics. However, supervised algorithms rely on large amounts of labeled data to produce quality predictors, and labeled data is often difficult and expensive to acquire in sufficiently large quantities. When only limited amounts of labeled data but considerably larger amounts of unlabeled data are available for a specific annotation problem, semi-supervised learning approaches represent a cost-effective alternative. In this work, we focus on a special case of semi-supervised learning, namely transductive learning, in which the algorithm has access, during the training phase, to the instances that need to be labeled. Transduction is particularly suitable for biological sequence classification, where the goal is generally to label a given set of unlabeled instances. A challenge that needs to be addressed in this context is the identification of compact sets of informative features. Given the lack of labeled data, standard supervised feature selection methods may yield unreliable features. Therefore, we study recently proposed unsupervised feature construction approaches together with transductive learning. Experimental results on two classification problems, namely cassette exon identification and protein localization, show that the unsupervised features result in better performance than the supervised features.
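The transductive setting described in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the k-mer frequency features, the nearest-centroid classifier, and the function names (`kmer_features`, `transduce`) are all assumptions chosen for illustration. The only point it demonstrates is that the unlabeled target sequences are available during training and that the features are built without using any labels.

```python
from collections import Counter
from itertools import product


def kmer_features(seq, k=2, alphabet="ACGT"):
    """Normalized k-mer frequencies; built without any labels (assumed feature set)."""
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = max(1, len(seq) - k + 1)
    return [counts[m] / total for m in kmers]


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def transduce(labeled, unlabeled, k=2, max_iter=10):
    """Label the given unlabeled sequences (hypothetical helper).

    labeled:   list of (sequence, label) pairs
    unlabeled: list of sequences that must be labeled
    Centroids start from the labeled data only and are then re-estimated
    from labeled plus currently assigned unlabeled instances.
    """
    lab_x = [kmer_features(s, k) for s, _ in labeled]
    lab_y = [y for _, y in labeled]
    unl_x = [kmer_features(s, k) for s in unlabeled]
    classes = sorted(set(lab_y))

    cents = {c: centroid([x for x, y in zip(lab_x, lab_y) if y == c])
             for c in classes}
    assigned = None
    for _ in range(max_iter):
        new = [min(classes, key=lambda c: sq_dist(x, cents[c])) for x in unl_x]
        if new == assigned:  # assignments stable: stop
            break
        assigned = new
        all_x, all_y = lab_x + unl_x, lab_y + assigned
        cents = {c: centroid([x for x, y in zip(all_x, all_y) if y == c])
                 for c in classes}
    return assigned


labeled = [("AAAAAAAA", "A-rich"), ("CCCCCCCC", "C-rich")]
unlabeled = ["AAAACAAA", "CCCCACCC", "AAAAAAAC"]
print(transduce(labeled, unlabeled))
```

With only two labeled toy sequences, the centroids are refined using the unlabeled sequences themselves, which is what distinguishes this setting from purely supervised training. Real sequence data would call for a stronger classifier and feature set than this sketch uses.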
Pages: 999-1006
Number of pages: 8