Multi-instance multi-label distance metric learning for genome-wide protein function prediction

Cited by: 14
Authors
Xu, Yonghui [1]
Min, Huaqing [2]
Song, Hengjie [2]
Wu, Qingyao [2]
Affiliations
[1] South China Univ Technol, Sch Comp Sci & Engn, Guangzhou 510006, Guangdong, Peoples R China
[2] South China Univ Technol, Sch Software Engn, Guangzhou 510006, Guangdong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Protein function prediction; Genome wide; Distance metric learning; Machine learning; Multi-instance multi-label learning
DOI
10.1016/j.compbiolchem.2016.02.011
Chinese Library Classification
Q [Biological sciences]
Discipline classification codes
07; 0710; 09
Abstract
Multi-instance multi-label (MIML) learning has been proven to be effective for the genome-wide protein function prediction problems where each training example is associated with not only multiple instances but also multiple class labels. To find an appropriate MIML learning method for genome-wide protein function prediction, many studies in the literature attempted to optimize objective functions in which dissimilarity between instances is measured using the Euclidean distance. But in many real applications, Euclidean distance may be unable to capture the intrinsic similarity/dissimilarity in feature space and label space. Unlike previous approaches, in this paper, we propose a multi-instance multi-label distance metric learning framework (MIMLDML) for genome-wide protein function prediction. Specifically, we learn a Mahalanobis distance to preserve and utilize the intrinsic geometric information of both feature space and label space for MIML learning. In addition, we try to deal with the sparsely labeled data by giving weight to the labeled data. Extensive experiments on seven real-world organisms covering the biological three-domain system (i.e., archaea, bacteria, and eukaryote; Woese et al., 1990) show that the MIMLDML algorithm is superior to most state-of-the-art MIML learning algorithms. © 2016 Elsevier Ltd. All rights reserved.
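To make the contrast between Euclidean and Mahalanobis distance concrete, here is a minimal sketch in plain Python. It is not the MIMLDML algorithm: the matrix M is fixed by hand rather than learned, and the names `mahalanobis` and `bag_distance`, along with the choice of averaging pairwise instance distances to compare two bags, are assumptions made only for illustration.

```python
import math
from typing import Sequence

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def mahalanobis(x: Vector, y: Vector, M: Matrix) -> float:
    """Return sqrt((x - y)^T M (x - y)).

    M is assumed to be symmetric positive semidefinite. With M equal to
    the identity matrix this reduces to the Euclidean distance.
    """
    diff = [a - b for a, b in zip(x, y)]
    # Matrix-vector product M @ diff
    m_diff = [sum(m_ij * d_j for m_ij, d_j in zip(row, diff)) for row in M]
    squared = sum(d_i * v_i for d_i, v_i in zip(diff, m_diff))
    # Guard against tiny negative values caused by floating-point rounding
    return math.sqrt(max(squared, 0.0))


def bag_distance(bag_a: Sequence[Vector], bag_b: Sequence[Vector], M: Matrix) -> float:
    """Average pairwise instance distance between two bags.

    Hypothetical bag-level distance chosen for illustration; the paper
    may define the distance between bags differently.
    """
    total = sum(mahalanobis(x, y, M) for x in bag_a for y in bag_b)
    return total / (len(bag_a) * len(bag_b))


# Hand-picked metrics (assumptions): identity gives Euclidean distance,
# while `weighted` stretches the first feature by a factor of 2.
identity = [[1.0, 0.0], [0.0, 1.0]]
weighted = [[4.0, 0.0], [0.0, 1.0]]

x, y = [0.0, 0.0], [3.0, 4.0]
euclidean_d = mahalanobis(x, y, identity)
weighted_d = mahalanobis(x, y, weighted)
```

In a metric learning setting, M would be fitted to data so that examples with similar labels end up close together; here it only shows how a non-identity M changes which feature differences count most.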
Pages: 30-40
Number of pages: 11
Related papers
50 in total
  • [1] Genome-Wide Protein Function Prediction through Multi-Instance Multi-Label Learning
    Wu, Jian-Sheng
    Huang, Sheng-Jun
    Zhou, Zhi-Hua
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2014, 11 (05) : 891 - 902
  • [2] Multi-Instance Metric Transfer Learning for Genome-Wide Protein Function Prediction
    Xu, Yonghui
    Min, Huaqing
    Wu, Qingyao
    Song, Hengjie
    Ye, Bicui
    SCIENTIFIC REPORTS, 2017, 7
  • [3] Multi-Instance Metric Transfer Learning for Genome-Wide Protein Function Prediction
    Yonghui Xu
    Huaqing Min
    Qingyao Wu
    Hengjie Song
    Bicui Ye
    Scientific Reports, 7
  • [4] Online Multi-Instance Multi-Label Learning for Protein Function Prediction
    Wu, Feng
    Liu, Qiong
    Hao, Tianyong
    Chen, Xiaojun
    Wu, Qingyao
    2016 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2016, : 780 - 785
  • [5] Learning a Distance Metric from Multi-instance Multi-label Data
    Jin, Rong
    Wang, Shijun
    Zhou, Zhi-Hua
    CVPR: 2009 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOLS 1-4, 2009, : 896 - +
  • [6] A Robust Distance with Correlated Metric Learning for Multi-Instance Multi-Label Data
    Verma, Yashaswi
    Jawahar, C. V.
    MM'16: PROCEEDINGS OF THE 2016 ACM MULTIMEDIA CONFERENCE, 2016, : 441 - 445
  • [7] Multi-instance multi-label learning
    Zhou, Zhi-Hua
    Zhang, Min-Ling
    Huang, Sheng-Jun
    Li, Yu-Feng
    ARTIFICIAL INTELLIGENCE, 2012, 176 (01) : 2291 - 2320
  • [8] Instance Annotation for Multi-Instance Multi-Label Learning
    Briggs, Forrest
    Fern, Xiaoli Z.
    Raich, Raviv
    Lou, Qi
    ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2013, 7 (03)
  • [9] Learnability of multi-instance multi-label learning
    Wang Wei
    Zhou ZhiHua
    CHINESE SCIENCE BULLETIN, 2012, 57 (19) : 2488 - 2491
  • [10] Learnability of multi-instance multi-label learning
    WANG Wei & ZHOU ZhiHua (National Key Laboratory for Novel Software Technology)
    Chinese Science Bulletin, 2012, 57 (19) : 2492 - 2495