Nonlinear dimension reduction and clustering by Minimum Curvilinearity unfold neuropathic pain and tissue embryological classes

Cited by: 53
Authors
Cannistraci, Carlo Vittorio [1, 2, 3, 4, 5, 6]
Ravasi, Timothy [1, 5, 6]
Montevecchi, Franco Maria [3]
Ideker, Trey [5, 6]
Alessio, Massimo [2]
Affiliations
[1] King Abdullah Univ Sci & Technol, Red Sea Integrat Syst Biol Lab, Computat Biosci Res Ctr, Div Chem & Life Sci & Engn, Jeddah, Saudi Arabia
[2] Ist Sci San Raffaele, I-20132 Milan, Italy
[3] Politecn Torino, Dept Mech, I-10129 Turin, Italy
[4] Politecn Torino, CMP Grp, Microsoft Res, I-10129 Turin, Italy
[5] Univ Calif San Diego, Dept Bioengn, La Jolla, CA 92093 USA
[6] Univ Calif San Diego, Dept Med, La Jolla, CA 92093 USA
Keywords
CLASSIFICATION; MECHANISMS; EXPRESSION; DISEASE
DOI
10.1093/bioinformatics/btq376
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Motivation: Nonlinear small datasets, which are characterized by low numbers of samples and very high numbers of measures, occur frequently in computational biology, and pose problems in their investigation. Unsupervised hybrid-two-phase (H2P) procedures, specifically dimension reduction (DR) coupled with clustering, provide valuable assistance, not only for unsupervised data classification, but also for visualization of the patterns hidden in high-dimensional feature space.
Methods: 'Minimum Curvilinearity' (MC) is a principle that, for small datasets, suggests the approximation of curvilinear sample distances in the feature space by pair-wise distances over their minimum spanning tree (MST), and thus avoids the introduction of any tuning parameter. MC is used to design two novel forms of nonlinear machine learning (NML): Minimum Curvilinear embedding (MCE) for DR, and Minimum Curvilinear affinity propagation (MCAP) for clustering.
Results: Compared with several other unsupervised and supervised algorithms, MCE and MCAP, whether individually or combined in H2P, overcome the limits of classical approaches. High performance was attained in the visualization and classification of: (i) pain patients (proteomic measurements) in peripheral neuropathy; (ii) human organ tissues (genomic transcription factor measurements) on the basis of their embryological origin.
Conclusion: MC provides a valuable framework to estimate nonlinear distances in small datasets. Its extension to large datasets is prefigured for novel NMLs. Classification of neuropathic pain by proteomic profiles offers new insights for future molecular and systems biology characterization of pain. Improvements in tissue embryological classification refine results obtained in an earlier study, and suggest a possible reinterpretation of skin attribution as mesodermal.
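The MC principle described under Methods (approximating curvilinear sample distances by pair-wise distances over the MST) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `minimum_spanning_tree` and `mc_distances` are hypothetical, and the use of Euclidean distance as the base metric is an assumption, since the abstract does not state which metric is used. The sketch builds the MST with Prim's algorithm and then sums edge weights along the unique tree path between every pair of samples.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors (assumed base metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete graph of samples.

    Returns an adjacency list {node: [(neighbor, weight), ...]} of the MST.
    """
    n = len(points)
    adj = {i: [] for i in range(n)}
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known edge connecting each node to the tree
    parent = [-1] * n
    best[0] = 0.0
    for _ in range(n):
        # Pick the cheapest node that is not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            adj[u].append((parent[u], best[u]))
            adj[parent[u]].append((u, best[u]))
        # Relax the connection cost of the remaining nodes.
        for v in range(n):
            if not in_tree[v]:
                d = euclidean(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return adj


def mc_distances(points):
    """Pair-wise distances measured along the MST (no tuning parameter)."""
    n = len(points)
    adj = minimum_spanning_tree(points)
    dist = [[0.0] * n for _ in range(n)]
    for source in range(n):
        # Depth-first walk; in a tree the path to each node is unique.
        stack = [(source, -1, 0.0)]
        while stack:
            node, prev, d = stack.pop()
            dist[source][node] = d
            for nxt, w in adj[node]:
                if nxt != prev:
                    stack.append((nxt, node, d + w))
    return dist


# Four samples lying along a bent path in a 2-D feature space.
samples = [(0, 0), (3, 0), (3, 4), (0, 5)]
D = mc_distances(samples)
```

In this toy example the straight-line distance between the first and last samples is 5, while the distance over the MST follows the bend through the two intermediate samples and is therefore larger. A matrix such as `D` could then be passed to an embedding or clustering step, which is the role the abstract assigns to MCE and MCAP.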
Pages: i531-i539
Number of pages: 9
Related papers
4 in total
  • [1] NONLINEAR DIMENSION REDUCTION FOR FUNCTIONAL DATA WITH APPLICATION TO CLUSTERING
    Tan, Ruoxu
    Zang, Yiming
    Yin, Guosheng
    [J]. STATISTICA SINICA, 2024, 34 (03) : 1391 - 1412
  • [2] Visual clustering of complex network based on nonlinear dimension reduction
    Li, Jianyu
    Yang, Shuzhong
    [J]. INTELLIGENT INFORMATION PROCESSING III, 2006, 228 : 555 - +
  • [3] Nonlinear dimensionality reduction of gene expression data for visualization and clustering analysis of cancer tissue samples
    Shi, Jinlong
    Luo, Zhigang
    [J]. COMPUTERS IN BIOLOGY AND MEDICINE, 2010, 40 (08) : 723 - 732
  • [4] PCA-SIR: A New Nonlinear Supervised Dimension Reduction Method with Application to Pain Prediction from EEG
    Tu, Yiheng
    Hung, Yeung Sam
    Hu, Li
    Zhang, Zhiguo
    [J]. 2015 7TH INTERNATIONAL IEEE/EMBS CONFERENCE ON NEURAL ENGINEERING (NER), 2015, : 1004 - 1007