TREELETS - AN ADAPTIVE MULTI-SCALE BASIS FOR SPARSE UNORDERED DATA

Cited by: 82
Authors
Lee, Ann B. [1 ]
Nadler, Boaz [2 ]
Wasserman, Larry [1 ]
Affiliations
[1] Carnegie Mellon Univ, Dept Stat, Pittsburgh, PA 15213 USA
[2] Weizmann Inst Sci, Dept Comp Sci & Appl Math, IL-76100 Rehovot, Israel
Source
ANNALS OF APPLIED STATISTICS | 2008 / Vol. 2 / No. 02
Keywords
Feature selection; dimensionality reduction; multi-resolution analysis; local best basis; sparsity; principal component analysis; hierarchical clustering; small sample sizes;
DOI
10.1214/07-AOAS137
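Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification code
020208 ; 070103 ; 0714 ;
Abstract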
In many modern applications, including analysis of gene expression and text documents, the data are noisy, high-dimensional, and unordered, with no particular meaning to the given order of the variables. Yet, successful learning is often possible due to sparsity: the fact that the data are typically redundant with underlying structures that can be represented by only a few features. In this paper we present treelets, a novel construction of multi-scale bases that extends wavelets to nonsmooth signals. The method is fully adaptive, as it returns a hierarchical tree and an orthonormal basis which both reflect the internal structure of the data. Treelets are especially well-suited as a dimensionality reduction and feature selection tool prior to regression and classification, in situations where sample sizes are small and the data are sparse with unknown groupings of correlated or collinear variables. The method is also simple to implement and analyze theoretically. Here we describe a variety of situations where treelets perform better than principal component analysis, as well as some common variable selection and cluster averaging schemes. We illustrate treelets on a blocked covariance model and on several data sets (hyperspectral image data, DNA microarray data, and internet advertisements) with highly complex dependencies between variables.
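The abstract states that the method returns a hierarchical tree and an orthonormal basis, but this record does not spell out the construction. The following is a minimal sketch of how such a procedure is commonly described: at each level, the two most correlated active variables are merged by a local PCA (a 2x2 Jacobi rotation), the higher-variance component stays active as a "sum" variable, and the other is set aside as a "difference" variable. The function name `treelet_sketch`, its arguments, the use of absolute correlation as the similarity measure, and the angle convention are illustrative assumptions, not taken from the paper.

```python
import math


def treelet_sketch(cov, levels):
    """Hypothetical illustration: greedy pairwise local PCA on a covariance matrix.

    cov    : symmetric p x p covariance matrix as a list of lists
    levels : number of merges to perform (capped at p - 1)
    Returns (basis, sigma, merges): the orthonormal basis (columns), the
    rotated covariance, and the list of merged index pairs (the tree).
    """
    p = len(cov)
    sigma = [row[:] for row in cov]  # work on a copy
    basis = [[1.0 if i == j else 0.0 for j in range(p)] for i in range(p)]
    active = list(range(p))  # indices still eligible for merging
    merges = []

    for _ in range(min(levels, p - 1)):
        # 1. Find the most correlated pair among the active variables.
        best, pair = -1.0, None
        for i, a in enumerate(active):
            for b in active[i + 1:]:
                denom = math.sqrt(sigma[a][a] * sigma[b][b])
                rho = abs(sigma[a][b]) / denom if denom > 0 else 0.0
                if rho > best:
                    best, pair = rho, (a, b)
        a, b = pair

        # 2. Rotation angle that zeroes sigma[a][b]. With atan2, the rotated
        #    variable a receives the larger variance (assumed convention).
        theta = 0.5 * math.atan2(2.0 * sigma[a][b], sigma[a][a] - sigma[b][b])
        c, s = math.cos(theta), math.sin(theta)

        # 3. Apply the rotation: columns of sigma and basis, then rows of sigma.
        for m in (sigma, basis):
            for k in range(p):
                ka, kb = m[k][a], m[k][b]
                m[k][a] = c * ka + s * kb
                m[k][b] = -s * ka + c * kb
        for k in range(p):
            ak, bk = sigma[a][k], sigma[b][k]
            sigma[a][k] = c * ak + s * bk
            sigma[b][k] = -s * ak + c * bk

        # 4. Keep a as the sum variable; retire b as a difference variable.
        active.remove(b)
        merges.append((a, b))

    return basis, sigma, merges


# Usage: two strongly correlated variables and one independent variable.
example_cov = [[1.0, 0.9, 0.0],
               [0.9, 1.0, 0.0],
               [0.0, 0.0, 1.0]]
basis, sigma, merges = treelet_sketch(example_cov, levels=1)
```

In this toy example the first merge pairs variables 0 and 1, and the rotation decorrelates them. Because each step is an orthogonal rotation, the basis stays orthonormal at every level. The sketch does not enforce any restriction on the rotation angle that the paper may impose.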
Pages: 435 - 471
Number of pages: 37
Related papers
50 in total
  • [1] DISCUSSION OF: TREELETS - AN ADAPTIVE MULTI-SCALE BASIS FOR SPARSE UNORDERED DATA
    Tibshirani, Robert
    [J]. ANNALS OF APPLIED STATISTICS, 2008, 2 (02): 482 - 483
  • [2] REJOINDER OF: TREELETS - AN ADAPTIVE MULTI-SCALE BASIS FOR SPARSE UNORDERED DATA
    Lee, Ann B.
    Nadler, Boaz
    Wasserman, Larry
    [J]. ANNALS OF APPLIED STATISTICS, 2008, 2 (02): 494 - 500
  • [3] Multi-scale Sparse Domination
    Beltran, David
    Roos, Joris
    Seeger, Andreas
    [J]. MEMOIRS OF THE AMERICAN MATHEMATICAL SOCIETY, 2024, 298 (1491)
  • [4] Sparse PCA - Extracting multi-scale structure from data
    Chennubhotla, C
    Jepson, A
    [J]. EIGHTH IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION, VOL I, PROCEEDINGS, 2001: 641 - 647
  • [5] Data adaptive multi-scale representations for image analysis
    Dobrosotskaya, Julia
    Guo, Weihong
    [J]. WAVELETS AND SPARSITY XVIII, 2019, 11138
  • [6] MULTI-SCALE KERNEL BASIS AND ITERATIVE ORTHOGONAL MATCHING PURSUIT FOR SPARSE APPROXIMATION
    Xie, Zhi-Peng
    Chen, Song-Can
    Wu, Yang-Yang
    Chen, Duan-Sheng
    [J]. PROCEEDINGS OF 2009 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-6, 2009, : 1765 - +
  • [7] Adaptive Enhancement of Robot Vision Image on the basis of Multi-Scale Filter
    Liu, Xin
    Zhang, Bin
    [J]. Engineering Intelligent Systems, 2023, 31 (04): 255 - 263
  • [8] Dim target detection method based on multi-scale adaptive sparse dictionary
    [J]. Wang, Huigai, 1600, Chinese Society of Astronautics (43):
  • [9] Multi-scale Adaptive Dehazing Network
    Chen, Shuxin
    Chen, Yizi
    Qu, Yanyun
    Huang, Jingying
    Hong, Ming
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2019), 2019: 2051 - 2059
  • [10] Data-Driven Depth Map Refinement via Multi-scale Sparse Representation
    Kwon, HyeokHyen
    Tai, Yu-Wing
    Lin, Stephen
    [J]. 2015 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2015: 159 - 167