Dimensionality Reduction on Heterogeneous Feature Space

Cited by: 12
Authors
Shi, Xiaoxiao [1 ]
Yu, Philip [1 ]
Affiliations
[1] Univ Illinois, Dept Comp Sci, Chicago, IL 60680 USA
Keywords
component; formatting; style; styling;
DOI
10.1109/ICDM.2012.30
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline codes
081104; 0812; 0835; 1405;
Abstract
Combining correlated data sources may help improve the learning performance of a given task. For example, in recommendation problems, one can combine (1) a user profile database (e.g., gender, age, etc.), (2) users' log data (e.g., clickthrough data, purchasing records, etc.), and (3) users' social network (useful in social targeting) to build a recommendation model. All these data sources provide informative but heterogeneous features. For instance, the user profile database usually has nominal features reflecting users' backgrounds, the log data provides term-based features describing users' historical behaviors, and the social network database has graph relational features. Given multiple heterogeneous data sources, one important challenge is to find a unified feature subspace that captures the knowledge from all sources. To this end, we propose the principle of collective component analysis (CoCA), which handles dimensionality reduction across a mixture of vector-based features and graph relational features. The CoCA principle is to find a feature subspace with maximal variance under two constraints. First, there should be consensus among the projections from different feature spaces. Second, the similarity between connected data (in any of the network databases) should be maximized. The optimal solution is obtained by solving an eigenvalue problem. Moreover, we discuss how to use prior knowledge to distinguish informative data sources and weight them optimally in CoCA. Since no previous model can be directly applied to this problem, we devised a straightforward comparison method that performs dimensionality reduction on the concatenation of the data sources. Three sets of experiments show that CoCA substantially outperforms the comparison method.
Pages: 635-644
Page count: 10
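The sketch below is a minimal, illustrative Python rendering of the two ingredients named in the abstract: variance maximization of the projected vector features and smoothness over a relational graph (connected samples stay close), collapsed into a single shared projection obtained from one symmetric eigenvalue problem. It is not the paper's CoCA formulation, which learns per-source projections tied by a consensus constraint and optimally weights the sources; the function name, the trade-off weight lam, and the toy data are assumptions made for illustration only.

# Minimal sketch (NOT the authors' exact CoCA method): a single-projection
# analogue combining (a) maximal variance of the vector-based features and
# (b) graph smoothness over a relational data source.
import numpy as np

def heterogeneous_projection(X, A, lam=0.5, k=2):
    """X: (n, d) concatenated vector-based features (rows = samples).
    A: (n, n) symmetric adjacency matrix of the relational source.
    lam: assumed trade-off weight for the graph-smoothness term.
    k: dimensionality of the learned subspace.
    Maximizes tr(W^T C W) - lam * tr(W^T X^T L X W) with W^T W = I,
    i.e., returns the top-k eigenvectors of C - lam * X^T L X.
    """
    Xc = X - X.mean(axis=0)                    # center the features
    C = Xc.T @ Xc / (len(X) - 1)               # covariance (variance term)
    L = np.diag(A.sum(axis=1)) - A             # unnormalized graph Laplacian
    M = C - lam * (Xc.T @ L @ Xc) / len(X)     # combined symmetric objective
    eigvals, eigvecs = np.linalg.eigh(M)       # eigenvalues in ascending order
    return eigvecs[:, -k:][:, ::-1]            # top-k eigenvectors as projection W

# Toy usage: 6 samples, 4-dimensional features, a small friendship graph.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))
A = np.zeros((6, 6))
A[0, 1] = A[1, 0] = A[2, 3] = A[3, 2] = 1.0
Z = X @ heterogeneous_projection(X, A, lam=0.3, k=2)   # projected data, shape (6, 2)

The design choice mirrors the abstract's description of the solution: because both terms are quadratic in the projection, the combined objective stays a symmetric eigenvalue problem, so standard eigensolvers apply directly.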