Dimensionality Reduction on Heterogeneous Feature Space

Cited by: 12
Authors
Shi, Xiaoxiao [1 ]
Yu, Philip [1 ]
Affiliations
[1] Univ Illinois, Dept Comp Sci, Chicago, IL 60680 USA
Keywords
component; formatting; style; styling;
DOI
10.1109/ICDM.2012.30
CLC number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Combining correlated data sources may help improve the learning performance of a given task. For example, in recommendation problems, one can combine (1) a user profile database (e.g., gender, age), (2) users' log data (e.g., clickthrough data, purchasing records), and (3) users' social network (useful in social targeting) to build a recommendation model. All of these data sources provide informative but heterogeneous features. For instance, the user profile database usually contains nominal features reflecting users' backgrounds, the log data provides term-based features about users' historical behaviors, and the social network database has graph relational features. Given multiple heterogeneous data sources, one important challenge is to find a unified feature subspace that captures the knowledge from all sources. To this end, we propose the principle of collective component analysis (CoCA), which handles dimensionality reduction across a mixture of vector-based features and graph relational features. The CoCA principle is to find a feature subspace with maximal variance under two constraints: first, the projections from different feature spaces should be in consensus; second, the similarity between connected data points (in any of the network databases) should be maximized. The optimal solution is obtained by solving an eigenvalue problem. Moreover, we discuss how to use prior knowledge to distinguish informative data sources and weight them optimally in CoCA. Since no previous model can be directly applied to this problem, we devised a straightforward comparison method that performs dimensionality reduction on the concatenation of the data sources. Three sets of experiments show that CoCA substantially outperforms this comparison method.
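The abstract describes the core computation: maximize the variance of the projected data while keeping graph-connected points close in the projection, with the optimum found via an eigenvalue problem. A minimal single-view sketch of that idea (not the authors' implementation; the function name `coca_sketch`, the trade-off weight `beta`, and the reduction to one vector-based view plus one graph are assumptions for illustration) might look like:

```python
import numpy as np

def coca_sketch(X, A, k=2, beta=1.0):
    """Sketch of a CoCA-style objective (illustrative, not the paper's code).

    X    : (n, d) vector-based features for n samples
    A    : (n, n) symmetric adjacency matrix of a relation graph
    k    : target subspace dimension
    beta : assumed weight trading variance against the graph penalty

    Returns a (d, k) projection maximizing
        trace(W^T C W) - beta * trace(W^T X^T L X W),
    i.e., projected variance minus a penalty on connected pairs
    that land far apart, solved as a symmetric eigenproblem.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)            # center the features
    C = Xc.T @ Xc / n                  # covariance (variance term)
    D = np.diag(A.sum(axis=1))
    L = D - A                          # graph Laplacian
    M = C - beta * (Xc.T @ L @ Xc) / n # variance minus edge-disagreement penalty
    M = (M + M.T) / 2                  # symmetrize against round-off
    w, V = np.linalg.eigh(M)
    return V[:, np.argsort(w)[::-1][:k]]  # top-k eigenvectors
```

The returned columns are orthonormal, so the projection `X @ W` can be used directly as a k-dimensional embedding; the consensus constraint across multiple feature spaces described in the abstract would add further coupling terms to `M`.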
Pages: 635-644
Page count: 10
Related papers
50 items in total
  • [1] FEATURE SPACE DIMENSIONALITY REDUCTION FOR THE OPTIMIZATION OF VISUALIZATION METHODS
    Griparis, Andreea
    Faur, Daniela
    Datcu, Mihai
    [J]. 2015 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS), 2015, : 1120 - 1123
  • [2] Robust dimensionality reduction via feature space to feature space distance metric learning
    Li, Bo
    Fan, Zhang-Tao
    Zhang, Xiao-Long
    Huang, De-Shuang
    [J]. NEURAL NETWORKS, 2019, 112 : 1 - 14
  • [3] Analysis of approaches to feature space partitioning for nonlinear dimensionality reduction
    Myasnikov E.V.
    [J]. Izdatel'stvo Nauka, (26): 474 - 482
  • [4] Semi-Supervised Dimensionality Reduction in Image Feature Space
    Cheng, Hao
    Hua, Kien A.
    Vu, Khanh
    Liu, Danzhou
    [J]. APPLIED COMPUTING 2008, VOLS 1-3, 2008, : 1207 - 1211
  • [5] Feature selection for dimensionality reduction
    Mladenic, Dunja
    [J]. SUBSPACE, LATENT STRUCTURE AND FEATURE SELECTION, 2006, 3940 : 84 - 102
  • [6] Feature dimensionality reduction: a review
    Jia, Weikuan
    Sun, Meili
    Lian, Jian
    Hou, Sujuan
    [J]. COMPLEX & INTELLIGENT SYSTEMS, 2022, 8 (03) : 2663 - 2693
  • [8] THE CHOICE OF A METHOD FOR FEATURE SPACE DECOMPOSITION FOR NON-LINEAR DIMENSIONALITY REDUCTION
    Myasnikov, E. V.
    [J]. COMPUTER OPTICS, 2014, 38 (04) : 790 - 797
  • [9] Transfer Learning for Feature Dimensionality Reduction
    Thribhuvan, Nikhila
    Elayidom, Sudheep
    [J]. INTERNATIONAL ARAB JOURNAL OF INFORMATION TECHNOLOGY, 2022, 19 (05) : 721 - 727
  • [10] A Feature Fusion Technique for Dimensionality Reduction
    Myasnikov, E.
    [J]. PATTERN RECOGNITION AND IMAGE ANALYSIS, 2022, 32 (03) : 607 - 610