An Unsupervised Feature Selection Framework for Social Media Data

Cited by: 48
Authors
Tang, Jiliang [1]
Liu, Huan [1]
Affiliations
[1] Arizona State Univ, Dept Comp Sci, Tempe, AZ 85281 USA
Funding
National Science Foundation (USA)
Keywords
Unsupervised feature selection; linked data; social media; pseudo labels; social dimension regularization
DOI
10.1109/TKDE.2014.2320728
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The explosive use of social media produces massive amounts of unlabeled, high-dimensional data. Feature selection has proven effective in handling high-dimensional data for efficient learning and data mining. Unsupervised feature selection remains challenging because it lacks the label information on which feature relevance is usually assessed. The unique characteristics of social media data complicate the problem further: for example, social media data is inherently linked, which invalidates the independent and identically distributed (i.i.d.) assumption and poses new challenges for unsupervised feature selection algorithms. In this paper, we investigate a novel problem of feature selection for social media data in an unsupervised scenario. In particular, we analyze the differences between social media data and traditional attribute-value data, investigate how the relations extracted from linked data can be exploited to help select relevant features, and propose a novel unsupervised feature selection framework, LUFS, for linked social media data. We design and conduct systematic experiments to evaluate the proposed framework on data sets from real-world social media websites. The empirical study demonstrates the effectiveness and potential of the proposed framework.
Pages: 2914-2927
Page count: 14