An Unsupervised Feature Selection Framework for Social Media Data

Cited by: 48
Authors
Tang, Jiliang [1 ]
Liu, Huan [1 ]
Affiliations
[1] Arizona State Univ, Dept Comp Sci, Tempe, AZ 85281 USA
Funding
National Science Foundation (USA);
Keywords
Unsupervised feature selection; linked data; social media; pseudo labels; social dimension regularization;
DOI
10.1109/TKDE.2014.2320728
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The explosive usage of social media produces massive amounts of unlabeled and high-dimensional data. Feature selection has been proven effective in dealing with high-dimensional data for efficient learning and data mining. Unsupervised feature selection remains a challenging task due to the absence of label information, on which feature relevance is often assessed. The unique characteristics of social media data further complicate the already challenging problem of unsupervised feature selection; e.g., social media data is inherently linked, which invalidates the independent and identically distributed assumption and brings new challenges to unsupervised feature selection algorithms. In this paper, we investigate a novel problem of feature selection for social media data in an unsupervised scenario. In particular, we analyze the differences between social media data and traditional attribute-value data, investigate how the relations extracted from linked data can be exploited to help select relevant features, and propose a novel unsupervised feature selection framework, LUFS, for linked social media data. We systematically design and conduct experiments to evaluate the proposed framework on data sets from real-world social media websites. The empirical study demonstrates the effectiveness and potential of our proposed framework.
Pages: 2914 - 2927
Number of pages: 14
Related papers
50 in total
  • [1] Feature Selection for Social Media Data
    Tang, Jiliang
    Liu, Huan
    ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2014, 8 (04)
  • [2] An efficient framework for unsupervised feature selection
    Zhang, Han
    Zhang, Rui
    Nie, Feiping
    Li, Xuelong
    NEUROCOMPUTING, 2019, 366 : 194 - 207
  • [3] Unsupervised Spectral Sparse Regression Feature Selection using Social Media Datasets
    Krishna, R. Sathya Bama
    Aramudhan, M.
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON INFORMATICS AND ANALYTICS (ICIA' 16), 2016,
  • [4] Unsupervised group feature selection for media classification
    Zaharieva M.
    Breiteneder C.
    Hudec M.
    International Journal of Multimedia Information Retrieval, 2017, 6 (3) : 233 - 249
  • [5] Unsupervised feature selection for text data
    Wiratunga, Nirmalie
    Lothian, Rob
    Massie, Stewart
    ADVANCES IN CASE-BASED REASONING, PROCEEDINGS, 2006, 4106 : 340 - 354
  • [6] Unsupervised Feature Selection for Linked Data
    Nemade, Rachana T.
    Makhijani, Richa
    2014 RECENT ADVANCES AND INNOVATIONS IN ENGINEERING (ICRAIE), 2014,
  • [7] Unsupervised Feature Selection for Noisy Data
    Mahdavi, Kaveh
    Labarta, Jesus
    Gimenez, Judit
    ADVANCED DATA MINING AND APPLICATIONS, ADMA 2019, 2019, 11888 : 79 - 94
  • [8] Unsupervised Feature Selection in Signed Social Networks
    Cheng, Kewei
    Li, Jundong
    Liu, Huan
    KDD'17: PROCEEDINGS OF THE 23RD ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2017, : 777 - 786
  • [9] Seeding and Harvest: A Framework for Unsupervised Feature Selection Problems
    Chen, Gang
    Cai, Yuanli
    Shi, Juan
    SENSORS, 2013, 13 (01) : 292 - 333
  • [10] Unsupervised Feature Selection Algorithm for Dynamic Network Media Data Based on User Correlation
    Ren Y.-G.
    Wang Y.-L.
    Liu Y.
    Zhang J.
    Jisuanji Xuebao/Chinese Journal of Computers, 2018, 41 (07) : 1517 - 1535