GMDH-based semi-supervised feature selection for customer classification

Cited by: 44
Authors
Xiao, Jin [1, 2]
Cao, Hanwen [1]
Jiang, Xiaoyi [3]
Gu, Xin [1, 2]
Xie, Ling [1, 2]
Affiliations
[1] Sichuan Univ, Business Sch, Chengdu 610064, Sichuan, Peoples R China
[2] Sichuan Univ, Soft Sci Inst, Chengdu 610064, Sichuan, Peoples R China
[3] Univ Munster, Dept Math & Comp Sci, Einsteinstr 62, D-48149 Munster, Germany
Funding
National Natural Science Foundation of China
Keywords
Feature selection; Group method of data handling (GMDH); Customer classification; Semi-supervised learning; CHURN PREDICTION; OBJECT DETECTION; NEURAL-NETWORKS; ALGORITHMS; CONSTRAINT; RELEVANCE; SYSTEM;
DOI
10.1016/j.knosys.2017.06.018
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Data dimension reduction is an important step in customer classification modeling, and feature selection has been a research focus in this field. This study introduces the group method of data handling (GMDH), proposes a GMDH-based semi-supervised feature selection (GMDH-SSFS) algorithm, and applies it to customer feature selection. The algorithm can simultaneously use a small set of samples with class labels, L, and a large set of samples without class labels, U. Moreover, during feature selection it considers both the relationship between features and class labels and the relationships among features. The GMDH-SSFS model consists of three main stages: 1) train N basic classification models on the labeled dataset L; 2) selectively label samples in the unlabeled dataset U and add them to L; 3) train the GMDH neural network on the new training set L and select the optimal feature subset Fs. An empirical analysis of four customer classification datasets suggests that the features selected by the GMDH-SSFS model have good explainability. In addition, the classification model trained on the selected feature subset outperforms models trained on features chosen by the commonly used Laplacian score (an unsupervised feature selection algorithm), Fisher score (a supervised feature selection algorithm), and FW-SemiFS and S3VM-FS (two semi-supervised feature selection algorithms). (C) 2017 Elsevier B.V. All rights reserved.
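The three stages named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the base classifiers, the rule for selectively labeling U, or the GMDH network structure. Everything below is therefore an assumption made for illustration: a nearest-centroid base learner with binary labels 0/1, a unanimous-vote labeling rule, and a single GMDH-style layer of pairwise bilinear models ranked by held-out squared error (the usual GMDH external criterion idea). All function names are hypothetical.

```python
import numpy as np
from itertools import combinations


def fit_centroid(X, y):
    # Assumed base learner: one centroid per class (binary labels 0/1).
    return np.stack([X[y == c].mean(axis=0) for c in (0, 1)])


def predict_centroid(centroids, X):
    # Assign each row to the class of the nearest centroid.
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)


def train_base_models(X_l, y_l, n_models, rng):
    # Stage 1: train N base models on class-stratified bootstrap samples of L.
    models = []
    for _ in range(n_models):
        idx = np.concatenate([
            rng.choice(np.flatnonzero(y_l == c), size=(y_l == c).sum(), replace=True)
            for c in (0, 1)
        ])
        models.append(fit_centroid(X_l[idx], y_l[idx]))
    return models


def pseudo_label(models, X_u):
    # Stage 2 (assumed rule): keep only samples on which all models agree.
    votes = np.stack([predict_centroid(m, X_u) for m in models])
    agree = (votes == votes[0]).all(axis=0)
    return agree, votes[0]


def _design(X, rows, i, j):
    # Bilinear partial model: 1, x_i, x_j, x_i * x_j.
    xi, xj = X[rows, i], X[rows, j]
    return np.column_stack([np.ones(len(rows)), xi, xj, xi * xj])


def gmdh_select(X, y, rng, keep=1):
    # Stage 3 (simplified): fit every pairwise model on one half of the data,
    # score it on the other half, and keep the features of the best pairs.
    perm = rng.permutation(len(y))
    fit_rows, check_rows = perm[: len(y) // 2], perm[len(y) // 2:]
    scores = []
    for i, j in combinations(range(X.shape[1]), 2):
        coef, *_ = np.linalg.lstsq(_design(X, fit_rows, i, j), y[fit_rows], rcond=None)
        err = np.mean((_design(X, check_rows, i, j) @ coef - y[check_rows]) ** 2)
        scores.append((err, i, j))
    scores.sort()
    selected = set()
    for _, i, j in scores[:keep]:
        selected.update((i, j))
    return sorted(selected)


def gmdh_ssfs_sketch(X_l, y_l, X_u, n_models=5, keep=1, seed=0):
    # Chain the three stages; returns selected feature indices and the
    # number of unlabeled samples that were added to L.
    rng = np.random.default_rng(seed)
    models = train_base_models(X_l, y_l, n_models, rng)
    agree, labels = pseudo_label(models, X_u)
    X_new = np.vstack([X_l, X_u[agree]])
    y_new = np.concatenate([y_l, labels[agree]]).astype(float)
    return gmdh_select(X_new, y_new, rng, keep), int(agree.sum())
```

Calling `gmdh_ssfs_sketch(X_l, y_l, X_u)` with a small labeled array pair and a larger unlabeled array returns the indices of the retained features. A full GMDH network would stack several such layers and stop when the external criterion no longer improves; the single layer here only shows the selection principle.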
Pages: 236-248
Number of pages: 13