GMDH-based semi-supervised feature selection for customer classification

Cited by: 44
Authors
Xiao, Jin [1 ,2 ]
Cao, Hanwen [1 ]
Jiang, Xiaoyi [3 ]
Gu, Xin [1 ,2 ]
Xie, Ling [1 ,2 ]
Affiliations
[1] Sichuan Univ, Business Sch, Chengdu 610064, Sichuan, Peoples R China
[2] Sichuan Univ, Soft Sci Inst, Chengdu 610064, Sichuan, Peoples R China
[3] Univ Munster, Dept Math & Comp Sci, Einsteinstr 62, D-48149 Munster, Germany
Funding
National Natural Science Foundation of China;
Keywords
Feature selection; Group method of data handling (GMDH); Customer classification; Semi-supervised learning; CHURN PREDICTION; OBJECT DETECTION; NEURAL-NETWORKS; ALGORITHMS; CONSTRAINT; RELEVANCE; SYSTEM;
DOI
10.1016/j.knosys.2017.06.018
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Data dimension reduction is an important step in customer classification modeling, and feature selection has been a research focus in this field. This study introduces the group method of data handling (GMDH), proposes a GMDH-based semi-supervised feature selection (GMDH-SSFS) algorithm, and applies it to customer feature selection. The algorithm can simultaneously use a small set of samples with class labels, L, and a large set of samples without class labels, U. Moreover, during feature selection it considers both the relationship between features and class labels and the relationships among the features themselves. The GMDH-SSFS model consists of three main stages: 1) train N basic classification models on the labeled dataset L; 2) selectively label samples from the unlabeled dataset U and add them to L; 3) train the GMDH neural network on the new training set L and select the optimal feature subset Fs. An empirical analysis of four customer classification datasets suggests that the features selected by the GMDH-SSFS model have good explainability. Moreover, the classification model trained on the selected feature subset outperforms models trained on features chosen by the commonly used Laplacian score (an unsupervised feature selection algorithm), Fisher score (a supervised feature selection algorithm), and FW-SemiFS and S3VM-FS (two semi-supervised feature selection algorithms). © 2017 Elsevier B.V. All rights reserved.
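The three stages named in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: the abstract does not specify the base classifiers, the rule for selective labeling, or the GMDH network configuration, so everything below is an assumption made for illustration. The sketch uses nearest-centroid classifiers trained on bootstrap resamples as the N base models, unanimous agreement as the labeling rule, and a single GMDH-style layer of pairwise quadratic polynomials scored on a held-out half of the data. All function names and the synthetic data are hypothetical.

```python
import numpy as np
from itertools import combinations

def fit_centroids(X, y):
    """Nearest-centroid classifier (assumed base model): one mean per class."""
    return np.stack([X[y == c].mean(axis=0) for c in (0, 1)])

def predict_centroids(centroids, X):
    # Distance from every sample to each class centroid; pick the nearer one.
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return d.argmin(axis=1)

def stage1_train(X_l, y_l, n_models, rng):
    """Stage 1: train N base models on bootstrap resamples of L."""
    models = []
    while len(models) < n_models:
        idx = rng.integers(0, len(X_l), len(X_l))
        if len(np.unique(y_l[idx])) == 2:  # a resample must contain both classes
            models.append(fit_centroids(X_l[idx], y_l[idx]))
    return models

def stage2_label(models, X_l, y_l, X_u):
    """Stage 2: label only the samples of U on which all models agree."""
    votes = np.stack([predict_centroids(m, X_u) for m in models])
    agree = (votes == votes[0]).all(axis=0)
    X_new = np.vstack([X_l, X_u[agree]])
    y_new = np.concatenate([y_l, votes[0][agree]])
    return X_new, y_new

def quad_terms(a, b):
    # Terms of the quadratic two-input polynomial commonly used in GMDH.
    return np.column_stack([np.ones_like(a), a, b, a * b, a**2, b**2])

def stage3_select(X, y, rng, keep=1):
    """Stage 3 (one simplified GMDH layer): fit each feature pair on one half
    of the data, score it on the other half, keep features of the best pairs."""
    perm = rng.permutation(len(X))
    half = len(X) // 2
    tr, va = perm[:half], perm[half:]
    scored = []
    for i, j in combinations(range(X.shape[1]), 2):
        A = quad_terms(X[tr, i], X[tr, j])
        coef, *_ = np.linalg.lstsq(A, y[tr].astype(float), rcond=None)
        err = np.mean((quad_terms(X[va, i], X[va, j]) @ coef - y[va]) ** 2)
        scored.append((err, i, j))
    scored.sort()
    return sorted({f for _, i, j in scored[:keep] for f in (i, j)})

# Synthetic data (hypothetical): features 0 and 1 carry the class signal.
rng = np.random.default_rng(0)
n = 400
y = rng.integers(0, 2, n)
X = rng.normal(size=(n, 6))
X[:, 0] += 3.0 * y
X[:, 1] -= 3.0 * y
X_l, y_l, X_u = X[:40], y[:40], X[40:]  # few labeled, many unlabeled

models = stage1_train(X_l, y_l, n_models=5, rng=rng)
X_new, y_new = stage2_label(models, X_l, y_l, X_u)
selected = stage3_select(X_new, y_new, rng)
print("selected feature indices:", selected)
```

A full GMDH network would stack several such layers and stop when the external (validation) criterion no longer improves; the single layer here is only meant to show how an external criterion can drive the choice of the feature subset Fs.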
Pages: 236-248
Number of pages: 13