Knowledge discovery in sociological databases: An application on general society survey dataset

Cited by: 1
Authors
Pan Z. [1]
Li J. [2]
Chen Y. [1]
Pacheco J. [3]
Dai L. [4]
Zhang J. [4]
Affiliations
[1] Institute of Computing Technology, Chinese Academy of Sciences, Beijing
[2] High School Affiliated to Renmin University of China, Beijing
[3] Universidad de Sonora, Hermosillo
[4] Information Centre of China Disabled Persons' Federation, Beijing
Keywords
Crowdsourced big data and analytics; Data management; Data mining; Knowledge discovery
DOI
10.1108/IJCS-09-2019-0023
Abstract
Purpose: The General Society Survey (GSS) is a government-funded survey that examines the socio-economic status, quality of life and structure of contemporary society. The GSS data set is regarded as one of the authoritative sources from which government and organization practitioners make data-driven policies. Previous analytic approaches for GSS data sets combine expert knowledge with simple statistics. Using emerging data mining algorithms, we propose a comprehensive data management and data mining approach for GSS data sets.
Design/methodology/approach: The approach is designed to operate in two phases: a data management phase, which improves the quality of GSS data through attribute pre-processing and filter-based attribute selection; and a data mining phase, which extracts hidden knowledge from the data set through prediction analysis, classification analysis, association analysis and clustering analysis.
Findings: According to the experimental evaluation, the paper has the following findings: performing attribute selection on a GSS data set can increase the performance of both classification analysis and clustering analysis; all of the data mining analyses can effectively extract hidden knowledge from the GSS data set; and the knowledge generated by the different data mining analyses can, to some extent, cross-validate each other.
Originality/value: By leveraging data mining techniques, the proposed approach can explore knowledge in a fine-grained manner with minimal human interference. Experiments on the Chinese General Social Survey data set are conducted to evaluate the performance of the approach.
© 2019, Zhiwen Pan, Jiangtian Li, Yiqiang Chen, Jesus Pacheco, Lianjun Dai and Jun Zhang.
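Illustrative note: the filter-based attribute selection step named in the abstract can be sketched as ranking attributes by a relevance score against the target and keeping the top few. The abstract does not say which score the authors use; information gain is assumed here as one common choice. The function names and the toy survey records are hypothetical and are not taken from the paper or from any GSS data set.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Reduction in target entropy obtained by splitting rows on one attribute."""
    base = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


def select_attributes(rows, target, k):
    """Filter-based selection: score each attribute on its own, keep the top k."""
    candidates = [name for name in rows[0] if name != target]
    ranked = sorted(
        candidates,
        key=lambda name: information_gain(rows, name, target),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical toy survey records, invented for illustration only.
survey = [
    {"education": "high", "region": "east", "income": "high"},
    {"education": "high", "region": "west", "income": "high"},
    {"education": "low", "region": "east", "income": "low"},
    {"education": "low", "region": "west", "income": "low"},
]

selected = select_attributes(survey, target="income", k=1)
print(selected)
```

Because a filter scores each attribute independently of any downstream model, the reduced attribute set can be passed unchanged to the classification and clustering analyses of the second phase.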
Pages: 315-332
Page count: 17