EMR: Scalable Clustering of Big HR Data using Evolutionary MapReduce

Authors
Bohlouli, Mahdi [1, 2, 3]
He, Zhonghua [4]
Affiliations
[1] Inst Adv Studies Basic Sci, Dept Comp Sci & Informat Technol, Zanjan, Iran
[2] Inst Adv Studies Basic Sci, Res Ctr Basic Sci & Modern Technol RBST, Zanjan, Iran
[3] Petanux GmbH, Res & Innovat Dept, Bonn, Germany
[4] Univ Siegen, Siegen, Germany
Keywords
Big Data; K-Means Clustering; Evolutionary Algorithms; Scalable Clustering; Large Scale Human Resource Data; JOB KNOWLEDGE; MAP-REDUCE; SYSTEM; ABILITY
DOI
10.1145/3442442.3453543
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Today, the volume and variety of generated data, and the question of how to process it and create value from it through scalable analytics, are major challenges for industry and for real-world practices such as talent analytics. For instance, large enterprises and job centres have to perform data-intensive matching of job seekers to many job positions at the same time. In other words, the process should result in the large-scale assignment of best-fit talents (Person) with the right expertise (Profession) to the right job (Position) at the right time (Period). We call this the 4P rule in this paper. All enterprises should consider the 4P rule in their daily recruitment processes as part of efficient workforce development strategies. Doing so demands integrating large volumes of disparate data from various sources and requires scalable algorithms and analytics. The diversity of data in human resource management requires speeding up analytical processes. The main challenge is not only how and where to store the data, but also how to analyse it to create value (knowledge discovery). In this paper, we propose a generic Career Knowledge Representation (CKR) model that can represent most competences found in a wide variety of careers. Regenerated job qualification data for 15 million employees with 84 dimensions (competences), derived from real HRM data, was used to test and evaluate the proposed Evolutionary MapReduce K-Means (EMR) method. In experiments on real large-scale datasets, the proposed EMR method was faster and more accurate than similar approaches, and the achieved results are discussed.
Pages: 26-34
Number of pages: 9