A Two-Step Resume Information Extraction Algorithm

Cited by: 14
Authors
Chen, Jie [1]
Zhang, Chunxia [2]
Niu, Zhendong [1, 3, 4]
Affiliations
[1] Beijing Inst Technol, Sch Comp Sci & Technol, Beijing 100081, Peoples R China
[2] Beijing Inst Technol, Sch Software, Beijing 100081, Peoples R China
[3] Beijing Inst Technol, Beijing Engn Res Ctr Mass Language Informat Proc, Beijing 100081, Peoples R China
[4] Univ Pittsburgh, Sch Comp & Informat, Pittsburgh, PA 15260 USA
Funding
National Natural Science Foundation of China
DOI
10.1155/2018/5761287
Chinese Library Classification
T [Industrial Technology]
Discipline classification code
08
Abstract
With the rapid growth of Internet-based recruiting, recruiting systems now hold a great number of personal resumes. To gain more attention from recruiters, most resumes are written in diverse formats, with varying font sizes, font colours, and table cells. However, this diversity of format hampers data mining tasks such as resume information extraction, automatic job matching, and candidate ranking. Supervised and rule-based methods have been proposed to extract facts from resumes, but they rely strongly on hierarchical structure information and large amounts of labelled data, which are hard to collect in practice. In this paper, we propose a two-step resume information extraction approach. In the first step, the raw text of a resume is identified as different resume blocks. To achieve this, we design a novel feature, Writing Style, to model sentence syntax information. Besides word index and punctuation index, Writing Style includes word lexical attributes and the prediction results of classifiers. In the second step, multiple classifiers are employed to identify different attributes of fact information in resumes. Experimental results on a real-world dataset show that the algorithm is feasible and effective.
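The two-step idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the heading table, the `writing_style` signature, and the date rule are all hypothetical stand-ins, and a simple rule replaces the trained classifiers the abstract mentions. The sketch only shows the shape of the pipeline: first assign lines to blocks, then extract attributes inside a block using a per-line token signature.

```python
import re

# Hypothetical block headings; the abstract does not list the ones actually used.
BLOCK_HEADINGS = {
    "personal information": "personal",
    "education": "education",
    "work experience": "work",
}

# Numbers, ASCII words, or any single other non-space character.
TOKEN_RE = re.compile(r"\d+|[A-Za-z]+|[^\sA-Za-z\d]")


def writing_style(line):
    """Toy stand-in for a writing-style signature: (token index, token class)."""
    signature = []
    for index, token in enumerate(TOKEN_RE.findall(line)):
        if token.isdigit():
            kind = "NUM"
        elif token.isalpha():
            kind = "WORD"
        else:
            kind = "PUNCT"
        signature.append((index, kind))
    return signature


def split_blocks(text):
    """Step 1: group lines under the most recently seen known heading."""
    blocks = {}
    current = "other"
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        label = BLOCK_HEADINGS.get(line.lower().rstrip(":"))
        if label is not None:
            current = label
            continue
        blocks.setdefault(current, []).append(line)
    return blocks


def extract_education(lines):
    """Step 2: a rule standing in for the per-attribute classifiers."""
    records = []
    for line in lines:
        kinds = [kind for _, kind in writing_style(line)]
        # Assumed line pattern: "<year> - <year> <school name>"
        if kinds[:3] == ["NUM", "PUNCT", "NUM"]:
            match = re.match(r"(\d{4})\s*-\s*(\d{4})\s+(.*)", line)
            if match:
                records.append({
                    "start": match.group(1),
                    "end": match.group(2),
                    "school": match.group(3),
                })
    return records
```

Calling `split_blocks` on a resume's raw text and then `extract_education` on the resulting `"education"` block yields one record per line that matches the assumed date pattern. In the approach the abstract describes, both steps would be driven by classifiers trained on labelled data rather than by fixed rules.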
Pages: 8