A Two-Step Resume Information Extraction Algorithm

Cited by: 14
Authors
Chen, Jie [1]
Zhang, Chunxia [2]
Niu, Zhendong [1, 3, 4]
Affiliations
[1] Beijing Inst Technol, Sch Comp Sci & Technol, Beijing 100081, Peoples R China
[2] Beijing Inst Technol, Sch Software, Beijing 100081, Peoples R China
[3] Beijing Inst Technol, Beijing Engn Res Ctr Mass Language Informat Proc, Beijing 100081, Peoples R China
[4] Univ Pittsburgh, Sch Comp & Informat, Pittsburgh, PA 15260 USA
Funding
National Natural Science Foundation of China
DOI
10.1155/2018/5761287
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
With the rapid growth of Internet-based recruiting, recruiting systems now hold a great number of personal resumes. To gain more attention from recruiters, most resumes are written in diverse formats, with varying font sizes, font colours, and table cells. However, this diversity of format hinders data mining tasks such as resume information extraction, automatic job matching, and candidate ranking. Supervised methods and rule-based methods have been proposed to extract facts from resumes, but they rely strongly on hierarchical structure information and large amounts of labelled data, which are hard to collect in practice. In this paper, we propose a two-step resume information extraction approach. In the first step, the raw text of a resume is classified into different resume blocks. To achieve this, we design a novel feature, Writing Style, to model sentence syntax information. Besides word index and punctuation index, Writing Style includes word lexical attributes and the prediction results of classifiers. In the second step, multiple classifiers are employed to identify different attributes of fact information in resumes. Experimental results on a real-world dataset show that the algorithm is feasible and effective.
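The two-step flow described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the block labels, the exact Writing Style encoding, or the classifiers, so everything below is an assumption made for illustration. In particular, `BLOCK_KEYWORDS`, `writing_style`, `segment_blocks`, `EXTRACTORS`, and `extract_attributes` are hypothetical names, and simple keyword and regex rules stand in for the trained classifiers used in the paper.

```python
import re

# Hypothetical block labels and heading keywords (not taken from the paper).
BLOCK_KEYWORDS = {
    "education": ("education", "academic"),
    "experience": ("experience", "employment"),
}

def writing_style(line):
    """Toy stand-in for a Writing Style feature: (token index, lexical kind) pairs."""
    tokens = re.findall(r"[A-Za-z]+|\d+|[^\w\s]", line)
    features = []
    for index, token in enumerate(tokens):
        if token.isdigit():
            kind = "digit"
        elif token.isalpha():
            kind = "cap" if token[0].isupper() else "lower"
        else:
            kind = "punct"
        features.append((index, kind))
    return features

def segment_blocks(lines):
    """Step 1 (assumed heuristic): group lines into blocks under detected headings."""
    blocks = {}
    current = "header"  # text before the first heading
    for line in lines:
        text = line.strip()
        if not text:
            continue
        kinds = [kind for _, kind in writing_style(text)]
        label = next(
            (name for name, words in BLOCK_KEYWORDS.items()
             if any(word in text.lower() for word in words)),
            None,
        )
        # A heading is assumed to be a short keyword line without digits or punctuation.
        if label and len(kinds) <= 3 and "punct" not in kinds and "digit" not in kinds:
            current = label
            continue
        blocks.setdefault(current, []).append(text)
    return blocks

# Step 2 (assumed): one simple extractor per attribute, chosen by block label.
# Regexes replace the paper's classifiers purely for illustration.
YEAR = re.compile(r"\b(?:19|20)\d{2}\b")
EXTRACTORS = {
    "header": {"email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")},
    "education": {"year": YEAR},
    "experience": {"year": YEAR},
}

def extract_attributes(blocks):
    """Apply each block's extractors and collect the matched facts."""
    facts = {}
    for label, lines in blocks.items():
        for name, pattern in EXTRACTORS.get(label, {}).items():
            found = [match for line in lines for match in pattern.findall(line)]
            if found:
                facts[(label, name)] = found
    return facts

resume = [
    "Jane Doe",
    "jane.doe@example.com",
    "",
    "Education",
    "B.Sc. Computer Science, 2012",
    "Work Experience",
    "Software Engineer, Acme Corp, 2013-2016",
]
blocks = segment_blocks(resume)
facts = extract_attributes(blocks)
print(blocks)
print(facts)
```

Separating the steps this way means the attribute extractors only see text from the block they are meant for, which is the motivation the abstract gives for identifying blocks first. The sample resume is invented for the example.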
Pages: 8