Hadoop Recognition of Biomedical Named Entity Using Conditional Random Fields

Cited by: 55
Authors
Li, Kenli [1 ,3 ]
Ai, Wei [1 ]
Tang, Zhuo [1 ]
Zhang, Fan [2 ]
Jiang, Lingang [1 ]
Li, Keqin [4 ]
Hwang, Kai [5 ]
Affiliations
[1] Hunan Univ, Coll Informat Sci & Engn, Changsha 410082, Hunan, Peoples R China
[2] MIT, Kavli Inst Astrophys & Space Res, Cambridge, MA 02139 USA
[3] SUNY Coll New Paltz, Dept Comp Sci, New Paltz, NY 12561 USA
[4] Hunan Univ, Coll Informat Sci & Engn, Changsha 410082, Hunan, Peoples R China
[5] Univ So Calif, Dept Elect Engn, Los Angeles, CA 90089 USA
Funding
National Natural Science Foundation of China;
Keywords
Biomedical named entity recognition; conditional random fields; MapReduce; parallel algorithm; FEATURES;
DOI
10.1109/TPDS.2014.2368568
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Processing large volumes of data has presented a challenging issue, particularly in data-redundant systems. As one of the most recognized models, the conditional random fields (CRF) model has been widely applied in biomedical named entity recognition (Bio-NER). Due to the internally sequential feature, performance improvement of the CRF model is nontrivial, which requires new parallelized solutions. By combining and parallelizing the limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) and Viterbi algorithms, we propose a parallel CRF algorithm called MapReduce CRF (MRCRF) in this paper, which contains two parallel sub-algorithms to handle two time-consuming steps of the CRF model. The MapReduce L-BFGS (MRLB) algorithm leverages the MapReduce framework to enhance the capability of estimating parameters. Furthermore, the MapReduce Viterbi (MRVtb) algorithm infers the most likely state sequence by extending the Viterbi algorithm with another MapReduce job. Experimental results show that the MRCRF algorithm outperforms other competing methods by exhibiting significant performance improvement in terms of time efficiency as well as preserving a guaranteed level of correctness.
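As a rough illustration of the decoding step the abstract mentions, the sketch below runs a plain Viterbi decoder over independent sentences in a map-style function. It is not the paper's MRVtb algorithm and does not use Hadoop. The names `viterbi` and `map_decode`, the log-score inputs, and the integer state indices are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def viterbi(obs_scores: Sequence[Sequence[float]],
            trans_scores: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring state sequence.

    obs_scores[t][s]   : log-score of state s at position t (hypothetical input)
    trans_scores[p][s] : log-score of moving from state p to state s
    """
    n_states = len(trans_scores)
    best = list(obs_scores[0])          # best score ending in each state so far
    back: List[List[int]] = []          # back-pointers for positions 1..T-1
    for t in range(1, len(obs_scores)):
        new_best, pointers = [], []
        for s in range(n_states):
            # Pick the predecessor that maximizes the score of reaching s.
            prev = max(range(n_states),
                       key=lambda p: best[p] + trans_scores[p][s])
            pointers.append(prev)
            new_best.append(best[prev] + trans_scores[prev][s] + obs_scores[t][s])
        best = new_best
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: best[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def map_decode(sentences: Sequence[Tuple[str, Sequence[Sequence[float]]]],
               trans_scores: Sequence[Sequence[float]]) -> List[Tuple[str, List[int]]]:
    """Map-style step: each (key, scores) record is decoded independently,
    so records could be spread across workers (an assumption of this sketch)."""
    return [(key, viterbi(scores, trans_scores)) for key, scores in sentences]
```

Because each sentence is decoded without reference to any other, the map step needs no shared state beyond the transition scores, which is the property that makes this kind of decoding easy to distribute.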
Pages: 3040-3051
Page count: 12