State-of-the-art anonymization of medical records using an iterative machine learning framework

Cited by: 87
Authors
Szarvas, Gyoergy
Farkas, Richard
Busa-Fekete, Robert
Affiliations
[1] Univ Szeged, Dept Informat, Szeged, Hungary
[2] Univ Szeged, Szeged, Hungary
[3] Artificial Intelligence Hungarian Acad Sci, Szeged, Hungary
DOI
10.1197/j.jamia.M2441
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Objective: The anonymization of medical records is of great importance in the human life sciences because a de-identified text can be made publicly available for non-hospital researchers as well, to facilitate research on human diseases. Here the authors have developed a de-identification model that can successfully remove personal health information (PHI) from discharge records to make them conform to the guidelines of the Health Insurance Portability and Accountability Act (HIPAA).
Design: We introduce here a novel, machine learning-based iterative Named Entity Recognition approach intended for use on semi-structured documents like discharge records. Our method identifies PHI in several steps. First, it labels all entities whose tags can be inferred from the structure of the text, and it then uses this information to find further PHI phrases in the flow text parts of the document.
Measurements: Following the standard evaluation method of the first Workshop on Challenges in Natural Language Processing for Clinical Data, we used token-level Precision, Recall and F(beta=1) measure as evaluation metrics.
Results: Our system achieved outstanding accuracy on the standard evaluation dataset of the de-identification challenge, with an F measure of 99.7534% for the best submitted model.
Conclusion: Our system is competitive with the current state-of-the-art solutions, and we describe here several techniques that can be beneficial in other tasks that need to handle structured documents such as clinical records.
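The token-level Precision, Recall and F(beta=1) measure named in the Measurements part of the abstract can be illustrated with a minimal sketch. This is not the challenge's official scorer: the function name `token_prf`, the label strings (`PATIENT`, `DATE`) and the use of `"O"` for non-PHI tokens are assumptions made for the example, and the official evaluation may aggregate over PHI classes differently.

```python
from typing import Sequence, Tuple


def token_prf(gold: Sequence[str], pred: Sequence[str],
              outside: str = "O") -> Tuple[float, float, float]:
    """Token-level precision, recall and F1 (beta = 1) over PHI labels.

    A token counts as a true positive only when it is predicted as PHI
    and its predicted label equals the gold label. `outside` marks
    non-PHI tokens (an assumed convention, not taken from the paper).
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same number of tokens")

    true_pos = sum(1 for g, p in zip(gold, pred) if p != outside and p == g)
    pred_pos = sum(1 for p in pred if p != outside)  # tokens tagged as PHI
    gold_pos = sum(1 for g in gold if g != outside)  # tokens that are PHI

    precision = true_pos / pred_pos if pred_pos else 0.0
    recall = true_pos / gold_pos if gold_pos else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical six-token example: one PHI token missed, one false alarm.
gold_labels = ["O", "PATIENT", "PATIENT", "O", "DATE", "O"]
pred_labels = ["O", "PATIENT", "O", "O", "DATE", "DATE"]
precision, recall, f1 = token_prf(gold_labels, pred_labels)
```

With beta = 1 the F measure is the harmonic mean of precision and recall, so it is high only when both are high, which is why a single figure such as the reported 99.7534% summarizes both kinds of error.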
Pages: 574-580
Page count: 7
Related papers
  • [1] State-of-the-art Anonymization of Medical Records Using an Iterative Machine Learning Framework (vol 14, pg 574, 2007)
    Szarvas, G.
    Farkas, R.
    Busa-Fekete, R.
    JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION, 2009, 16 (03) : 284 - 284