Image Text Extraction and Natural Language Processing of Unstructured Data from Medical Reports

Cited by: 0
Authors
Malashin, Ivan [1]
Masich, Igor [1]
Tynchenko, Vadim [1]
Gantimurov, Andrei [1]
Nelyub, Vladimir [1, 2]
Borodulin, Aleksei [1]
Affiliations
[1] Bauman Moscow State Tech Univ, Artificial Intelligence Technol Sci & Educ Ctr, Moscow 105005, Russia
[2] Far Eastern Fed Univ, Sci Dept, Vladivostok 690922, Russia
Keywords
image recognition; natural language processing; named entity recognition; information extraction; CONVOLUTIONAL NEURAL-NETWORKS; INFORMATION EXTRACTION; RECOGNITION; VIDEO;
DOI
10.3390/make6020064
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This study presents an integrated approach for automatically extracting and structuring information from medical reports captured as scanned documents or photographs, combining image recognition with natural language processing (NLP) techniques such as named entity recognition (NER). The primary aim was to develop an adaptive model for efficient text extraction from medical report images. A genetic algorithm (GA) fine-tunes the optical character recognition (OCR) hyperparameters to maximize the length of the extracted text; NER then categorizes the extracted information into the required entities, and the parameters are adjusted whenever entities are not extracted correctly according to manual annotations. The dataset consists of medical report images in diverse formats, all in Russian; the work nonetheless serves as a conceptual example of information extraction (IE) that can be easily extended to other languages.
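The GA-driven tuning step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the hyperparameter names (`threshold`, `scale`, `blur`), their value ranges, and the `mock_extracted_length` fitness function are all hypothetical stand-ins. In a real pipeline the fitness function would run an OCR engine on the report image with the candidate settings and return the length of the recognized text.

```python
import random

# Hypothetical search space: names and ranges are illustrative assumptions,
# not the OCR hyperparameters tuned in the paper.
SEARCH_SPACE = {
    "threshold": list(range(50, 251, 10)),  # binarization threshold
    "scale": [1.0, 1.5, 2.0, 2.5, 3.0],     # image upscaling factor
    "blur": [0, 1, 3, 5],                   # blur kernel size
}


def mock_extracted_length(params):
    """Stand-in fitness: pretends to be len(ocr_text) for the given settings.

    A real pipeline would preprocess the image, run OCR, and return the
    number of characters extracted. This toy function peaks at 1000.
    """
    return (
        1000
        - abs(params["threshold"] - 150) * 4
        - int(abs(params["scale"] - 2.0) * 200)
        - abs(params["blur"] - 1) * 50
    )


def random_individual(rng):
    """Draw one candidate configuration uniformly from the search space."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}


def crossover(a, b, rng):
    """Uniform crossover: each gene comes from one of the two parents."""
    return {name: rng.choice((a[name], b[name])) for name in SEARCH_SPACE}


def mutate(individual, rng, rate=0.2):
    """Resample each gene from its allowed values with probability `rate`."""
    return {
        name: rng.choice(SEARCH_SPACE[name]) if rng.random() < rate else value
        for name, value in individual.items()
    }


def run_ga(fitness, generations=30, pop_size=20, elite=4, seed=0):
    """Elitist GA: keep the top `elite` candidates, breed the rest from them."""
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[:elite]
        children = [
            mutate(crossover(*rng.sample(parents, 2), rng), rng)
            for _ in range(pop_size - elite)
        ]
        population = parents + children
    return max(population, key=fitness)


best = run_ga(mock_extracted_length)
print(best, mock_extracted_length(best))
```

Because the top candidates are carried over unchanged each generation, the best fitness never decreases. The NER stage from the abstract would sit after this loop: if the entities extracted from the best OCR output disagree with the manual annotations, the search would be rerun with adjusted parameters.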
Pages: 1361-1377
Page count: 17
Related papers
50 in total
  • [1] Fiscal data in text: Information extraction from audit reports using Natural Language Processing
    Beltran, Alejandro
    [J]. DATA & POLICY, 2023, 5
  • [2] Automatic Extraction of Engineering Rules From Unstructured Text: A Natural Language Processing Approach
    Ye, Xinfeng
    Lu, Yuqian
    [J]. JOURNAL OF COMPUTING AND INFORMATION SCIENCE IN ENGINEERING, 2020, 20 (03)
  • [3] Automating Stroke Data Extraction From Free-Text Radiology Reports Using Natural Language Processing: Instrument Validation Study
    Yu, Amy Y. X.
    Liu, Zhongyu A.
    Pou-Prom, Chloe
    Lopes, Kaitlyn
    Kapral, Moira K.
    Aviv, Richard I.
    Mamdani, Muhammad
    [J]. JMIR MEDICAL INFORMATICS, 2021, 9 (05)
  • [4] A Natural Language Processing Tool for Large-Scale Data Extraction from Echocardiography Reports
    Nath, Chinmoy
    Albaghdadi, Mazen S.
    Jonnalagadda, Siddhartha R.
    [J]. PLOS ONE, 2016, 11 (04)
  • [5] Natural Language Processing of Radiology Text Reports: Interactive Text Classification
    Wiggins, Walter F.
    Kitamura, Felipe
    Santos, Igor
    Prevedello, Luciano M.
    [J]. RADIOLOGY-ARTIFICIAL INTELLIGENCE, 2021, 3 (04)
  • [6] Anatomic stage extraction from medical reports of breast Cancer patients using natural language processing
    Deshmukh, Pratiksha R.
    Phalnikar, Rashmi
    [J]. HEALTH AND TECHNOLOGY, 2020, 10 (06) : 1555 - 1570
  • [7] Anatomic stage extraction from medical reports of breast Cancer patients using natural language processing
    Pratiksha R. Deshmukh
    Rashmi Phalnikar
    [J]. Health and Technology, 2020, 10 : 1555 - 1570
  • [8] The Effect of Natural Language Processing on the Analysis of Unstructured Text: A Systematic Review
    Roldan-Baluis, Walter Luis
    Zapata, Noel Alcas
    Vasquez, Maria Soledad Manaccasa
    [J]. INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2022, 13 (05) : 43 - 51
  • [9] Natural language processing in mining unstructured data from software repositories: a review
    Gupta, Som
    Gupta, S. K.
    [J]. SADHANA-ACADEMY PROCEEDINGS IN ENGINEERING SCIENCES, 2019, 44 (12)
  • [10] Natural language processing in mining unstructured data from software repositories: a review
    Som Gupta
    S K Gupta
    [J]. Sādhanā, 2019, 44