Machine Learning for Intelligent Processing of Printed Documents

Authors
Floriana Esposito
Donato Malerba
Francesca A. Lisi
Affiliations
[1] Università degli Studi di Bari, Dipartimento di Informatica
[2] Università degli Studi di Bari, Dipartimento di Informatica
[3] Università degli Studi di Bari, Dipartimento di Informatica
Keywords
learning and knowledge discovery; intelligent information systems; intelligent document processing; decision-tree learning; first-order rule induction
DOI
Not available
Abstract
A paper document processing system is an information system component which transforms information on printed or handwritten documents into a computer-revisable form. In intelligent systems for paper document processing this information capture process is based on knowledge of the specific layout and logical structures of the documents. This article proposes the application of machine learning techniques to acquire the specific knowledge required by an intelligent document processing system, named WISDOM++, that manages printed documents, such as letters and journals. Knowledge is represented by means of decision trees and first-order rules automatically generated from a set of training documents. In particular, an incremental decision tree learning system is applied for the acquisition of decision trees used for the classification of segmented blocks, while a first-order learning system is applied for the induction of rules used for the layout-based classification and understanding of documents. Issues concerning the incremental induction of decision trees and the handling of both numeric and symbolic data in first-order rule learning are discussed, and the validity of the proposed solutions is empirically evaluated by processing a set of real printed documents.
Pages: 175-198
Page count: 23