Deep Neural Networks for Page Stream Segmentation and Classification

Authors
Gallo, Ignazio [1 ]
Noce, Lucia [1 ]
Zamberletti, Alessandro [1 ]
Calefati, Alessandro [1 ]
Affiliations
[1] Univ Insubria, Dept Theoret & Appl Sci DiSTA, Via Mazzini 5, Varese, Italy
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
In this manuscript we propose a novel method for joint page stream segmentation and multi-page document classification. The end goal is to classify a stream of pages as belonging to different classes of documents. We take advantage of the recent state-of-the-art results achieved using deep architectures in related fields such as document image classification, and we adopt similar models to obtain satisfactory classification accuracy at low computational complexity. Our contribution is twofold: first, the extraction of visual features from the processed documents is performed automatically by the chosen Convolutional Neural Network; second, the predictions of the same network are further refined by an additional deep model, which processes them in a classic sliding-window manner to help find and correct classification errors made by the first network. The proposed pipeline has been evaluated on a publicly available dataset composed of more than half a million multi-page documents collected by an on-line loan comparison company, showing excellent results and high efficiency.
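The sliding-window refinement described in the abstract can be illustrated with a minimal sketch. It assumes the first network has already produced one class-probability vector per page. The abstract says the second stage is an additional deep model; here a plain average over the window stands in for it, purely for illustration. All function names and the window size are hypothetical, and splitting the stream wherever the refined label changes is a simplification that cannot separate two consecutive documents of the same class.

```python
from typing import List, Sequence, Tuple


def sliding_windows(page_probs: Sequence[Sequence[float]],
                    window: int = 3) -> List[List[Sequence[float]]]:
    """Return one window of per-page probability vectors centred on each page.

    Edges are padded by repeating the first or last page.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    n = len(page_probs)
    windows = []
    for i in range(n):
        # Clamp indices so windows at the stream boundaries stay full-sized.
        idx = [min(max(j, 0), n - 1) for j in range(i - half, i + half + 1)]
        windows.append([page_probs[j] for j in idx])
    return windows


def refine_by_averaging(page_probs: Sequence[Sequence[float]],
                        window: int = 3) -> List[int]:
    """Stand-in for the second model (an assumption, not the paper's network):
    average the probabilities inside each window and take the arg-max class."""
    labels = []
    for win in sliding_windows(page_probs, window):
        n_classes = len(win[0])
        mean = [sum(p[c] for p in win) / len(win) for c in range(n_classes)]
        labels.append(max(range(n_classes), key=mean.__getitem__))
    return labels


def segment(labels: Sequence[int]) -> List[Tuple[int, int, int]]:
    """Group consecutive pages with the same label into
    (first_page, last_page, label) runs, with last_page inclusive."""
    segments = []
    start = 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            segments.append((start, i - 1, labels[start]))
            start = i
    return segments
```

For example, given six pages whose raw per-page arg-max would be `[0, 0, 1, 0, 1, 1]`, smoothing with a window of three pages can overrule the isolated prediction on the third page before the stream is cut into documents.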
Pages: 127-133
Page count: 7
Related papers
50 in total
  • [31] Using Deep-Learned Vector Representations for Page Stream Segmentation by Agglomerative Clustering
    Busch, Lukas
    van Heusden, Ruben
    Marx, Maarten
    [J]. ALGORITHMS, 2023, 16 (05)
  • [32] Web page feature selection and classification using neural networks
    Selamat, A
    Omatu, S
    [J]. INFORMATION SCIENCES, 2004, 158 : 69 - 88
  • [33] Neural networks for web page classification based on augmented PCA
    Selamat, A
    Omatu, S
    [J]. PROCEEDINGS OF THE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS 2003, VOLS 1-4, 2003, : 1792 - 1797
  • [34] Fully Convolutional Neural Networks for Page Segmentation of Historical Document Images
    Wick, Christoph
    Puppe, Frank
    [J]. 2018 13TH IAPR INTERNATIONAL WORKSHOP ON DOCUMENT ANALYSIS SYSTEMS (DAS), 2018, : 287 - 292
  • [35] Dual stream neural networks for brain signal classification
    Kuang, Dongyang
    Michoski, Craig
    [J]. JOURNAL OF NEURAL ENGINEERING, 2021, 18 (01)
  • [36] A Survey of Graphical Page Object Detection with Deep Neural Networks
    Bhatt, Jwalin
    Hashmi, Khurram Azeem
    Afzal, Muhammad Zeshan
    Stricker, Didier
    [J]. APPLIED SCIENCES-BASEL, 2021, 11 (12)
  • [37] Segmentation and classification of brain tumor using 3D-UNet deep neural networks
    Agrawal P.
    Katal N.
    Hooda N.
    [J]. International Journal of Cognitive Computing in Engineering, 2022, 3 : 199 - 210
  • [38] Hyperspectral classification via deep networks and superpixel segmentation
    Liu, Yazhou
    Cao, Guo
    Sun, Quansen
    Siegel, Mel
    [J]. INTERNATIONAL JOURNAL OF REMOTE SENSING, 2015, 36 (13) : 3459 - 3482
  • [39] Bronchus Segmentation and Classification by Neural Networks and Linear Programming
    Zhao, Tianyi
    Yin, Zhaozheng
    Wang, Jiao
    Gao, Dashan
    Chen, Yunqiang
    Mao, Yunxiang
    [J]. MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2019, PT VI, 2019, 11769 : 230 - 239
  • [40] Android applications classification with deep neural networks
    Mustapha Adamu Mohammed
    Michael Asante
    Seth Alornyo
    Bernard Obo Essah
    [J]. Iran Journal of Computer Science, 2023, 6 (3) : 221 - 232