Leveraging effectiveness and efficiency in Page Stream Deep Segmentation

Cited by: 6
Authors
Braz, Fabricio Ataides [1]
Silva, Nilton Correia da [1]
Lima, Jonathan Alis Salgado [1]
Affiliations
[1] Univ Brasilia, Gama Coll, AI Lab, Brasilia, DF, Brazil
Keywords
Page Stream Segmentation; Classification
DOI
10.1016/j.engappai.2021.104394
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Separating the documents contained in a page stream is a critical task in some sectors. This is the case in the Brazilian judiciary system, which is overwhelmed with files produced by batch scanning of lawsuits, resulting in PDFs that mix several types of documents. To make such a file usable, it must be divided into cohesive sets of pages, each forming a single document. The typical approach to this task is to classify each page in the stream so as to reveal the transitions between documents, that is, to identify the pages that begin a new document. For this task, classification methods combining text and image have obtained the best results, with accuracy and kappa scores of 91.9% and 83.1%, respectively, on the Tobacco800 dataset. This outcome, although remarkable, comes at a high computational cost. In this work, by changing the input of the image models and employing a novel labeling system, we achieved the same result without the overhead that the text modality imposes on the solution. In addition, we built a new public dataset called AI.Lab.Splitter, aimed specifically at the page stream segmentation task, with more than 30k labeled samples. Finally, in addition to VGG, we used EfficientNet, which has one sixth the number of parameters of VGG. We observed an advantage of close to 2.5% in F1 score over the same proposal using VGG on our dataset.
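As an illustration of the page-level formulation described in the abstract, the following is a minimal sketch of how per-page "starts a new document" decisions could be turned into document segments. It assumes a plain binary labeling (first page versus continuation page); it is not the authors' labeling system or model, and the function name `split_stream` and the example flags are hypothetical.

```python
from typing import List


def split_stream(is_first_page: List[bool]) -> List[List[int]]:
    """Group page indices into documents from per-page 'starts a new document' flags."""
    documents: List[List[int]] = []
    for page_index, starts_new in enumerate(is_first_page):
        # The first page of the stream always opens a document,
        # even if the classifier did not flag it.
        if starts_new or not documents:
            documents.append([])
        documents[-1].append(page_index)
    return documents


# Hypothetical classifier output for a six-page stream.
flags = [True, False, False, True, True, False]
print(split_stream(flags))  # [[0, 1, 2], [3], [4, 5]]
```

Under this assumption, a single misclassified page either merges two documents or splits one in two, which is why page-level accuracy alone can understate the effect of errors on the final segmentation.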
Pages: 8