DOCUMENT BINARIZATION WITH MULTI-BRANCH GATED CONVOLUTIONAL GENERATIVE ADVERSARIAL NETWORKS

Authors
Yang, Zongyuan [1]
Xiong, Yongping [1]
Wu, Guibin [1]
Affiliations
[1] Beijing Univ Posts & Telecommun, Beijing 100876, Peoples R China
Keywords
Document binarization; Gated convolution; Multi-scale fusion; Adversarial learning; DIBCO; COMPETITION
DOI
10.1109/ICIP49359.2023.10222024
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Existing document binarization methods cannot extract stroke edges finely, mainly because vanilla convolutions treat all inputs equally and because stroke edges are extracted without adequate supervision from boundary-related information. In this paper, we formulate text extraction as the learning of gating values and propose a novel end-to-end gated-convolution-based network (GDB) to solve the problem of imprecise stroke edge extraction. Gated convolutions are applied to selectively extract stroke features with different levels of attention. First, a coarse sub-network with an extra edge branch is trained to produce more precise feature maps by feeding it an a priori mask and edge. Second, a refinement sub-network is cascaded to refine the output of the first stage using gated convolutions based on the sharp edge. For global information, GDB also contains a multi-scale operation that combines local and global features. Experimental results show that the proposed methods outperform state-of-the-art methods on all metrics, averaged over all DIBCO datasets from 2009 to 2019, and achieve the top ranking on six benchmark datasets. Code is available at https://github.com/Royalvice/GDB.
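As a rough illustration of the gating idea mentioned in the abstract, the sketch below computes a single-channel gated convolution in plain Python: one kernel produces features, a second kernel produces per-pixel gates in (0, 1), and the two are multiplied elementwise. This is not the authors' GDB code (see the repository linked above for that). The function names, the tanh feature activation, and the toy kernels are assumptions chosen only for this example.

```python
import math


def conv2d_valid(image, kernel):
    """Plain 2D cross-correlation without padding, on nested lists."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_conv2d(image, feature_kernel, gate_kernel):
    """Hypothetical single-channel gated convolution.

    output = tanh(conv(image, feature_kernel)) * sigmoid(conv(image, gate_kernel))
    The tanh activation is an assumption; other activations are possible.
    """
    features = conv2d_valid(image, feature_kernel)
    gates = conv2d_valid(image, gate_kernel)
    return [
        [math.tanh(f) * sigmoid(g) for f, g in zip(f_row, g_row)]
        for f_row, g_row in zip(features, gates)
    ]


# Toy 4x4 "document" patch with a vertical stroke in the two middle columns.
image = [[0.0, 1.0, 1.0, 0.0] for _ in range(4)]
feature_kernel = [[1.0] * 3 for _ in range(3)]  # toy feature filter
gate_kernel = [[0.0] * 3 for _ in range(3)]     # zero kernel: every gate is 0.5

output = gated_conv2d(image, feature_kernel, gate_kernel)
```

In a trained network both kernels are learned, so the gate can suppress background pixels and pass stroke pixels instead of weighting every position equally as a vanilla convolution does.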
Pages: 680-684
Page count: 5