Dual-branch dilated context convolutional for table detection transformer in the document images

Cited by: 1
Authors
Ni, Ying [1]
Wang, Xiaoli [1]
Peng, Hanghang [1]
Li, Yonzhi [2, 3]
Wang, Jinyang [2]
Li, Haoxuan [2]
Huang, Jin [2, 3]
Affiliations
[1] Guotai Asset Management Co Ltd, Shanghai, Peoples R China
[2] Shanghai Jiao Tong Univ, Shanghai, Peoples R China
[3] State Key Lab New Text Mat & Adv Proc Technol, Wuhan, Peoples R China
Source
VISUAL COMPUTER | 2025 / Vol. 41 / Issue 04
Keywords
Table detection; Transformer; Dilated convolution; Visualization;
DOI
10.1007/s00371-024-03561-6
Chinese Library Classification (CLC) number
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
As the processing of document images such as financial reports becomes increasingly automated, table detection has become a critical component of document automation. It requires models to locate tables in document images without losing information. However, existing techniques still fall short in detecting certain small or irregularly shaped tables. To address this issue, we propose a Transformer-based table detection model. To improve both training efficiency and prediction performance, we fine-tune a pretrained Transformer framework so that it captures underlying features effectively. Additionally, we integrate a dual-branch dilated context convolutional module that processes high-dimensional features, further improving detection accuracy and robustness for tables of various sizes and shapes. Furthermore, we integrate multiple residual convolutional layers to capture and fuse features at different scales, which strengthens the network's multi-scale feature representation and thus its detection performance. We use feature maps and heatmaps for visualization to verify the reliability of our method. We evaluate our method on publicly available document datasets, and the results show that our approach achieves better performance on evaluation metrics such as Precision. https://github.com/GT-HZ/TD
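The abstract names a dual-branch dilated convolution with residual connections but does not give its layout. The following is only a minimal sketch of the general idea, not the authors' module: two parallel convolution branches with different dilation rates see different context widths, and their outputs are added to the input. The 1D setting, the function names (`dilated_conv1d`, `dual_branch_block`, `receptive_field`), the dilation rates, and the summation-based fusion are all assumptions made for illustration.

```python
def dilated_conv1d(signal, kernel, dilation):
    """1D dilated cross-correlation with zero padding; output length equals input length."""
    k = len(kernel)
    span = dilation * (k - 1)  # distance between the first and last kernel tap
    pad_left = span // 2
    padded = [0.0] * pad_left + list(signal) + [0.0] * (span - pad_left)
    return [
        sum(kernel[j] * padded[i + j * dilation] for j in range(k))
        for i in range(len(signal))
    ]


def receptive_field(kernel_size, dilation):
    """Number of input positions covered by one dilated kernel."""
    return dilation * (kernel_size - 1) + 1


def dual_branch_block(signal, kernel_a, kernel_b, dilation_a=1, dilation_b=2):
    """Hypothetical dual-branch block: two dilation rates plus a residual connection."""
    branch_a = dilated_conv1d(signal, kernel_a, dilation_a)  # narrow context
    branch_b = dilated_conv1d(signal, kernel_b, dilation_b)  # wider context
    # Assumed fusion: element-wise sum of the input and both branches.
    return [x + a + b for x, a, b in zip(signal, branch_a, branch_b)]


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5]
    ones = [1, 1, 1]
    print(receptive_field(3, 1), receptive_field(3, 2))
    print(dual_branch_block(x, ones, ones))
```

With a kernel of size 3, the branch with dilation 2 covers five input positions instead of three at the same parameter count, which is the usual motivation for mixing dilation rates when targets vary in size.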
Pages: 2709 - 2720
Page count: 12