LDRNet: Enabling Real-Time Document Localization on Mobile Devices

Authors
Wu, Han [1]
Qian, Holland [2]
Wu, Huaming [3]
van Moorsel, Aad [4]
Affiliations
[1] Newcastle University, Newcastle upon Tyne, Tyne and Wear, England
[2] Tencent, Shenzhen, People's Republic of China
[3] Tianjin University, Tianjin, People's Republic of China
[4] University of Birmingham, Birmingham, West Midlands, England
Keywords
Document localization; Real time; Mobile devices
DOI
10.1007/978-3-031-23618-1_42
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Modern online services often require mobile devices to convert paper-based information, such as passports and ownership documents, into its digital counterpart. This process relies on Document Localization (DL) technology to detect the outline of a document within a photograph. In recent years, demand for real-time DL in live video has increased, especially in financial services. However, existing machine-learning approaches to DL cannot be easily applied because the underlying models are large and their inference time is long. In this paper, we propose a lightweight DL model, LDRNet, to localize documents in real-time video captured on mobile devices. On top of a lightweight backbone neural network, we design three prediction branches for LDRNet: (1) corner point prediction; (2) line border prediction; and (3) document classification. To improve accuracy, we design novel supplementary targets, the equal-division points, and use a new loss function named Line Loss. We compare the performance of LDRNet with other popular approaches to localization of general documents on a number of datasets. The experimental results show that LDRNet takes significantly less inference time while still achieving comparable accuracy.
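The abstract names equal-division points as supplementary targets but does not define them. The sketch below shows one plausible reading, assumed here and not taken from the paper: the points that split each border of the document quadrilateral into equal segments. The function name `equal_division_points`, the `parts` parameter, and the corner ordering are hypothetical, chosen only for illustration; this is not the authors' implementation.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def equal_division_points(corners: List[Point], parts: int) -> List[Point]:
    """Return each corner followed by the points that split the border
    starting at that corner into `parts` equal segments.

    Assumption: `corners` lists the quadrilateral's vertices in order
    around the outline (for example clockwise from the top-left).
    """
    if parts < 1:
        raise ValueError("parts must be >= 1")
    points: List[Point] = []
    n = len(corners)
    for i in range(n):
        (x0, y0) = corners[i]
        (x1, y1) = corners[(i + 1) % n]  # wrap around to close the outline
        for k in range(parts):
            t = k / parts  # fraction of the way along this border
            points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return points


# Hypothetical axis-aligned document outline, 4 units wide and 2 units tall.
quad = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)]
pts = equal_division_points(quad, parts=2)
print(pts)
```

With `parts=2` the result contains the four corners plus the midpoint of each border, eight points in total. A model could regress such points alongside the corners; how LDRNet actually uses them, and how Line Loss is defined, is described in the paper itself and not in this abstract.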
Pages: 618-629
Number of pages: 12