SketchFormer: transformer-based approach for sketch recognition using vector images

Cited by: 3
Authors
Parihar, Anil Singh [1]
Jain, Gaurav [1]
Chopra, Shivang [1]
Chopra, Suransh [1]
Affiliations
[1] Delhi Technological University, Machine Learning Research Lab, Department of Computer Science and Engineering, New Delhi 110042, India
Keywords
Sketch recognition; Transformers; Vector images; Deep learning; Algorithm
DOI
10.1007/s11042-020-09837-y
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Sketches have been used since the ancient era of cave paintings as simple illustrations to represent real-world entities and to communicate. Their abstract nature and varied artistic styles make automatic recognition of these drawings more challenging than other areas of image classification. Representing a sketch as a sequence of strokes rather than as a raster image captures it at an appropriate level of abstraction. However, processing an image as a sequence of small pieces of information is itself challenging. In this paper, we propose a Transformer-based network, dubbed AttentiveNet, for sketch recognition. The architecture incorporates ordinal information to perform the classification task in real time on vector images. We use the proposed model to isolate the discriminating strokes of each doodle through the attention mechanism of Transformers, and we perform an in-depth qualitative analysis of the isolated strokes for classification of the sketch. Experimental evaluation shows that the proposed network performs favorably against state-of-the-art techniques.
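The abstract gives no implementation details, so the following is only a minimal sketch of the general idea it describes: a drawing stored as a sequence of stroke points, order information added to that sequence, and attention weights computed over it. The stroke-3 encoding `(dx, dy, pen_lifted)`, the sinusoidal positional encoding, the identity query/key/value projections, and all function names are assumptions made for illustration; none of them is taken from the paper.

```python
import math

def to_stroke3(strokes):
    """Flatten strokes (lists of (x, y) points) into (dx, dy, pen_lifted)
    triples. pen_lifted is 1 on the last point of each stroke (assumed format)."""
    seq, prev = [], (0.0, 0.0)
    for stroke in strokes:
        for i, (x, y) in enumerate(stroke):
            lifted = 1.0 if i == len(stroke) - 1 else 0.0
            seq.append([x - prev[0], y - prev[1], lifted])
            prev = (x, y)
    return seq

def positional_encoding(length, dim):
    """Sinusoidal encoding: sine on even feature indices, cosine on odd ones."""
    return [[math.sin(pos / 10000 ** (i / dim)) if i % 2 == 0
             else math.cos(pos / 10000 ** ((i - 1) / dim))
             for i in range(dim)]
            for pos in range(length)]

def self_attention(x):
    """Single-head scaled dot-product self-attention with identity
    projections (no learned weights). Returns (outputs, weights)."""
    d = len(x[0])
    outputs, weights = [], []
    for q in x:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in x]
        m = max(scores)  # subtract the max for a numerically stable softmax
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        w = [e / total for e in exps]
        weights.append(w)
        outputs.append([sum(wj * v[c] for wj, v in zip(w, x)) for c in range(d)])
    return outputs, weights

# A toy two-stroke drawing in absolute coordinates (hypothetical data).
strokes = [[(0, 0), (4, 0), (4, 3)], [(1, 1), (2, 2)]]
seq = to_stroke3(strokes)
pe = positional_encoding(len(seq), 3)
x = [[f + p for f, p in zip(row, enc)] for row, enc in zip(seq, pe)]
outputs, weights = self_attention(x)
```

Each row of `weights` is a distribution over all points in the drawing. In a trained model, weights of this kind are what one would inspect to see which strokes the classifier relies on; this untrained toy only shows the mechanics.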
Pages: 9075-9091
Number of pages: 17
Related papers
50 in total
  • [41] A Light Transformer-Based Architecture for Handwritten Text Recognition
    Barrere, Killian
    Soullard, Yann
    Lemaitre, Aurelie
    Couasnon, Bertrand
    DOCUMENT ANALYSIS SYSTEMS, DAS 2022, 2022, 13237 : 275 - 290
  • [42] Transformer-Based adversarial network for semi-supervised face sketch synthesis
    Shi, Zhihua
    Wan, Weiguo
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2024, 102
  • [43] A Transformer-Based Network for Dynamic Hand Gesture Recognition
    D'Eusanio, Andrea
    Simoni, Alessandro
    Pini, Stefano
    Borghi, Guido
    Vezzani, Roberto
    Cucchiara, Rita
    2020 INTERNATIONAL CONFERENCE ON 3D VISION (3DV 2020), 2020, : 623 - 632
  • [44] TRANSFORMER-BASED ACOUSTIC MODELING FOR HYBRID SPEECH RECOGNITION
    Wang, Yongqiang
    Mohamed, Abdelrahman
    Le, Duc
    Liu, Chunxi
    Xiao, Alex
    Mahadeokar, Jay
    Huang, Hongzhao
    Tjandra, Andros
    Zhang, Xiaohui
    Zhang, Frank
    Fuegen, Christian
    Zweig, Geoffrey
    Seltzer, Michael L.
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 6874 - 6878
  • [45] A transformer-based cloud detection approach using Sentinel 2 imageries
    Singh, Rohit
    Biswas, Mantosh
    Pal, Mahesh
    INTERNATIONAL JOURNAL OF REMOTE SENSING, 2023, 44 (10) : 3194 - 3208
  • [46] Vision Transformer-based recognition of diabetic retinopathy grade
    Wu, Jianfang
    Hu, Ruo
    Xiao, Zhenghong
    Chen, Jiaxu
    Liu, Jingwei
    MEDICAL PHYSICS, 2021, 48 (12) : 7850 - 7863
  • [47] A Sparse Transformer-Based Approach for Image Captioning
    Lei, Zhou
    Zhou, Congcong
    Chen, Shengbo
    Huang, Yiyong
    Liu, Xianrui
    IEEE ACCESS, 2020, 8 : 213437 - 213446
  • [49] A transformer-based approach to irony and sarcasm detection
    Potamias, Rolandos Alexandros
    Siolas, Georgios
    Stafylopatis, Andreas-Georgios
    Neural Computing and Applications, 2020, 32 : 17309 - 17320
  • [50] TRANSFORMER-BASED APPROACH FOR DOCUMENT LAYOUT UNDERSTANDING
    Yang, Huichen
    Hsu, William
    2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022, : 4043 - 4047