TransMarker: A Pure Vision Transformer for Facial Landmark Detection

Cited by: 2
Authors
Wu, Wenyan [1 ]
Cai, Yici [1 ]
Zhou, Qiang [1 ]
Affiliation
[1] Tsinghua Univ, Dept Comp Sci & Technol, Beijing Natl Res Ctr Informat Sci & Technol BNRis, Beijing, Peoples R China
DOI
10.1109/ICPR56361.2022.9956248
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In recent years, Convolutional Neural Networks (CNNs) have achieved impressive results on the facial landmark detection task. In particular, the U-shaped architecture, also known as U-Net, has become the de facto standard and has achieved tremendous success. However, because the convolution operation is local, it is limited in modeling global and long-range semantic interactions, which are essential in localization tasks. In this work, we propose TransMarker, a U-Net-like pure Transformer method that offers a new perspective by tackling facial landmark detection in a sequence-to-sequence manner. We first split the input image into non-overlapping patches, which are treated like tokens in NLP tasks. We then feed the image patches into a symmetric U-shaped encoder-decoder architecture for local-global semantic feature learning. In addition, we introduce a Dense Skip-Connection scheme to leverage multi-level information across different resolutions. Note that, unlike the conventional U-Net architecture, we design the network with pure Transformer blocks, without any convolution operations. Extensive experiments demonstrate the state-of-the-art performance of our method on several standard datasets, i.e., WFLW, COFW and 300W, where it remarkably outperforms previous convolution-based methods.
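The patch-splitting step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `split_into_patches`, the `patch_size` parameter, and the single-channel nested-list image are assumptions made purely for illustration, and a real model would typically follow this step with a learned linear embedding of each patch.

```python
from typing import List, Sequence


def split_into_patches(image: Sequence[Sequence[float]], patch_size: int) -> List[List[float]]:
    """Split a 2D single-channel image into non-overlapping square patches.

    Hypothetical helper for illustration only. Each patch is flattened
    row by row into one token, and tokens are returned in raster order
    (left to right, then top to bottom).
    """
    height = len(image)
    width = len(image[0]) if height else 0
    if patch_size <= 0 or height % patch_size or width % patch_size:
        raise ValueError("image height and width must be multiples of patch_size")

    tokens = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten one patch_size x patch_size window into a single token.
            token = [
                image[y][x]
                for y in range(top, top + patch_size)
                for x in range(left, left + patch_size)
            ]
            tokens.append(token)
    return tokens


# A 4x4 toy image whose pixel value encodes its position (row * 4 + col).
image = [[row * 4 + col for col in range(4)] for row in range(4)]
tokens = split_into_patches(image, patch_size=2)
print(len(tokens), len(tokens[0]))
```

With a 4x4 image and 2x2 patches, the result is a sequence of four tokens of four values each, which is the kind of token sequence a Transformer encoder consumes.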
Pages: 3580-3587
Number of pages: 8