Asymmetry-aware bilinear pooling in multi-modal data for head pose estimation

Cited by: 2
Authors
Chen, Jiazhong [1 ]
Li, Qingqing [1 ]
Ren, Dakai [2 ]
Cao, Hua [1 ]
Ling, Hefei [1 ]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Comp Sci & Technol, Wuhan, Peoples R China
[2] Beijing Univ Posts & Telecommun, Sch Informat & Commun Engn, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Head pose estimation; Asymmetry-aware; Bilinear pooling; ATTENTION; REPRESENTATION; NETWORK;
DOI
10.1016/j.image.2022.116895
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline Codes
0808 ; 0809 ;
Abstract
The head pose in the roll and yaw directions is determined by the asymmetric appearance of the human face, and the contextual information of this asymmetric appearance is encoded in a head-pose-related neighborhood. However, the CNNs used in existing head pose estimation methods typically operate uniformly over the features of the full image, making it hard for those methods to capture the contextual information of such asymmetric appearance. To address this issue, this paper proposes a novel head pose estimation method that can perceive the asymmetric appearance of human faces. Specifically, awareness of this asymmetry is achieved through local pairwise feature interaction in the head-pose-related neighborhood via bilinear pooling. Evaluations on two public datasets demonstrate that our method achieves promising results.
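The record does not include code, but the core mechanism the abstract names, pairwise feature interaction via bilinear pooling, can be illustrated with a minimal NumPy sketch. All names, shapes, and the signed-sqrt/L2 normalization step below are illustrative assumptions, not the paper's actual implementation; standard bilinear pooling sums the outer products of per-location descriptors so that every channel pair interacts.

```python
import numpy as np

# Toy local descriptors: 16 spatial locations in a neighborhood,
# two feature maps with 8 and 6 channels (shapes are assumptions).
rng = np.random.default_rng(0)
feat_a = rng.standard_normal((16, 8))
feat_b = rng.standard_normal((16, 6))

def bilinear_pool(fa, fb):
    """Sum of per-location outer products: pairwise channel interactions."""
    B = fa.T @ fb                            # (C1, C2) interaction matrix
    v = B.reshape(-1)                        # flatten to a feature vector
    v = np.sign(v) * np.sqrt(np.abs(v))      # signed square-root (common practice)
    return v / (np.linalg.norm(v) + 1e-12)   # L2 normalization

v = bilinear_pool(feat_a, feat_b)
print(v.shape)  # (48,) = 8 * 6 pairwise interactions
```

Restricting `fa` and `fb` to descriptors from a pose-related local neighborhood, rather than the full image, is what would make the pooling "local" in the sense the abstract describes.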
Pages: 10
Related Papers
50 records total
  • [31] Soft multi-modal data fusion
    Coppock, S
    Mazack, L
    PROCEEDINGS OF THE 12TH IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS, VOLS 1 AND 2, 2003, : 636 - 641
  • [32] Methods of Multi-Modal Data Exploration
    Grosup, Tomas
    ICMR'19: PROCEEDINGS OF THE 2019 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, 2019, : 34 - 37
  • [33] Multi-modal data fusion: A description
    Coppock, S
    Mazlack, LJ
    KNOWLEDGE-BASED INTELLIGENT INFORMATION AND ENGINEERING SYSTEMS, PT 2, PROCEEDINGS, 2004, 3214 : 1136 - 1142
  • [34] Interpretable multi-modal data integration
    Osorio, Daniel
    NATURE COMPUTATIONAL SCIENCE, 2022, 2 (01): : 8 - 9
  • [36] PhoCaL: A Multi-Modal Dataset for Category-Level Object Pose Estimation with Photometrically Challenging Objects
    Wang, Pengyuan
    Jung, HyunJun
    Li, Yitong
    Shen, Siyuan
    Srikanth, Rahul Parthasarathy
    Garattoni, Lorenzo
    Meier, Sven
    Navab, Nassir
    Busam, Benjamin
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 21190 - 21199
  • [37] Fruity: A Multi-modal Dataset for Fruit Recognition and 6D-Pose Estimation in Precision Agriculture
    Abdulsalam, Mahmoud
    Chekakta, Zakaria
    Aouf, Nabil
    Hogan, Maxwell
    2023 31ST MEDITERRANEAN CONFERENCE ON CONTROL AND AUTOMATION, MED, 2023, : 144 - 149
  • [38] Multi-modal 3D Human Pose Estimation for Human-Robot Collaborative Applications
    Peppas, Konstantinos
    Tsiolis, Konstantinos
    Mariolis, Ioannis
    Topalidou-Kyniazopoulou, Angeliki
    Tzovaras, Dimitrios
    STRUCTURAL, SYNTACTIC, AND STATISTICAL PATTERN RECOGNITION, S+SSPR 2020, 2021, 12644 : 355 - 364
  • [39] Multi-modal Human pose estimation based on probability distribution perception on a depth convolution neural network
    Wang, Xunjun
    Hu, Xiaochun
    Li, Yun
    Jiang, Caoqing
    PATTERN RECOGNITION LETTERS, 2022, 153 : 36 - 43
  • [40] Towards Multi-modal Self-supervised Video and Ultrasound Pose Estimation for Laparoscopic Liver Surgery
    Montana-Brown, Nina
    Ramalhinho, Joao
    Koo, Bongjin
    Allam, Moustafa
    Davidson, Brian
    Gurusamy, Kurinchi
    Hu, Yipeng
    Clarkson, Matthew J.
    SIMPLIFYING MEDICAL ULTRASOUND, ASMUS 2022, 2022, 13565 : 183 - 192