Robust Face Frontalization For Visual Speech Recognition

Cited by: 3
Authors
Kang, Zhiqi [1 ,2 ]
Horaud, Radu [1 ,2 ]
Sadeghi, Mostafa [3 ]
Affiliations
[1] Inria, Montbonnot St Martin, France
[2] Univ Grenoble Alpes, Montbonnot St Martin, France
[3] Inria Nancy Grand Est, Nancy, France
Source
2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2021) | 2021
Keywords
CLOSED-FORM SOLUTION;
DOI
10.1109/ICCVW54120.2021.00281
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Face frontalization consists of synthesizing a frontally-viewed face from an arbitrarily-viewed one. The main contribution of this paper is a robust frontalization method that preserves non-rigid facial deformations, i.e. expressions, to perform lip reading. The method iteratively estimates the rigid transformation (scale, rotation, and translation) and the non-rigid deformation between 3D landmarks extracted from an arbitrarily-viewed face, and 3D vertices parameterized by a deformable shape model. An important merit of the method is its ability to deal with large Gaussian and non-Gaussian errors in the data. For that purpose, we use the generalized Student-t distribution. The associated EM algorithm assigns a weight to each observed landmark, the higher the weight the more important the landmark, thus favoring landmarks that are only affected by rigid head movements. We propose to use the zero-mean normalized cross-correlation (ZNCC) score to evaluate the ability to preserve facial expressions. We show that the method, when incorporated into a deep lip-reading pipeline, considerably improves the word classification score on an in-the-wild benchmark.
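The abstract names the zero-mean normalized cross-correlation (ZNCC) score as its measure of expression preservation. As a minimal sketch of the standard ZNCC definition only, not the authors' evaluation code, the function below compares two equally sized arrays, for example a frontalized face crop and a ground-truth frontal crop. The function name `zncc`, the `eps` guard, and the convention of returning 0.0 for a constant input are assumptions made for illustration.

```python
import numpy as np


def zncc(a, b, eps=1e-12):
    """Zero-mean normalized cross-correlation of two same-shaped arrays.

    Returns a value in [-1, 1]; 1 means the inputs are identical up to
    a positive gain and an offset. `eps` (an assumed guard) avoids a
    division by zero when either input is constant.
    """
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    if a.shape != b.shape:
        raise ValueError("inputs must have the same number of elements")

    # Remove the mean so that a brightness offset does not affect the score.
    a0 = a - a.mean()
    b0 = b - b.mean()

    # Normalize by the product of the norms so that gain does not matter.
    denom = np.linalg.norm(a0) * np.linalg.norm(b0)
    if denom < eps:
        return 0.0  # assumed convention for constant (textureless) input
    return float(np.dot(a0, b0) / denom)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    patch = rng.random((32, 32))
    # A gain and offset change leaves the score at (numerically) 1.
    print(zncc(patch, 2.0 * patch + 5.0))
```

Because the score is invariant to gain and offset, it responds to differences in spatial structure rather than to overall brightness or contrast, which is the usual reason for choosing ZNCC when comparing image content.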
Pages: 2485 - 2495
Page count: 11
Related papers
50 items in total
  • [21] Combining two visual cortex models for robust face recognition
    Esmaili, Somayeh Saraf
    Maghooli, Keivan
    Nasrabadi, Ali Motie
    OPTIK, 2015, 126 (21): : 2818 - 2824
  • [22] Audio-Visual Efficient Conformer for Robust Speech Recognition
    Burchi, Maxime
    Timofte, Radu
    2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2023, : 2257 - 2266
  • [23] Expression-Preserving Face Frontalization Improves Visually Assisted Speech Processing
    Kang, Zhiqi
    Sadeghi, Mostafa
    Horaud, Radu
    Alameda-Pineda, Xavier
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2023, 131 (05) : 1122 - 1140
  • [24] Expression-Preserving Face Frontalization Improves Visually Assisted Speech Processing
    Zhiqi Kang
    Mostafa Sadeghi
    Radu Horaud
    Xavier Alameda-Pineda
    International Journal of Computer Vision, 2023, 131 : 1122 - 1140
  • [25] Research on Robust Audio-Visual Speech Recognition Algorithms
    Yang, Wenfeng
    Li, Pengyi
    Yang, Wei
    Liu, Yuxing
    He, Yulong
    Petrosian, Ovanes
    Davydenko, Aleksandr
    MATHEMATICS, 2023, 11 (07)
  • [26] Audio-visual fuzzy fusion for robust speech recognition
    Malcangi, M.
    Ouazzane, K.
    Patel, P.
    2013 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2013,
  • [27] A robust speech disorders correction system for Arabic language using visual speech recognition
    Farag, Ahmed
    El Adawy, Mohamed
    Ismail, Ahmed
    BIOMEDICAL RESEARCH-INDIA, 2013, 24 (02): : 185 - 192
  • [28] MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation
    Anwar, Mohamed
    Shi, Bowen
    Goswami, Vedanuj
    Hsu, Wei-Ning
    Pino, Juan
    Wang, Changhan
    INTERSPEECH 2023, 2023, : 4064 - 4068
  • [29] OCCLUSION ANALYSIS FOR FACE FRONTALIZATION
    Celik, Anil
    Arica, Nafiz
    2016 4TH INTERNATIONAL SYMPOSIUM ON DIGITAL FORENSIC AND SECURITY (ISDFS), 2016, : 95 - 100
  • [30] Robust audio-visual speech recognition based on late integration
    Lee, Jong-Seok
    Park, Cheol Hoon
    IEEE TRANSACTIONS ON MULTIMEDIA, 2008, 10 (05) : 767 - 779