Robust Face Frontalization For Visual Speech Recognition

Cited by: 3

Authors:

Kang, Zhiqi [1,2]

Horaud, Radu [1,2]

Sadeghi, Mostafa [3]

Affiliations:

[1] Inria, Montbonnot St Martin, France

[2] Univ Grenoble Alpes, Montbonnot St Martin, France

[3] Inria Nancy Grand Est, Nancy, France

Source:

2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2021) | 2021

Keywords:

CLOSED-FORM SOLUTION;

DOI:

10.1109/ICCVW54120.2021.00281

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Face frontalization consists of synthesizing a frontally-viewed face from an arbitrarily-viewed one. The main contribution of this paper is a robust frontalization method that preserves non-rigid facial deformations, i.e. expressions, to perform lip reading. The method iteratively estimates the rigid transformation (scale, rotation, and translation) and the non-rigid deformation between 3D landmarks extracted from an arbitrarily-viewed face, and 3D vertices parameterized by a deformable shape model. An important merit of the method is its ability to deal with large Gaussian and non-Gaussian errors in the data. For that purpose, we use the generalized Student-t distribution. The associated EM algorithm assigns a weight to each observed landmark, the higher the weight the more important the landmark, thus favoring landmarks that are only affected by rigid head movements. We propose to use the zero-mean normalized cross-correlation (ZNCC) score to evaluate the ability to preserve facial expressions. We show that the method, when incorporated into a deep lip-reading pipeline, considerably improves the word classification score on an in-the-wild benchmark.
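The ZNCC score mentioned in the abstract can be illustrated with a minimal sketch. This uses the standard definition of zero-mean normalized cross-correlation on two equal-length sequences of values. The abstract does not say which image regions or features the authors compare, so treating the inputs as flattened pixel intensities is an assumption, as are the function name `zncc` and the convention of returning 0.0 for constant inputs.

```python
import math


def zncc(a, b):
    """Zero-mean normalized cross-correlation of two equal-length sequences.

    Returns a value in [-1, 1]; 1 means the inputs match up to a positive
    gain and an offset. The name and input layout are illustrative only.
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    # Subtract the mean of each input (the "zero-mean" part).
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]

    # Correlation of the centered signals, normalized by their energies.
    num = sum(x * y for x, y in zip(da, db))
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))

    if den == 0.0:
        # Assumption: a constant input carries no pattern, so score it 0.
        return 0.0
    return num / den


# Hypothetical flattened intensity patches, for illustration only.
patch = [10.0, 20.0, 40.0, 80.0]
brighter = [2.0 * v + 5.0 for v in patch]  # same pattern, new gain/offset
score = zncc(patch, brighter)
```

Because both the mean and the scale are normalized out, the score is unaffected by a change of brightness or contrast between the two inputs, which is why it suits comparisons where only the spatial pattern matters.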


Pages: 2485-2495

Number of pages: 11


