Audio-visual talking face detection

Authors
Li, MK [1]
Li, DG [1]
Dimitrova, N [1]
Sethi, I [1]
Affiliations
[1] Oakland Univ, Intelligent Informat Eng Lab, Rochester, MI 48309 USA
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Talking face detection is important for videoconferencing. However, detecting the talking face is difficult because of the low resolution of the capture devices, the informal style of communication, and background sounds. In this paper, we present a novel method for finding the talking face using a latent semantic indexing (LSI) approach. We tested our method on a comprehensive set of home videoconferencing sessions and obtained a very high detection rate. Our experiments show that the accuracy of the LSI method degrades gracefully in a noisy environment, whereas the correlation method simply fails in the presence of noise.
Pages: 473-476
Page count: 4