3D Lip Event Detection via Interframe Motion Divergence at Multiple Temporal Resolutions

Cited by: 1

Authors:

Zhang, Jie [1]

Fisher, Robert B. [2]

Affiliations:

[1] Beijing Technol & Business Univ, Sch Artificial Intelligence, Beijing, Peoples R China

[2] Univ Edinburgh, Sch Informat, Edinburgh, Midlothian, Scotland

Source:

2021 INTERNATIONAL CONFERENCE ON 3D VISION (3DV 2021) | 2021

Keywords:

VOICE ACTIVITY DETECTION;

DOI:

10.1109/3DV53792.2021.00052

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

The lip is a dominant dynamic facial unit when a person is speaking. Detecting lip events is beneficial to speech analysis and support for the hearing impaired. This paper proposes a 3D lip event detection pipeline that automatically determines the lip events from a 3D speaking lip sequence. We define a motion divergence measure using 3D lip landmarks to quantify the interframe dynamics of a 3D speaking lip. Then, we cast the interframe motion detection in a multi-temporal-resolution framework that allows the detection to be applicable to different speaking speeds. The experiments on the S3DFM Dataset investigate the overall 3D lip dynamics based on the proposed motion divergence. The proposed 3D pipeline is able to detect opening and closing lip events across 100 sequences, achieving state-of-the-art performance.


Pages: 423-431

Page count: 9

