3D Lip Event Detection via Interframe Motion Divergence at Multiple Temporal Resolutions

Cited by: 1

Authors:

Zhang, Jie [1]

Fisher, Robert B. [2]

Affiliations:

[1] Beijing Technol & Business Univ, Sch Artificial Intelligence, Beijing, Peoples R China

[2] Univ Edinburgh, Sch Informat, Edinburgh, Midlothian, Scotland

Source:

2021 INTERNATIONAL CONFERENCE ON 3D VISION (3DV 2021) | 2021

Keywords:

VOICE ACTIVITY DETECTION;

DOI:

10.1109/3DV53792.2021.00052

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The lip is a dominant dynamic facial unit when a person is speaking. Detecting lip events benefits speech analysis and assistive support for the hearing impaired. This paper proposes a 3D lip event detection pipeline that automatically determines the lip events from a 3D speaking lip sequence. We define a motion divergence measure using 3D lip landmarks to quantify the interframe dynamics of a 3D speaking lip. Then, we cast the interframe motion detection in a multi-temporal-resolution framework so that the detection applies to different speaking speeds. The experiments on the S3DFM Dataset investigate the overall 3D lip dynamics based on the proposed motion divergence. The proposed 3D pipeline is able to detect opening and closing lip events across 100 sequences, achieving state-of-the-art performance.

Pages: 423-431

Page count: 9
