3D Lip Event Detection via Interframe Motion Divergence at Multiple Temporal Resolutions

Cited by: 1
Authors
Zhang, Jie [1]
Fisher, Robert B. [2]
Affiliations
[1] Beijing Technol & Business Univ, Sch Artificial Intelligence, Beijing, Peoples R China
[2] Univ Edinburgh, Sch Informat, Edinburgh, Midlothian, Scotland
Keywords
VOICE ACTIVITY DETECTION
DOI
10.1109/3DV53792.2021.00052
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The lip is a dominant dynamic facial unit when a person is speaking. Detecting lip events is beneficial to speech analysis and to support for the hearing impaired. This paper proposes a 3D lip event detection pipeline that automatically determines the lip events from a 3D speaking lip sequence. We define a motion divergence measure using 3D lip landmarks to quantify the interframe dynamics of a 3D speaking lip. Then, we cast the interframe motion detection in a multi-temporal-resolution framework that makes the detection applicable to different speaking speeds. The experiments on the S3DFM Dataset investigate the overall 3D lip dynamics based on the proposed motion divergence. The proposed 3D pipeline is able to detect opening and closing lip events across 100 sequences, achieving state-of-the-art performance.
Pages: 423-431
Number of pages: 9