Live Speech Driven Head-and-Eye Motion Generators

Cited by: 50
Authors
Le, Binh H. [1 ]
Ma, Xiaohan [1 ]
Deng, Zhigang
Affiliations
[1] Univ Houston, Dept Comp Sci, Comp Graph Lab, Houston, TX 77204 USA
Funding
US National Science Foundation (NSF);
Keywords
Facial animation; head and eye motion coupling; head motion synthesis; gaze synthesis; blinking model; live speech driven; ANIMATION; CAPTURE; MODEL; GAZE; PATTERNS; PROSODY; FACES;
DOI
10.1109/TVCG.2012.74
Chinese Library Classification (CLC)
TP31 [Computer Software];
Discipline codes
081202; 0835;
Abstract
This paper describes a fully automated framework that simultaneously generates realistic head motion, eye gaze, and eyelid motion from live (or recorded) speech input. Its central idea is to learn separate yet interrelated statistical models for each component (head motion, gaze, or eyelid motion) from a prerecorded facial motion data set: 1) Gaussian Mixture Models and a gradient descent optimization algorithm are employed to generate head motion from speech features; 2) a Nonlinear Dynamic Canonical Correlation Analysis model is used to synthesize eye gaze from head motion and speech features; and 3) nonnegative linear regression is used to model voluntary eyelid motion, while a log-normal distribution describes involuntary eye blinks. Several user studies, based on the well-established paired comparison methodology, are conducted to evaluate the effectiveness of the proposed speech-driven head and eye motion generator. The evaluation results clearly show that this approach significantly outperforms state-of-the-art head and eye motion generation algorithms. In addition, a novel mocap+video hybrid data acquisition technique is introduced to record high-fidelity head movement, eye gaze, and eyelid motion simultaneously.
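The abstract outlines a three-stage statistical pipeline. The short Python sketch below is not the authors' code; it illustrates two of the ingredients under clearly hypothetical assumptions: closed-form GMM regression as a stand-in for the paper's GMM-plus-gradient-descent speech-to-head-motion mapping, and a log-normal sampler for involuntary blink timing. All feature dimensions, mixture sizes, and the mu/sigma values are illustrative, not taken from the paper.

import numpy as np
from scipy.stats import multivariate_normal
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

def sample_blink_onsets(duration_s, mu=1.0, sigma=0.5):
    """Involuntary blinks: accumulate log-normal inter-blink gaps into
    onset times (seconds). mu/sigma are hypothetical, not from the paper."""
    onsets, t = [], 0.0
    while True:
        t += rng.lognormal(mean=mu, sigma=sigma)  # next inter-blink gap
        if t >= duration_s:
            return np.asarray(onsets)
        onsets.append(t)

def fit_joint_gmm(speech, head, n_components=8):
    """Fit one GMM over stacked [speech | head] per-frame feature vectors."""
    joint = np.hstack([speech, head])
    return GaussianMixture(n_components=n_components,
                           covariance_type="full", random_state=0).fit(joint)

def predict_head(gmm, x, dim_x):
    """E[head | speech = x]: standard GMM regression, used here as a
    closed-form proxy for the paper's gradient-descent optimization."""
    ws, ys = [], []
    for w, mu, cov in zip(gmm.weights_, gmm.means_, gmm.covariances_):
        mu_x, mu_y = mu[:dim_x], mu[dim_x:]
        Sxx, Syx = cov[:dim_x, :dim_x], cov[dim_x:, :dim_x]
        ws.append(w * multivariate_normal.pdf(x, mean=mu_x, cov=Sxx))
        ys.append(mu_y + Syx @ np.linalg.solve(Sxx, x - mu_x))
    ws = np.asarray(ws)
    ws /= ws.sum()                      # component responsibilities for x
    return (ws[:, None] * np.asarray(ys)).sum(axis=0)

# Toy demo with random stand-in features: 5-D "speech" frames, 3-D head pose.
X = rng.normal(size=(500, 5))
Y = X @ rng.normal(size=(5, 3)) + 0.1 * rng.normal(size=(500, 3))
gmm = fit_joint_gmm(X, Y, n_components=4)
print(predict_head(gmm, X[0], dim_x=5))  # predicted head pose, frame 0
print(sample_blink_onsets(10.0))         # blink onsets in a 10 s clip

The paper's actual mapping is optimized by gradient descent rather than this closed form, presumably so extra smoothness or dynamics terms can be folded into the objective; the sketch only conveys the shape of the GMM-based speech-to-motion idea.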
Pages: 1902-1914
Page count: 13
Related papers
50 records in total
  • [1] Context-Aware Head-and-Eye Motion Generation with Diffusion Model
    Shen, Yuxin
    Xu, Manjie
    Liang, Wei
    2024 IEEE CONFERENCE ON VIRTUAL REALITY AND 3D USER INTERFACES, VR 2024, 2024, : 157 - 167
  • [2] Speech-driven head motion generation from waveforms
    Lu, JinHong
    Shimodaira, Hiroshi
    SPEECH COMMUNICATION, 2024, 159
  • [3] Head Motion Generation with Synthetic Speech: a Data Driven Approach
    Sadoughi, Najmeh
    Busso, Carlos
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 52 - 56
  • [4] Articulatory features for speech-driven head motion synthesis
    Ben-Youssef, Atef
    Shimodaira, Hiroshi
    Braude, David A.
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 2757 - 2761
  • [5] BLSTM Neural Networks for Speech Driven Head Motion Synthesis
    Ding, Chuang
    Zhu, Pengcheng
    Xie, Lei
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 3345 - 3349
  • [6] Head motion generation for speech-driven talking avatar
    Xie, Lei
    TSINGHUA UNIVERSITY, 53
  • [7] Template-Warping Based Speech Driven Head Motion Synthesis
    Braude, David Adam
    Shimodaira, Hiroshi
    Ben Youssef, Atef
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 2762 - 2766
  • [8] S3: Speech, Script and Scene driven Head and Eye Animation
    Pan, Yifang
    Agrawal, Rishabh
    Singh, Karan
    ACM TRANSACTIONS ON GRAPHICS, 2024, 43 (04)
  • [9] Experimental Study on the Imitation of the Human Head-and-Eye Pose Using the 3-DOF Agile Eye Parallel Robot with ROS and Mediapipe Framework
    Radmehr, Amirmohammad
    Asgari, Milad
    Masouleh, Mehdi Tale
    2021 9TH RSI INTERNATIONAL CONFERENCE ON ROBOTICS AND MECHATRONICS (ICROM), 2021, : 472 - 478
  • [10] Analysis of head motions and speech, and head motion control in an android
    Ishi, Carlos T.
    Haas, Judith
    Wilbers, Freerk P.
    Ishiguro, Hiroshi
    Hagita, Norihiro
    2007 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS, VOLS 1-9, 2007, : 554 - +