Live Speech Driven Head-and-Eye Motion Generators

Cited by: 50
Authors
Le, Binh H. [1 ]
Ma, Xiaohan [1 ]
Deng, Zhigang
Affiliation
[1] Univ Houston, Dept Comp Sci, Comp Graph Lab, Houston, TX 77204 USA
Funding
U.S. National Science Foundation;
Keywords
Facial animation; head and eye motion coupling; head motion synthesis; gaze synthesis; blinking model; live speech driven; ANIMATION; CAPTURE; MODEL; GAZE; PATTERNS; PROSODY; FACES;
DOI
10.1109/TVCG.2012.74
Chinese Library Classification (CLC)
TP31 [Computer software];
Discipline codes
081202; 0835;
Abstract
This paper describes a fully automated framework that simultaneously generates realistic head motion, eye gaze, and eyelid motion from live (or recorded) speech input. Its central idea is to learn separate yet interrelated statistical models for each component (head motion, gaze, or eyelid motion) from a prerecorded facial motion data set: 1) Gaussian mixture models and a gradient descent optimization algorithm are employed to generate head motion from speech features; 2) a nonlinear dynamic canonical correlation analysis model is used to synthesize eye gaze from head motion and speech features; and 3) nonnegative linear regression is used to model voluntary eyelid motion, while a log-normal distribution describes involuntary eye blinks. Several user studies, based on the well-established paired comparison methodology, evaluate the effectiveness of the proposed speech-driven head and eye motion generator. The evaluation results clearly show that this approach significantly outperforms state-of-the-art head and eye motion generation algorithms. In addition, a novel mocap+video hybrid data acquisition technique is introduced to record high-fidelity head movement, eye gaze, and eyelid motion simultaneously.
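The abstract's blink component pairs nonnegative linear regression for voluntary eyelid motion with a log-normal distribution for involuntary blink timing. As a minimal illustrative sketch (the parameter values and function names below are placeholders of my own, not values or code from the paper), involuntary blink onsets can be generated by sampling inter-blink intervals from a log-normal distribution:

```python
import random

def sample_blink_intervals(n, mu=0.52, sigma=0.4, seed=42):
    """Sample n inter-blink intervals (in seconds) from a log-normal
    distribution. mu and sigma parameterize the underlying normal
    distribution of log-intervals; the defaults here are illustrative
    placeholders, not fitted values from the paper."""
    rng = random.Random(seed)
    return [rng.lognormvariate(mu, sigma) for _ in range(n)]

def blink_onsets(duration_s, **kwargs):
    """Accumulate sampled intervals into blink onset times that fall
    within a clip of the given duration."""
    onsets, t = [], 0.0
    for dt in sample_blink_intervals(1000, **kwargs):
        t += dt
        if t >= duration_s:
            break
        onsets.append(t)
    return onsets

# Demo: blink onset times for a 10-second clip.
print([round(t, 2) for t in blink_onsets(10.0)])
```

A log-normal model is a natural fit here because inter-blink intervals are strictly positive and typically right-skewed; in the full pipeline such involuntary onsets would be combined with the voluntary, speech-correlated eyelid motion.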
Pages: 1902-1914 (13 pages)