Text driven face-video synthesis using GMM and spatial correlation

Cited by: 0

Authors:

Teferi, Dereje [1]

Faraj, Maycel L. [1]

Bigun, Josef [1]

Affiliation:

[1] Halmstad Univ, Sch Informat Sci Comp & Elect Engn IDE, POB 823, SE-30118 Halmstad, Sweden

Source:

IMAGE ANALYSIS, PROCEEDINGS | 2007 / Vol. 4522

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Liveness detection is increasingly planned for incorporation into biometric systems to reduce the risk of spoofing and impersonation. Techniques in use include detection of head motion while posing or speaking, iris size under varying illumination, fingerprint sweat, text-prompted speech, and speech-to-lip motion synchronization. In this paper, we propose to build a biometric signal for testing the attack resilience of biometric systems by creating a text-driven video synthesis of faces. We synthesize new, realistic-looking video sequences from real image sequences representing the utterance of digits. We determine the image sequences for each digit by using a GMM-based speech recognizer. Then, depending on the system prompt (a sequence of digits), our method regenerates a video signal to test the attack resilience of a biometric system that asks for random digit utterances to prevent playback of pre-recorded data representing both audio and images. The discontinuities in the new image sequence, created at the connection of each digit, are removed by a frame prediction algorithm that makes use of the well-known block matching algorithm. Other uses of our results include web-based video communication for electronic commerce and frame interpolation for low-frame-rate video.
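The abstract names block matching as the basis of its frame prediction step but gives no details in this record. As a rough illustration only, the following is a minimal sketch of generic exhaustive-search block matching between two grayscale frames. The sum-of-absolute-differences cost, the block size `n`, the search `radius`, and the helper names `sad`, `match_block`, and `frame` are all assumptions made for this example and are not taken from the paper.

```python
def sad(a, b, ay, ax, by, bx, n):
    """Sum of absolute differences between two n-by-n blocks."""
    return sum(
        abs(a[ay + i][ax + j] - b[by + i][bx + j])
        for i in range(n)
        for j in range(n)
    )


def match_block(prev, curr, y, x, n=4, radius=2):
    """Find the displacement (dy, dx) into `prev` that best matches the
    n-by-n block at (y, x) in `curr`, by exhaustive search (assumed cost: SAD)."""
    h, w = len(prev), len(prev[0])
    best = (0, 0)
    best_cost = sad(curr, prev, y, x, y, x, n)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            py, px = y + dy, x + dx
            # Skip candidate blocks that fall outside the previous frame.
            if 0 <= py <= h - n and 0 <= px <= w - n:
                cost = sad(curr, prev, y, x, py, px, n)
                if cost < best_cost:
                    best, best_cost = (dy, dx), cost
    return best, best_cost


def frame(top, left, size=8, n=4):
    """Toy frame: a bright n-by-n square on a dark background."""
    return [
        [255 if top <= r < top + n and left <= c < left + n else 0
         for c in range(size)]
        for r in range(size)
    ]


# The square moves one pixel down and one pixel right between frames.
prev_frame = frame(1, 1)
curr_frame = frame(2, 2)
motion, cost = match_block(prev_frame, curr_frame, y=2, x=2)
```

In this toy case `motion` points from the block in the current frame back to its position in the previous frame. A frame predictor could, under these assumptions, use such per-block displacements to synthesize intermediate frames at the junction between two digit sequences.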


Pages: 572 / +

Number of pages: 3

