Enabling Translatability of Generative Face Video Coding: A Unified Face Feature Transcoding Framework

Cited by: 2

Authors:

Yin, Shanzhi [1]

Chen, Bolin [1]

Wang, Shiqi [1]

Ye, Yan [2]

Affiliations:

[1] City Univ Hong Kong, Hong Kong, Peoples R China

[2] Alibaba Grp, Hangzhou, Peoples R China

Source:

2024 Data Compression Conference (DCC) | 2024

Funding:

National Natural Science Foundation of China;

Keywords:

DOI:

10.1109/DCC58796.2024.00019

Chinese Library Classification number:

TP31 [Computer Software];

Discipline classification code:

081202; 0835;

Abstract:

Generative face video coding (GFVC) can achieve high-quality visual face communication at ultra-low bit-rate ranges via strong facial prior learning and realistic generation. However, different kinds of feature representations hinder the interoperability of GFVC, as the bitstream generated from one type of feature representation can only be correctly understood by the corresponding decoder. In this paper, we make the first attempt at a face feature transcoding framework that enables translatability in GFVC. By integrating a face feature transcoder at the decoder side, received face features can be translated to decoder-specific ones for subsequent face reconstruction. Furthermore, the translation between different types of face features can be achieved using a unified transcoding framework, facilitating seamless interoperability between different facial representations and their associated decoders. Experimental results demonstrate that three mainstream GFVC codecs, each utilizing different face features, can be effectively adapted to one another while retaining promising coding performance, largely extending the generality of the GFVC system. The project page can be found at https://github.com/xyzysz/GFVC_Software-Decoder_Interoperability.

Citation

Pages: 113-122

Page count: 10
