CLIP-Cluster: CLIP-Guided Attribute Hallucination for Face Clustering

Times Cited: 4
Authors
Shen, Shuai [1 ,2 ]
Li, Wanhua [1 ,2 ]
Wang, Xiaobing [3 ]
Zhang, Dafeng [3 ]
Jin, Zhezhu [3 ]
Zhou, Jie [1 ,2 ]
Lu, Jiwen [1 ,2 ]
Affiliations
[1] Tsinghua Univ, Dept Automat, Beijing, Peoples R China
[2] Beijing Natl Res Ctr Informat Sci & Technol, Beijing, Peoples R China
[3] Samsung Res China Beijing, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
DOI
10.1109/ICCV51070.2023.01900
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
One of the most important yet rarely studied challenges for supervised face clustering is the large intra-class variance caused by differing face attributes such as age, pose, and expression. Images of the same identity but with different face attributes tend to be clustered into separate sub-clusters. To address this issue, we propose, for the first time, an attribute hallucination framework named CLIP-Cluster, which first hallucinates multiple representations for different attributes with the powerful CLIP model and then pools them by learning neighbor-adaptive attention. Specifically, CLIP-Cluster first introduces a text-driven attribute hallucination module, which uses natural language as the interface to hallucinate novel attributes for a given face image based on the well-aligned image-language CLIP space. Furthermore, we develop a neighbor-aware proxy generator that fuses the features describing various attributes into a proxy feature, building a bridge among different sub-clusters and reducing the intra-class variance. The proxy feature is generated by adaptively attending to the hallucinated visual features and the source feature based on local neighbor information. On this basis, a graph built from the proxy representations is used for subsequent clustering. Extensive experiments show that our approach outperforms state-of-the-art face clustering methods with high inference efficiency.
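The proxy-generation step described above can be illustrated with a minimal NumPy sketch. This is not the paper's actual architecture: the learned neighbor-adaptive attention is replaced here, as a stated simplification, by a softmax over each candidate feature's mean cosine similarity to its local neighbors, and all shapes and the `proxy_feature` name are hypothetical.

```python
import numpy as np


def proxy_feature(source, hallucinated, neighbors):
    """Pool a source feature and its hallucinated attribute variants
    into a single proxy feature.

    Simplified stand-in for the paper's neighbor-aware proxy generator:
    each candidate (source + hallucinated) is scored by its mean cosine
    similarity to the local neighbor features, and the candidates are
    softmax-weighted and summed.

    source:       (d,)   original face embedding
    hallucinated: (K, d) attribute-hallucinated embeddings
    neighbors:    (N, d) embeddings of local neighbors
    """
    # Stack source and hallucinated features into (K+1, d) candidates.
    candidates = np.vstack([source[None, :], hallucinated])
    candidates = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    nb = neighbors / np.linalg.norm(neighbors, axis=1, keepdims=True)

    # Score each candidate by mean cosine similarity to the neighborhood.
    logits = (candidates @ nb.T).mean(axis=1)          # (K+1,)

    # Softmax over candidates (numerically stabilized).
    w = np.exp(logits - logits.max())
    w = w / w.sum()

    # Attention-weighted pooling, then re-normalize the proxy.
    proxy = w @ candidates                             # (d,)
    return proxy / np.linalg.norm(proxy)
```

In the paper the attention weights are learned rather than hand-crafted, but the sketch captures the key idea: candidates that agree with the local neighborhood dominate the proxy, pulling sub-clusters of the same identity toward a shared representation before the clustering graph is built.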
Pages: 20729-20738
Page count: 10
Related Papers
24 records in total
  • [1] CLIP-guided continual novel class discovery
    Yan, Qingsen
    Yang, Yiting
    Dai, Yutong
    Zhang, Xing
    Wiltos, Katarzyna
    Wozniak, Marcin
    Dong, Wei
    Zhang, Yanning
    KNOWLEDGE-BASED SYSTEMS, 2025, 310
  • [2] Image-Based CLIP-Guided Essence Transfer
    Chefer, Hila
    Benaim, Sagie
    Paiss, Roni
    Wolf, Lior
    COMPUTER VISION, ECCV 2022, PT XIII, 2022, 13673 : 695 - 711
  • [3] Multimodal Fake News Detection via CLIP-Guided Learning
    Zhou, Yangming
    Yang, Yuzhou
    Ying, Qichao
    Qian, Zhenxing
    Zhang, Xinpeng
    2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023, : 2825 - 2830
  • [4] StyleGAN-based CLIP-guided Image Shape Manipulation
    Qian, Yuchen
    Yamamoto, Kohei
    Yanai, Keiji
    19TH INTERNATIONAL CONFERENCE ON CONTENT-BASED MULTIMEDIA INDEXING, CBMI 2022, 2022, : 162 - 166
  • [5] CLIP-Guided Federated Learning on Heterogeneous and Long-Tailed Data
    Shi, Jiangming
    Zheng, Shanshan
    Yin, Xiangbo
    Lu, Yang
    Xie, Yuan
    Qu, Yanyun
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 13, 2024, : 14955 - 14963
  • [6] CLIP-guided black-box domain adaptation of image classification
    Tian, Liang
    Ye, Mao
    Zhou, Lihua
    He, Qichen
    SIGNAL IMAGE AND VIDEO PROCESSING, 2024, 18 (05) : 4637 - 4646
  • [7] StyleGAN-NADA: CLIP-Guided Domain Adaptation of Image Generators
    Gal, Rinon
    Patashnik, Or
    Maron, Haggai
    Bermano, Amit H.
    Chechik, Gal
    Cohen-Or, Daniel
    ACM TRANSACTIONS ON GRAPHICS, 2022, 41 (04):
  • [8] RAVE: Residual Vector Embedding for CLIP-Guided Backlit Image Enhancement
    Gaintseva, Tatiana
    Benning, Martin
    Slabaugh, Gregory
    COMPUTER VISION - ECCV 2024, PT LXXIX, 2025, 15137 : 412 - 428
  • [9] CLIP-guided Prototype Modulating for Few-shot Action Recognition
    Wang, Xiang
    Zhang, Shiwei
    Cen, Jun
    Gao, Changxin
    Zhang, Yingya
    Zhao, Deli
    Sang, Nong
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2024, 132 (06) : 1899 - 1912
  • [10] CgT-GAN: CLIP-guided Text GAN for Image Captioning
    Yu, Jiarui
    Li, Haoran
    Hao, Yanbin
    Zhu, Bin
    Xu, Tong
    He, Xiangnan
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 2252 - 2263