Relation-Aware Distribution Representation Network for Person Clustering With Multiple Modalities

Cited by: 0
Authors
Liu, Kaijian [1]
Tang, Shixiang [2]
Li, Ziyue [3, 4]
Li, Zhishuai [1]
Bai, Lei [5]
Zhu, Feng [1]
Zhao, Rui [1, 6]
Affiliations
[1] SenseTime Res, Shanghai 200030, Peoples R China
[2] Univ Sydney, Sydney, NSW 2050, Australia
[3] Univ Cologne, D-50923 Cologne, Germany
[4] EWI gGmbH, D-50827 Cologne, Germany
[5] Shanghai AI Lab, Shanghai 200030, Peoples R China
[6] Shanghai Jiao Tong Univ, Qing Yuan Res Inst, Shanghai 200040, Peoples R China
Keywords
Faces; Feature extraction; Streaming media; Task analysis; Measurement; Semantics; Motion pictures; Person clustering; Multi-modality clues; Distribution learning; Multi-modal representations; AFFINITY GRAPH; MULTIVIEW; FACES; RANK;
DOI
10.1109/TMM.2023.3304454
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Person clustering with multi-modal clues, including faces, bodies, and voices, is critical for various tasks, such as movie parsing and identity-based movie editing. Related methods such as multi-view clustering mainly project multi-modal features into a joint feature space. However, multi-modal clue features are usually rather weakly correlated due to the semantic gap from the modality-specific uniqueness. As a result, these methods are not suitable for person clustering. In this article, we propose a Relation-Aware Distribution representation Network (RAD-Net) to generate a distribution representation for multi-modal clues. The distribution representation of a clue is a vector consisting of the relation between this clue and all other clues from all modalities, thus being modality agnostic and good for person clustering. Accordingly, we introduce a graph-based method to construct distribution representation and employ a cyclic update policy to refine distribution representation progressively. Our method achieves substantial improvements of +6% and +8.2% in F-score on the Video Person-Clustering Dataset (VPCD) and VoxCeleb2 multi-view clustering dataset, respectively.
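The idea of a distribution representation, a vector holding the relation between one clue and all other clues from all modalities, can be illustrated with a small sketch. This is not the RAD-Net implementation: the cosine similarity within a modality, the hypothetical `cross_links` co-occurrence pairs across modalities, the softmax normalization, the `temperature` value, and the simple repeated refinement loop are all assumptions chosen only to make the concept concrete.

```python
import numpy as np

def relation_matrix(feats_by_modality, cross_links):
    """Build an N x N relation matrix over all clues of all modalities.

    Assumption: clues of the same modality are related by cosine similarity;
    clues of different modalities are related only through `cross_links`,
    a hypothetical list of (i, j) global index pairs known to co-occur.
    """
    sizes = [f.shape[0] for f in feats_by_modality]
    offsets = np.concatenate([[0], np.cumsum(sizes)])
    n = int(offsets[-1])
    rel = np.zeros((n, n))
    for k, feats in enumerate(feats_by_modality):
        unit = feats / np.linalg.norm(feats, axis=1, keepdims=True)
        start, end = offsets[k], offsets[k + 1]
        rel[start:end, start:end] = unit @ unit.T
    for i, j in cross_links:
        rel[i, j] = rel[j, i] = 1.0
    return rel

def to_distribution(rel, temperature=0.1):
    """Row-wise softmax: each row becomes a distribution over all clues."""
    logits = rel / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)

def refine(dist, steps=2, temperature=0.1):
    """Illustrative refinement: recompute relations from the current
    distribution vectors, then renormalize. A stand-in for, not a copy of,
    the cyclic update policy mentioned in the abstract."""
    for _ in range(steps):
        unit = dist / np.linalg.norm(dist, axis=1, keepdims=True)
        dist = to_distribution(unit @ unit.T, temperature)
    return dist

# Toy data: 3 face clues (indices 0-2) and 2 voice clues (indices 3-4).
faces = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
voices = np.array([[1.0, 0.0], [0.0, 1.0]])
cross_links = [(0, 3), (2, 4)]  # hypothetical face/voice co-occurrences

dist = to_distribution(relation_matrix([faces, voices], cross_links))
refined = refine(dist)
```

Because every row of `dist` indexes the same set of clues, a face clue and a voice clue can be compared directly through their rows, which is what makes such a representation modality agnostic in this toy setting.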
Pages: 8371-8382
Page count: 12