Social Context-aware Person Search in Videos via Multi-modal Cues

Cited by: 5
Authors
Li, Dan [1 ]
Xu, Tong [1 ]
Zhou, Peilun [1 ]
He, Weidong [1 ]
Hao, Yanbin [2 ]
Zheng, Yi [3 ]
Chen, Enhong [1 ]
Affiliations
[1] Univ Sci & Technol China, Hefei, Anhui, Peoples R China
[2] City Univ Hong Kong, Hong Kong, Peoples R China
[3] Huawei Technol, Hangzhou, Zhejiang, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Person search; graph modeling; user profile; label propagation; social relation; neural network;
DOI
10.1145/3480967
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
Person search has long been treated as a crucial and challenging task that supports deeper insight into personalized summarization and personality discovery. Traditional methods, e.g., person re-identification and face recognition techniques, profile video characters based on visual information alone; they are often limited to relatively fixed poses or small viewpoint variation and degrade in more realistic scenes with high motion complexity (e.g., movies). At the same time, long videos such as movies often have logical story lines composed of continuously developing plots. In such videos, different persons usually meet on specific occasions, where informative social cues are exhibited. We observe that these social cues can semantically profile personality and benefit the person search task in two aspects. First, persons with certain relationships usually co-occur within short intervals; when one of them is easier to identify, the social relation cues extracted from their co-occurrences can further support identification of the harder ones. Second, social relations can reveal the association between certain scenes and characters (e.g., a classmate relationship may only exist among students), which narrows the candidates down to persons holding a specific relationship. In this way, high-level social relation cues can improve the effectiveness of person search. Along this line, in this article, we propose a social context-aware framework that fuses visual and social contexts to profile persons from a more semantic perspective and to better handle the person search task in complex scenarios. Specifically, we first segment videos into several independent scene units and abstract the social contexts within these scene units. Then, we construct inter-personal links through a graph formulation for each scene unit, in which both visual cues and relation cues are considered. Finally, we perform relation-aware label propagation to identify characters' occurrences, combining low-level semantic cues (i.e., visual cues) and high-level semantic cues (i.e., relation cues) to further enhance accuracy. Experiments on real-world datasets validate that our solution outperforms several competitive baselines.
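The pipeline the abstract outlines (per-scene graph construction over character occurrences, then relation-aware label propagation from easily identified occurrences to harder ones) can be illustrated with a minimal sketch. The sketch below is not the authors' implementation: it assumes standard normalized label propagation (in the style of Zhou et al.) over a single fused affinity matrix, and the names `propagate_labels`, `visual_sim`, `relation_sim`, and the convex fusion weight `beta` are hypothetical stand-ins for the paper's richer per-scene-unit graph formulation.

```python
# Minimal sketch of relation-aware label propagation (hypothetical,
# not the paper's actual method). Nodes are character occurrences
# (e.g., face/body tracklets) within one scene unit; a few nodes
# carry seed identity labels from easy identifications.
import numpy as np

def propagate_labels(visual_sim, relation_sim, seed_labels, n_ids,
                     beta=0.5, alpha=0.9, iters=50):
    """Fuse low-level visual cues and high-level relation cues into
    one affinity matrix, then propagate seed identity labels.

    visual_sim, relation_sim : (n, n) symmetric similarity matrices
    seed_labels : dict {node_index: identity_index} for known nodes
    n_ids : number of candidate identities
    beta : assumed fusion weight between visual and relation cues
    alpha : propagation strength (standard label-propagation setting)
    """
    W = beta * visual_sim + (1.0 - beta) * relation_sim
    np.fill_diagonal(W, 0.0)
    d = W.sum(axis=1)
    d[d == 0] = 1.0                      # guard isolated nodes
    S = W / np.sqrt(np.outer(d, d))      # symmetric normalization D^-1/2 W D^-1/2

    n = W.shape[0]
    Y0 = np.zeros((n, n_ids))
    for node, ident in seed_labels.items():
        Y0[node, ident] = 1.0            # one-hot seeds

    Y = Y0.copy()
    for _ in range(iters):
        Y = alpha * S @ Y + (1.0 - alpha) * Y0
    return Y.argmax(axis=1)              # predicted identity per node
```

The convex combination `beta * visual_sim + (1 - beta) * relation_sim` is only one plausible way to realize the abstract's fusion of visual and relation cues; the point of the sketch is that seed labels on easily identified occurrences spread along both visual-similarity and social-relation edges to resolve the harder ones.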
Pages: 25