Video Visual Relation Detection With Contextual Knowledge Embedding

Authors
Cao, Qianwen [1 ,2 ]
Huang, Heyan [1 ,2 ]
Affiliations
[1] Beijing Inst Technol, Sch Comp, Beijing 100081, Peoples R China
[2] Beijing Engn Res Ctr High Volume Language Informat, Beijing 100081, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Knowledge engineering; Computer vision; knowledge embedding; video understanding; video visual relation detection; visual relation tagging;
DOI
10.1109/TKDE.2023.3270328
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Video visual relation detection (VidVRD) aims at abstracting structured relations in the form of <subject-predicate-object> from videos. The triple formation makes the search space extremely large and the distribution unbalanced. Usually, existing works predict the relationships from visual, spatial, and semantic cues. Among them, semantic cues are responsible for exploring the semantic connections between objects, which is crucial for transferring knowledge across relations. However, most of these works extract semantic cues by simply mapping the object labels to classified features, which ignores the contextual surroundings and results in poor performance on low-frequency relations. To alleviate these issues, we propose a novel network, termed Contextual Knowledge Embedded Relation Network (CKERN), to facilitate VidVRD by establishing contextual knowledge embeddings for detected object pairs in relations from two aspects: commonsense attributes and prior linguistic dependencies. Specifically, we take the pair as a query to extract relational facts from the commonsense knowledge base, then encode them to explicitly construct semantic surroundings for relations. In addition, the statistics of object pairs with different predicates, distilled from large-scale visual relations, are taken into account to represent the linguistic regularity of relations. Extensive experimental results on benchmark datasets demonstrate the effectiveness and robustness of our proposed model.
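The two knowledge sources named in the abstract can be illustrated with a toy sketch. This is not the CKERN implementation: the in-memory fact list stands in for a commonsense knowledge base, the add-alpha smoothed frequency stands in for the "statistics of object pairs with different predicates", and every name here (`TOY_FACTS`, `query_facts`, `predicate_prior`) is hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical toy commonsense facts as (head, relation, tail) triples.
# A real system would query an actual knowledge base instead.
TOY_FACTS = [
    ("dog", "CapableOf", "run"),
    ("dog", "IsA", "animal"),
    ("person", "CapableOf", "ride"),
    ("bicycle", "UsedFor", "ride"),
    ("person", "Desires", "dog"),
]


def query_facts(subject, obj, facts=TOY_FACTS):
    """Return the facts whose head is either entity of the (subject, object) pair."""
    return [fact for fact in facts if fact[0] in (subject, obj)]


def predicate_prior(triples, predicates, alpha=1.0):
    """Build a function estimating P(predicate | subject, object).

    Counts come from observed (subject, predicate, object) triples and are
    smoothed with add-alpha so unseen predicates keep non-zero mass.
    """
    counts = defaultdict(Counter)
    for s, p, o in triples:
        counts[(s, o)][p] += 1

    def prior(subject, obj):
        pair_counts = counts[(subject, obj)]
        total = sum(pair_counts.values()) + alpha * len(predicates)
        return {p: (pair_counts[p] + alpha) / total for p in predicates}

    return prior


# Toy annotated relations (assumed data, for illustration only).
train_triples = [
    ("person", "ride", "bicycle"),
    ("person", "ride", "bicycle"),
    ("person", "next_to", "bicycle"),
    ("dog", "chase", "person"),
]
predicates = ["ride", "next_to", "chase"]

prior = predicate_prior(train_triples, predicates)
pair_facts = query_facts("person", "bicycle")
pair_prior = prior("person", "bicycle")
```

In this sketch, a pair never seen in the training triples receives a uniform prior over predicates, which is one simple way to keep low-frequency relations from being ruled out entirely. The retrieved facts would still need to be encoded into embeddings, which the abstract mentions but this sketch does not attempt.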
Pages: 13083-13095
Page count: 13