Relation-aware Video Reading Comprehension for Temporal Language Grounding

Citations: 0
Authors
Gao, Jialin [1 ,2 ]
Sun, Xin [1 ,2 ]
Xu, MengMeng [3 ]
Zhou, Xi [1 ,2 ]
Ghanem, Bernard [3 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Cooperat Medianet Innovat Ctr, Shanghai, Peoples R China
[2] CloudWalk Technol Co Ltd, Shanghai, Peoples R China
[3] King Abdullah Univ Sci & Technol, Thuwal, Saudi Arabia
Keywords
LOCALIZATION;
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Temporal language grounding in videos aims to localize the temporal span relevant to a given query sentence. Previous methods treat it either as a boundary-regression task or a span-extraction task. This paper formulates temporal language grounding as video reading comprehension and proposes a Relation-aware Network (RaNet) to address it. This framework selects a video moment choice from a predefined answer set with the aid of coarse-and-fine choice-query interaction and choice-choice relation construction. A choice-query interactor is proposed to match visual and textual information simultaneously at the sentence-moment and token-moment levels, leading to a coarse-and-fine cross-modal interaction. Moreover, a novel multi-choice relation constructor is introduced, leveraging graph convolution to capture the dependencies among video moment choices for best-choice selection. Extensive experiments on ActivityNet-Captions, TACoS, and CharadesSTA demonstrate the effectiveness of our solution. Code will be available at https://github.com/Huntersxsx/RaNet.
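The multi-choice relation constructor described above can be pictured as a graph convolution over candidate moments: each candidate span is a node, edges are weighted by temporal overlap, and one message-passing step refines every choice's feature using its related choices. The following is a minimal illustrative sketch under those assumptions; the function names, IoU-based adjacency, and single-layer update are hypothetical and not taken from the paper's released code.

```python
import numpy as np

def temporal_iou(a, b):
    """IoU between two (start, end) spans in seconds."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = max(a[1], b[1]) - min(a[0], b[0])
    return inter / union if union > 0 else 0.0

def relation_gcn(moments, feats, weight):
    """One graph-convolution step over moment choices: H' = ReLU(D^-1 A H W)."""
    n = len(moments)
    adj = np.eye(n)  # self-loops keep each choice's own evidence
    for i in range(n):
        for j in range(n):
            if i != j:
                adj[i, j] = temporal_iou(moments[i], moments[j])
    adj = adj / adj.sum(axis=1, keepdims=True)  # row-normalize: D^-1 A
    return np.maximum(adj @ feats @ weight, 0.0)  # ReLU activation

moments = [(0.0, 2.0), (1.0, 3.0), (5.0, 6.0)]  # candidate spans
feats = np.random.rand(3, 4)                    # per-choice features
weight = np.random.rand(4, 4)                   # learnable projection
refined = relation_gcn(moments, feats, weight)  # shape (3, 4)
```

Overlapping candidates (the first two spans) exchange information through the IoU-weighted edges, while the disjoint third span is updated mainly from its own feature via the self-loop.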
Pages: 3978-3988
Page count: 11
Related Papers
50 records in total
  • [21] UniVTG: Towards Unified Video-Language Temporal Grounding
    Lin, Kevin Qinghong
    Zhang, Pengchuan
    Chen, Joya
    Pramanick, Shraman
    Gao, Difei
    Wang, Alex Jinpeng
    Yan, Rui
    Shou, Mike Zheng
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, ICCV, 2023, : 2782 - 2792
  • [22] Language comprehension: Relevant brain systems and their temporal relation
    Friederici, Angela D.
    INTERNATIONAL JOURNAL OF PSYCHOLOGY, 2008, 43 (3-4) : 716 - 716
  • [23] DVD captioned video and second language reading/listening comprehension
    Markham, P
    EISTA '04: INTERNATIONAL CONFERENCE ON EDUCATION AND INFORMATION SYSTEMS: TECHNOLOGIES AND APPLICATIONS, VOL 3, PROCEEDINGS: EDUCATION/TRAINING AND INFORMATION/COMMUNICATION TECHNOLOGIES AND APPLICATIONS, 2004, : 174 - 178
  • [24] Synthesizing Relation-Aware Entity Transformation by Examples
    Wu, Jiarong
    Jiang, Yanyan
    Xu, Chang
    Cheung, Shing-Chi
    Ma, Xiaoxing
    Lu, Jian
    PROCEEDINGS 2018 IEEE/ACM 40TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING - COMPANION (ICSE-COMPANION), 2018, : 266 - 267
  • [25] Relation-Aware Weighted Embedding for Heterogeneous Graphs
    Hu, Ganglin
    Pang, Jun
    INFORMATION TECHNOLOGY AND CONTROL, 2023, 52 (01): : 199 - 214
  • [26] Relation-aware Blocking for Scalable Recommendation Systems
    Liang, Huizhi
    Liu, Zehao
    Markchom, Thanet
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2022, 2022, : 4214 - 4218
  • [27] Automatic Relation-aware Graph Network Proliferation
    Cai, Shaofei
    Li, Liang
    Han, Xinzhe
    Luo, Jiebo
    Zha, Zheng-Jun
    Huang, Qingming
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022, : 10853 - 10863
  • [28] Relation-Aware Transformer for Portfolio Policy Learning
    Xu, Ke
    Zhang, Yifan
    Ye, Deheng
    Zhao, Peilin
    Tan, Mingkui
    PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, : 4647 - 4653
  • [29] Relation-Aware Isosurface Extraction in Multifield Data
    Nagaraj, Suthambhara
    Natarajan, Vijay
    IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2011, 17 (02) : 182 - 191
  • [30] Relation-aware Heterogeneous Graph for User Profiling
    Yan, Qilong
    Zhang, Yufeng
    Liu, Qiang
    Wu, Shu
    Wang, Liang
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT, CIKM 2021, 2021, : 3573 - 3577