Time-Frequency Mutual Learning for Moment Retrieval and Highlight Detection

Cited by: 0
Authors
Zhong, Yaokun [1]
Liang, Tianming [1]
Hu, Jian-Fang [1, 2, 3]
Affiliations
[1] Sun Yat Sen Univ, Sch Comp Sci & Engn, Guangzhou, Peoples R China
[2] Guangdong Prov Key Lab Informat Secur Technol, Guangzhou, Peoples R China
[3] Minist Educ, Key Lab Machine Intelligence & Adv Comp, Guangzhou, Peoples R China
Keywords
video moment retrieval; frequency-domain deep learning; deep mutual learning
DOI
10.1007/978-981-97-8620-6_3
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Moment Retrieval and Highlight Detection (MR/HD) aims to concurrently retrieve relevant moments and predict clip-wise saliency scores according to a given textual query. Previous MR/HD works have overlooked explicit modeling of the static and dynamic visual information described by the language query, which can lead to inaccurate predictions, especially when the queried event describes both static appearances and dynamic motions. In this work, we consider learning the static interaction and dynamic reasoning from the time domain and frequency domain respectively, and propose a novel Time-Frequency Mutual Learning framework (TFML), which mainly consists of a time-domain branch, a frequency-domain branch, and a time-frequency aggregation branch. The time-domain branch learns to attend to the static visual information related to the textual query. In the frequency-domain branch, we introduce the Short-Time Fourier Transform (STFT) for dynamic modeling by attending to the frequency contents within varied segments. The time-frequency aggregation branch integrates the information from these two branches. To promote the mutual complementation of time-domain and frequency-domain information, we further employ a mutual learning strategy in a concise and effective two-way loop, which enables the branches to reason collaboratively and achieve time-frequency consistent predictions. Extensive experiments on QVHighlights and TVSum demonstrate the effectiveness of our proposed framework compared with state-of-the-art methods.
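The abstract does not give implementation details, so the following is only an illustrative sketch of the two ideas it names: an STFT taken along the clip axis of a video feature sequence, and a two-way consistency term between a time-domain and a frequency-domain branch. Everything specific here is an assumption rather than the authors' method: the function names, the Hann window, the window length and hop, and the use of a symmetric KL divergence (a common choice in deep mutual learning) are all hypothetical.

```python
import numpy as np


def stft_over_clips(features, win_len=8, hop=4):
    """STFT magnitudes along the clip (time) axis.

    features: array of shape (num_clips, dim), one feature vector per clip.
    Returns an array of shape (num_windows, win_len // 2 + 1, dim).
    win_len and hop are assumed values, not taken from the paper.
    """
    num_clips, _ = features.shape
    window = np.hanning(win_len)[:, None]  # taper each segment (assumed Hann)
    starts = range(0, num_clips - win_len + 1, hop)
    segments = np.stack([features[s:s + win_len] * window for s in starts])
    # Real FFT over the time axis of every segment: frequency content per segment.
    return np.abs(np.fft.rfft(segments, axis=1))


def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()


def mutual_learning_loss(time_scores, freq_scores, eps=1e-12):
    """Symmetric KL divergence between two branches' clip-wise score
    distributions (an assumed form of the two-way consistency term)."""
    p, q = softmax(time_scores), softmax(freq_scores)
    kl_pq = np.sum(p * np.log((p + eps) / (q + eps)))
    kl_qp = np.sum(q * np.log((q + eps) / (p + eps)))
    return kl_pq + kl_qp


rng = np.random.default_rng(0)
clip_features = rng.normal(size=(32, 16))   # 32 clips, 16-dim features
spectrogram = stft_over_clips(clip_features)
time_scores = rng.normal(size=32)           # stand-in branch outputs
freq_scores = rng.normal(size=32)
loss_same = mutual_learning_loss(time_scores, time_scores)
loss_diff = mutual_learning_loss(time_scores, freq_scores)
```

In this sketch each branch is pulled toward the other's prediction, so the loss is zero when both branches agree and grows as their saliency distributions diverge.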
Pages: 34-48
Number of pages: 15