Anchor-based Detection for Natural Language Localization in Ego-centric Videos

Cited by: 1
Authors
Liu, Bei [1]
Zheng, Sipeng [2]
Fu, Jianlong [1]
Cheng, Wen-Huang [3]
Affiliations
[1] Microsoft Res Asia, Beijing, Peoples R China
[2] Renmin Univ China, Beijing, Peoples R China
[3] Natl Yang Ming Chiao Tung Univ, Hsinchu, Taiwan
Keywords
Embodied AI; ego-centric video; cross-modality; video understanding
DOI
10.1109/ICCE56470.2023.10043460
Chinese Library Classification (CLC)
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
The Natural Language Localization (NLL) task aims to localize a sentence in a video by predicting its starting and ending timestamps, which requires a comprehensive understanding of both language and video. Much work has addressed third-person-view videos, but the task remains under-explored for ego-centric videos, even though it is critical for understanding the growing volume of ego-centric video and for facilitating embodied AI tasks. Directly adapting existing NLL methods to ego-centric video datasets is challenging for two reasons. First, there is a temporal duration gap between different datasets. Second, queries in ego-centric videos usually require a better understanding of more complex and long-term temporal orders. For these reasons, we propose an anchor-based detection model for NLL in ego-centric videos.
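The abstract does not describe the model's internals, so the following is only a generic sketch of what "anchor-based" temporal localization usually involves, not the authors' method. It slides fixed-width candidate segments (anchors) over a video and scores each against a target segment by temporal intersection over union. The anchor widths, the stride, and the function names are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start_seconds, end_seconds)


def temporal_iou(a: Segment, b: Segment) -> float:
    """Intersection over union of two time intervals."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def make_anchors(video_len: float, widths: List[float], stride: float) -> List[Segment]:
    """Slide windows of each width over the video (hypothetical scales)."""
    anchors: List[Segment] = []
    for w in widths:
        start = 0.0
        # Keep only windows that fit entirely inside the video.
        while start + w <= video_len:
            anchors.append((start, start + w))
            start += stride
    return anchors


def best_anchor(anchors: List[Segment], target: Segment) -> Segment:
    """Return the anchor that overlaps the target segment the most."""
    return max(anchors, key=lambda seg: temporal_iou(seg, target))


if __name__ == "__main__":
    # Assumed toy setup: a 60 s video and a query grounded at 12 s to 21 s.
    anchors = make_anchors(60.0, widths=[5.0, 10.0, 20.0], stride=5.0)
    print(best_anchor(anchors, (12.0, 21.0)))
```

In a trained detector the anchors would typically be scored and refined by a network conditioned on the query sentence; here the ground-truth segment stands in for that scoring so the matching step can be seen in isolation. The choice of anchor widths is where a duration gap between datasets would matter, since widths tuned for short clips may not cover the segment lengths of a different dataset.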
Pages: 4