Anchor-based Detection for Natural Language Localization in Ego-centric Videos

Cited by: 1

Authors:

Liu, Bei [1]

Zheng, Sipeng [2]

Fu, Jianlong [1]

Cheng, Wen-Huang [3]

Affiliations:

[1] Microsoft Research Asia, Beijing, People's Republic of China

[2] Renmin University of China, Beijing, People's Republic of China

[3] National Yang Ming Chiao Tung University, Hsinchu, Taiwan

Source:

2023 IEEE International Conference on Consumer Electronics (ICCE) | 2023

Keywords:

Embodied AI; ego-centric video; cross-modality; video understanding

DOI:

10.1109/ICCE56470.2023.10043460

Chinese Library Classification (CLC) number:

TP39 [Computer applications]

Discipline classification codes:

081203; 0835

Abstract:

The Natural Language Localization (NLL) task aims to localize a sentence in a video by predicting its starting and ending timestamps. It requires a comprehensive understanding of both language and video. Much work has addressed third-person-view videos, while the task on ego-centric videos remains under-explored, even though it is critical for understanding the growing volume of ego-centric video and for facilitating embodied AI tasks. Directly adapting existing NLL methods to ego-centric video datasets is challenging for two reasons. First, there is a temporal duration gap between different datasets. Second, queries in ego-centric videos usually require a better understanding of more complex and long-term temporal orders. For these reasons, we propose an anchor-based detection model for NLL in ego-centric videos.
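The abstract does not describe the model's internals, so the following is only a generic sketch of what "anchor-based" temporal localization usually means, not the authors' method. It assumes that anchors are fixed-width candidate segments slid over the video and that the anchor best matching a target segment is chosen by temporal intersection-over-union (IoU). The names `temporal_iou`, `make_anchors`, and `best_anchor`, along with the widths and stride, are hypothetical choices made for illustration.

```python
from typing import List, Tuple

# A temporal segment as (start_sec, end_sec). Hypothetical type for this sketch.
Segment = Tuple[float, float]


def temporal_iou(a: Segment, b: Segment) -> float:
    """Intersection-over-union of two 1-D time intervals."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def make_anchors(video_len: float, widths: List[float], stride: float) -> List[Segment]:
    """Slide anchors of each width over [0, video_len] with a fixed stride.

    Assumption: several widths are used so that both short and long
    moments can be covered; the paper's actual anchor design is not
    stated in the abstract.
    """
    anchors: List[Segment] = []
    for w in widths:
        n = 0
        while n * stride + w <= video_len:
            start = n * stride
            anchors.append((start, start + w))
            n += 1
    return anchors


def best_anchor(anchors: List[Segment], target: Segment) -> Segment:
    """Return the anchor with the highest temporal IoU against the target."""
    return max(anchors, key=lambda a: temporal_iou(a, target))


if __name__ == "__main__":
    anchors = make_anchors(video_len=10.0, widths=[2.0, 4.0], stride=2.0)
    print(len(anchors))
    print(best_anchor(anchors, (2.0, 6.0)))
```

In a trained detector, a model would score each anchor against the language query and typically regress boundary offsets; here the ground-truth segment stands in for that scoring step purely to show the matching logic.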


Pages: 4

