AlertMe: Towards Natural Language-Based Live Video Trigger Systems at the Edge

Authors
Ye, Angela Ning [1]
Hu, Zhiming [1]
Phillips, Caleb [1]
Mohomed, Iqbal [1]
Affiliation
[1] Samsung AI Ctr, Toronto, ON, Canada
Keywords
Edge Computing; Multimodal Learning
DOI
10.1145/3434770.3459740
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Advances in deep learning have enabled brand new video analytics systems and applications. Existing systems research on real-time video event detection does not consider matching based on natural language; rather, it focuses on using Domain Specific Languages that define spatio-temporal operators on video streams for efficient matching. Alternatively, research in the multimodal AI community on joint understanding of video and language focuses on applications such as language-based video retrieval, where videos may have been processed offline. In this work, we propose AlertMe, a multimodal-based live video trigger system that matches incoming video streams to a set of user-defined natural language triggers. We dynamically select the optimal sliding window size to extract feature vectors from different modalities in near real time. We also describe our approach to achieve on-device deployment by introducing a profiler to select runtime-efficient feature extractors. Lastly, we show that limiting the number of trigger candidates can significantly increase event detection performance in applications such as task following in AR glasses.
Pages: 67-72
Number of pages: 6