Integrating vision and language: Semantic description of traffic events from image sequences

Cited by: 0
Authors
Hirano, Takashi [1]
Yoneyama, Shogo [1]
Okada, Yasuhiro [1]
Kosugi, Yukio [2]
Affiliations
[1] Mitsubishi Electr Corp, Informat Technol R&D Ctr, Mitsubishi Denki Bldg, Tokyo 100, Japan
[2] Tokyo Inst Technol, Interdisciplinary Grad Sch Sci Engn, Tokyo, Japan
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We propose a method for extracting events from traffic image sequences. The method extracts moving objects and their trajectories from image sequences recorded by a stationary camera. These trajectories are mapped into a 3D virtual space, and physical parameters such as velocity and direction are estimated. Traffic events are then extracted from the trajectories and physical parameters using case-frame analysis, a technique from natural language processing. The method makes it easy to describe events and to detect both general traffic events and abnormal situations. Experimental results on an image sequence of an actual intersection show the effectiveness of the method.
Pages: 459 / +
Page count: 2
Related papers
50 in total
  • [1] Semantic Understanding of Traffic Scenes with Large Vision Language Models
    Jain, Sandesh
    Thapa, Surendrabikram
    Chen, Kuan-Ting
    Abbott, A. Lynn
    Sarkar, Abhijit
    2024 35TH IEEE INTELLIGENT VEHICLES SYMPOSIUM, IEEE IV 2024, 2024, : 1580 - 1587
  • [2] Semantic Understanding of Traffic Scenes with Large Vision Language Models
    Jain, Sandesh
    Thapa, Surendrabikram
    Chen, Kuan-Ting
    Abbott, A. Lynn
    Sarkar, Abhijit
    IEEE Intelligent Vehicles Symposium, Proceedings, 2024, : 1580 - 1587
  • [3] Semantic Description and Recognition of Human Body Poses and Movement Sequences with Gesture Description Language
    Hachaj, Tomasz
    Ogiela, Marek R.
    COMPUTER APPLICATIONS FOR BIO-TECHNOLOGY, MULTIMEDIA, AND UBIQUITOUS CITY, 2012, 353 : 1 - 8
  • [4] Geometric and semantic analysis of road image sequences for traffic scene construction
    Li, Yaochen
    Zhu, Chao
    Liu, Yuehu
    Hong, Yuhui
    Wang, Jianji
    NEUROCOMPUTING, 2021, 465 : 336 - 349
  • [5] Natural language description of image sequences as a form of knowledge representation
    Nagel, HH
    KI-99: ADVANCES IN ARTIFICIAL INTELLIGENCE, 1999, 1701 : 45 - 60
  • [6] Integrating Vision-Language Semantic Graphs in Multi-View Clustering
    Ke, JunLong
    Wen, Zichen
    Yang, Yechenhao
    Cui, Chenhao
    Ren, Yazhou
    Pu, Xiaorong
    He, Lifang
    PROCEEDINGS OF THE THIRTY-THIRD INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2024, 2024, : 4273 - 4281
  • [7] A VISION OF VISION AND LANGUAGE COMPRISES ACTION - AN EXAMPLE FROM ROAD TRAFFIC
    NAGEL, HH
    ARTIFICIAL INTELLIGENCE REVIEW, 1994, 8 (2-3) : 189 - 214
  • [8] From text to image: challenges in integrating vision into ChatGPT for medical image interpretation
    Koga, Shunsuke
    Du, Wei
    NEURAL REGENERATION RESEARCH, 2025, 20 (02) : 487 - 488
  • [9] From text to image: challenges in integrating vision into ChatGPT for medical image interpretation
    Shunsuke Koga
    Wei Du
    Neural Regeneration Research, 2025, 20 (02) : 487 - 488
  • [10] IFSeg: Image-free Semantic Segmentation via Vision-Language Model
    Yun, Sukmin
    Park, Seong Hyeon
    Seo, Paul Hongsuck
    Shin, Jinwoo
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023, : 2967 - 2977