Integrating vision and language: Semantic description of traffic events from image sequences

Authors
Hirano, Takashi [1]
Yoneyama, Shogo [1]
Okada, Yasuhiro [1]
Kosugi, Yukio [2]
Affiliations
[1] Mitsubishi Electr Corp, Informat Technol R&D Ctr, Mitsubishi Denki Bldg, Tokyo 100, Japan
[2] Tokyo Inst Technol, Interdisciplinary Grad Sch Sci Engn, Tokyo, Japan
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose a method for extracting events from traffic image sequences. The method extracts moving objects and their trajectories from image sequences recorded by a stationary camera. These trajectories are mapped into a 3D virtual space, and physical parameters such as velocity and direction are estimated. Traffic events are then extracted from the trajectories and physical parameters using case-frame analysis, a technique from natural language processing. The method makes events easy to describe and detects both general traffic events and abnormal situations. Experimental results on an image sequence from an actual intersection show the effectiveness of the method.
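The pipeline outlined in the abstract (trajectory, then physical parameters, then a case frame) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `CaseFrame` slots, the `speed_and_heading` and `describe` helpers, the 0.5 m/s stop threshold, and the four compass labels are all hypothetical, and the trajectory is assumed to be already mapped to ground-plane coordinates in meters.

```python
import math
from dataclasses import dataclass


@dataclass
class CaseFrame:
    """Hypothetical case frame: a predicate plus the slots it governs."""
    predicate: str    # verb-like event label, here "stop" or "move"
    agent: str        # identifier of the moving object
    direction: str    # coarse direction of travel, or "none"
    speed_mps: float  # mean speed along the trajectory


def speed_and_heading(track):
    """Estimate mean speed (m/s) and heading (degrees) from a trajectory.

    track: list of (t_seconds, x_m, y_m) samples, assumed to lie on the
    ground plane with x pointing east and y pointing north.
    """
    (t0, x0, y0), (t1, x1, y1) = track[0], track[-1]
    duration = t1 - t0
    if duration <= 0:
        raise ValueError("trajectory must span a positive time interval")
    # Path length is the sum of distances between consecutive samples.
    path = sum(
        math.hypot(bx - ax, by - ay)
        for (_, ax, ay), (_, bx, by) in zip(track, track[1:])
    )
    # Heading uses net displacement: 0 = east, 90 = north.
    heading = math.degrees(math.atan2(y1 - y0, x1 - x0)) % 360
    return path / duration, heading


def describe(agent, track, stop_speed=0.5):
    """Fill a case frame from a trajectory using simple assumed rules."""
    speed, heading = speed_and_heading(track)
    if speed < stop_speed:
        return CaseFrame("stop", agent, "none", speed)
    names = ["east", "north", "west", "south"]
    direction = names[int(((heading + 45) % 360) // 90)]
    return CaseFrame("move", agent, direction, speed)
```

Calling `describe("car_1", track)` on a list of timestamped ground-plane points returns a frame whose slots could then be rendered as a sentence or matched against rules for abnormal situations, such as a stop inside the intersection.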
Pages: 459 / +
Page count: 2