A Multi-modal Graphical Model for Scene Analysis

Cited by: 15
Authors
Namin, Sarah Taghavi [1]
Najafi, Mohammad [1]
Salzmann, Mathieu [1]
Petersson, Lars [1]
Affiliations
[1] Australian Natl Univ, NICTA, Canberra, ACT 0200, Australia
Keywords
Semantic segmentation
DOI
10.1109/WACV.2015.139
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we introduce a multi-modal graphical model to address the problem of semantic segmentation using 2D-3D data exhibiting extensive many-to-one correspondences. Existing methods often impose a hard correspondence between the 2D and 3D data, where corresponding 2D and 3D regions are forced to receive identical labels. This results in performance degradation due to misalignments, 3D-2D projection errors and occlusions. We address this issue by defining a graph over the entire set of data that models soft correspondences between the two modalities. This graph encourages each region in a modality to leverage the information from its corresponding regions in the other modality to better estimate its class label. We evaluate our method on a publicly available dataset and outperform the state of the art. Additionally, to demonstrate the ability of our model to support multiple correspondences for objects in 3D and 2D domains, we introduce a new multi-modal dataset, which is composed of panoramic images and LIDAR data, and features a rich set of many-to-one correspondences.
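
The contrast between hard and soft correspondences can be illustrated with a toy energy-minimization sketch. This is not the authors' model: the node names, unary costs, the Potts-style disagreement penalty, and the brute-force search are all assumptions chosen for illustration. Two 2D regions are linked to one 3D region (a many-to-one correspondence); a finite penalty lets linked regions disagree, while an infinite penalty forces identical labels.

```python
from itertools import product

def energy(labels, unary, edges, weight):
    """Unary costs plus a fixed penalty for each cross-modal edge whose labels differ."""
    cost = sum(unary[node][lab] for node, lab in labels.items())
    cost += sum(weight for a, b in edges if labels[a] != labels[b])
    return cost

def best_labeling(unary, edges, weight, num_labels):
    """Exhaustively search all labelings (only feasible for tiny toy graphs)."""
    nodes = sorted(unary)
    best = None
    for assignment in product(range(num_labels), repeat=len(nodes)):
        labels = dict(zip(nodes, assignment))
        e = energy(labels, unary, edges, weight)
        if best is None or e < best[0]:
            best = (e, labels)
    return best

# Hypothetical toy data: label 0 = "road", label 1 = "car".
unary = {
    "img_a": [0.2, 1.0],  # 2D region that prefers "road"
    "img_b": [1.5, 0.1],  # 2D region that strongly prefers "car"
    "pc":    [0.4, 0.6],  # 3D region that weakly prefers "road"
}
# Both 2D regions project onto the same 3D region.
edges = [("img_a", "pc"), ("img_b", "pc")]

soft_energy, soft = best_labeling(unary, edges, weight=0.5, num_labels=2)
hard_energy, hard = best_labeling(unary, edges, weight=float("inf"), num_labels=2)

print("soft:", soft)
print("hard:", hard)
```

With the finite penalty, each region keeps the label its own evidence supports and the model pays a small cost for the one disagreeing edge. With the infinite penalty, all three regions are forced to share a label, so the strongly supported "car" region overrides the other two, which mirrors the degradation the abstract attributes to hard correspondences.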
Pages: 1006-1013
Page count: 8