Learning Semantic Alignment Using Global Features and Multi-Scale Confidence

Authors
Xu, Huaiyuan [1]
Liao, Jing [2]
Liu, Huaping [3]
Sun, Yuxiang [1]
Affiliations
[1] Hong Kong Polytech Univ, Dept Mech Engn, Kowloon, Hong Kong, Peoples R China
[2] City Univ Hong Kong, Dept Comp Sci, Kowloon, Hong Kong, Peoples R China
[3] Tsinghua Univ, Inst Artificial Intelligence, Dept Comp Sci & Technol, Beijing 100084, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Semantics; Correlation; Feature extraction; Transformers; Training; Task analysis; Probabilistic logic; Semantic alignment; enhancement transformer; probabilistic correlation computation; cross-domain alignment;
DOI
10.1109/TCSVT.2023.3288370
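Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification code
0808; 0809
Abstract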
Semantic alignment aims to establish pixel correspondences between images based on semantic consistency. It can serve as a fundamental component for various downstream computer vision tasks, such as style transfer and exemplar-based colorization. Many existing methods use local features and their cosine similarities to infer semantic alignment. However, they struggle with significant intra-class variation in object appearance and size. In other words, content with the same semantics can look very different. To address this issue, we propose a novel deep neural network built around global feature enhancement and adaptive multi-scale inference. Specifically, two modules are proposed: an enhancement transformer that enhances semantic features with global awareness, and a probabilistic correlation module that adaptively fuses multi-scale information based on learned confidence scores. We use a unified network architecture to achieve two types of semantic alignment, namely cross-object semantic alignment and cross-domain semantic alignment. Experimental results demonstrate that our method achieves competitive performance on five standard cross-object semantic alignment benchmarks and outperforms the state of the art in cross-domain semantic alignment.
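The abstract mentions two ideas that a small example can make concrete: correlating local features by cosine similarity, and blending correlations from several scales using confidence scores. The sketch below is only an illustration under stated assumptions and is not the authors' network: the function names, the toy feature vectors, the softmax weighting, and the fixed confidence values are all hypothetical, whereas in the paper the confidences are learned and the fusion is probabilistic. It also assumes the per-scale correlation matrices have already been resampled to the same shape.

```python
import math


def cosine_correlation(src, tgt):
    """Return corr[i][j], the cosine similarity between src[i] and tgt[j]."""
    def unit(vec):
        # Fall back to 1.0 so an all-zero vector does not divide by zero.
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        return [x / norm for x in vec]

    src_u = [unit(v) for v in src]
    tgt_u = [unit(v) for v in tgt]
    return [[sum(a * b for a, b in zip(s, t)) for t in tgt_u] for s in src_u]


def fuse_with_confidence(corrs, confidences):
    """Blend same-shaped correlation matrices with softmax-normalized weights.

    Assumption: softmax weighting stands in for the learned, probabilistic
    fusion described in the abstract; it is not taken from the paper.
    """
    peak = max(confidences)
    exps = [math.exp(c - peak) for c in confidences]
    total = sum(exps)
    weights = [e / total for e in exps]
    rows, cols = len(corrs[0]), len(corrs[0][0])
    return [
        [sum(w * c[i][j] for w, c in zip(weights, corrs)) for j in range(cols)]
        for i in range(rows)
    ]


def best_matches(corr):
    """For each source position, pick the target index with the highest score."""
    return [max(range(len(row)), key=row.__getitem__) for row in corr]


# Hypothetical toy features: two positions per image, two channels each.
src_fine = [[1.0, 0.0], [0.0, 1.0]]
tgt_fine = [[0.0, 2.0], [3.0, 0.0]]
src_coarse = [[1.0, 1.0], [1.0, 1.0]]  # coarse scale is ambiguous here
tgt_coarse = [[1.0, 1.0], [1.0, 1.0]]

corr_fine = cosine_correlation(src_fine, tgt_fine)
corr_coarse = cosine_correlation(src_coarse, tgt_coarse)

# Hypothetical confidence scores: trust the fine scale more than the coarse one.
fused = fuse_with_confidence([corr_fine, corr_coarse], [2.0, 0.0])
matches = best_matches(fused)
print(matches)
```

In this toy case the coarse scale cannot tell the two target positions apart, so the higher-confidence fine scale decides the match. A learned confidence would play a similar role per location rather than per scale.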
Pages: 897-910
Page count: 14