Learning Semantic Alignment Using Global Features and Multi-Scale Confidence

Cited by: 0
Authors
Xu, Huaiyuan [1]
Liao, Jing [2]
Liu, Huaping [3]
Sun, Yuxiang [1]
Affiliations
[1] Hong Kong Polytech Univ, Dept Mech Engn, Kowloon, Hong Kong, Peoples R China
[2] City Univ Hong Kong, Dept Comp Sci, Kowloon, Hong Kong, Peoples R China
[3] Tsinghua Univ, Inst Artificial Intelligence, Dept Comp Sci & Technol, Beijing 100084, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Semantics; Correlation; Feature extraction; Transformers; Training; Task analysis; Probabilistic logic; Semantic alignment; enhancement transformer; probabilistic correlation computation; cross-domain alignment;
DOI
10.1109/TCSVT.2023.3288370
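Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code
0808; 0809;
Abstract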
Semantic alignment aims to establish pixel correspondences between images based on semantic consistency. It can serve as a fundamental component for various downstream computer vision tasks, such as style transfer and exemplar-based colorization. Many existing methods use local features and their cosine similarities to infer semantic alignment. However, they struggle with significant intra-class variation of objects, for example in appearance and size. In other words, contents with the same semantics can look significantly different. To address this issue, we propose a novel deep neural network whose core lies in global feature enhancement and adaptive multi-scale inference. Specifically, two modules are proposed: an enhancement transformer that enhances semantic features with global awareness, and a probabilistic correlation module that adaptively fuses multi-scale information based on learned confidence scores. We use a unified network architecture to achieve two types of semantic alignment, namely cross-object semantic alignment and cross-domain semantic alignment. Experimental results demonstrate that our method achieves competitive performance on five standard cross-object semantic alignment benchmarks and outperforms the state of the art in cross-domain semantic alignment.
Pages: 897-910
Number of pages: 14