Learning Semantic Alignment Using Global Features and Multi-Scale Confidence

Cited by: 0

Authors:

Xu, Huaiyuan [1]

Liao, Jing [2]

Liu, Huaping [3]

Sun, Yuxiang [1]

Affiliations:

[1] Hong Kong Polytech Univ, Dept Mech Engn, Kowloon, Hong Kong, Peoples R China

[2] City Univ Hong Kong, Dept Comp Sci, Kowloon, Hong Kong, Peoples R China

[3] Tsinghua Univ, Inst Artificial Intelligence, Dept Comp Sci & Technol, Beijing 100084, Peoples R China

Source:

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY | 2024 / Vol. 34 / No. 02

Funding:

National Natural Science Foundation of China;

Keywords:

Semantics; Correlation; Feature extraction; Transformers; Training; Task analysis; Probabilistic logic; Semantic alignment; enhancement transformer; probabilistic correlation computation; cross-domain alignment;

DOI:

10.1109/TCSVT.2023.3288370

Chinese Library Classification:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

Semantic alignment aims to establish pixel correspondences between images based on semantic consistency. It can serve as a fundamental component for various downstream computer vision tasks, such as style transfer and exemplar-based colorization. Many existing methods use local features and their cosine similarities to infer semantic alignment. However, they struggle with significant intra-class variation in, for example, object appearance and size. In other words, content with the same semantics can look very different. To address this issue, we propose a novel deep neural network built around global feature enhancement and adaptive multi-scale inference. Specifically, two modules are proposed: an enhancement transformer that enhances semantic features with global awareness, and a probabilistic correlation module that adaptively fuses multi-scale information based on learned confidence scores. We use a unified network architecture to achieve two types of semantic alignment, namely cross-object semantic alignment and cross-domain semantic alignment. Experimental results demonstrate that our method achieves competitive performance on five standard cross-object semantic alignment benchmarks and outperforms the state of the art in cross-domain semantic alignment.
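
The baseline that the abstract attributes to existing methods (local features compared by cosine similarity) can be illustrated with a minimal sketch. This is not the paper's network: the function names, the two-dimensional toy features, and the nearest-match rule are all assumptions made for illustration only.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)

def correlation_matrix(source_feats, target_feats):
    """Cosine similarity between every source and every target position."""
    return [[cosine_similarity(s, t) for t in target_feats]
            for s in source_feats]

def best_matches(source_feats, target_feats):
    """For each source position, the index of the most similar target position."""
    corr = correlation_matrix(source_feats, target_feats)
    return [max(range(len(row)), key=row.__getitem__) for row in corr]

# Hypothetical toy data: one local feature vector per pixel position.
source = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
target = [[0.0, 2.0], [3.0, 3.0], [5.0, 0.0]]
matches = best_matches(source, target)  # source position i -> target index
```

Because cosine similarity ignores vector magnitude, each source feature is matched to the target feature pointing in the same direction. A purely local comparison like this has no view of the rest of the image, which is the limitation the abstract says the proposed global enhancement and multi-scale confidence are meant to address.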


Pages: 897-910

Page count: 14
