Representation Learning Based on Vision Transformer

Times Cited: 0
Authors
Ran, Ruisheng [1 ]
Gao, Tianyu [1 ]
Hu, Qianwei [2 ]
Zhang, Wenfeng [1 ]
Peng, Shunshun [1 ]
Fang, Bin [3 ]
Affiliations
[1] Chongqing Normal Univ, Coll Comp & Informat Sci, Chongqing 401331, Peoples R China
[2] Chongqing Dinghui Informat Technol Co Ltd, Chongqing 401147, Peoples R China
[3] Chongqing Univ, Coll Comp Sci, Chongqing 400044, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Representation learning; Transformer; data visualization; image reconstruction; zero-shot learning; DEEP; DIMENSIONALITY; NETWORK;
DOI
10.1142/S0218001424590043
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
In recent years, with the rapid development of information technology, the volume of image data has grown exponentially, yet these data typically contain a large amount of redundant information. To extract effective features from images and reduce redundancy, a representation learning method based on the Vision Transformer (ViT) is proposed; to the best of our knowledge, this is the first application of the Transformer to zero-shot learning (ZSL). The method adopts a symmetric encoder-decoder structure: the encoder incorporates the Multi-Head Self-Attention (MSA) mechanism of ViT to reduce the dimensionality of image features, eliminate redundant information, and lower the computational burden, thereby extracting effective features, while the decoder reconstructs the image data. The representation learning capability of the proposed method is evaluated on a variety of tasks, including data visualization, image reconstruction, face recognition, and ZSL. Comparison with state-of-the-art representation learning methods shows that the method achieves outstanding results, validating its effectiveness in the field of representation learning.
Pages: 23
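The abstract describes a symmetric encoder-decoder in which a ViT-style MSA encoder compresses image features into a low-dimensional representation and a mirrored decoder reconstructs the image. The paper's exact architecture is not given in this record, so the following is only a minimal PyTorch sketch of that idea; the `ViTAutoencoder` name, all layer sizes, and the single-block encoder/decoder are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a symmetric ViT-style autoencoder (hypothetical layer
# sizes and names; not the paper's actual implementation).
import torch
import torch.nn as nn

class ViTAutoencoder(nn.Module):
    def __init__(self, img_size=32, patch=4, dim=64, code_dim=16, heads=4):
        super().__init__()
        self.side = img_size // patch
        n_patches = self.side ** 2
        # Patch embedding: split the image into patches, project each to dim.
        self.embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        self.pos = nn.Parameter(torch.zeros(1, n_patches, dim))
        # Encoder: Transformer block with Multi-Head Self-Attention (MSA),
        # followed by a linear bottleneck to a low-dimensional code.
        self.encoder = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, batch_first=True)
        self.to_code = nn.Linear(dim, code_dim)
        # Decoder mirrors the encoder and maps tokens back to pixel patches.
        self.from_code = nn.Linear(code_dim, dim)
        self.decoder = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, batch_first=True)
        self.to_pixels = nn.ConvTranspose2d(dim, 3, kernel_size=patch, stride=patch)

    def forward(self, x):
        tokens = self.embed(x).flatten(2).transpose(1, 2) + self.pos
        code = self.to_code(self.encoder(tokens))   # (B, n_patches, code_dim)
        h = self.decoder(self.from_code(code))      # (B, n_patches, dim)
        h = h.transpose(1, 2).reshape(h.size(0), -1, self.side, self.side)
        return self.to_pixels(h), code              # reconstruction + code

model = ViTAutoencoder()
img = torch.randn(2, 3, 32, 32)
recon, code = model(img)
loss = nn.functional.mse_loss(recon, img)  # reconstruction objective
```

Training such a model against the reconstruction loss yields the low-dimensional `code` tokens, which can then feed the downstream tasks listed in the abstract (data visualization, face recognition, ZSL).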
Related Papers
50 records in total
  • [21] A Transformer-based Framework for Multivariate Time Series Representation Learning
    Zerveas, George
    Jayaraman, Srideepika
    Patel, Dhaval
    Bhamidipaty, Anuradha
    Eickhoff, Carsten
    [J]. KDD '21: PROCEEDINGS OF THE 27TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2021, : 2114 - 2124
  • [22] Vision Transformer Adapters for Generalizable Multitask Learning
    Bhattacharjee, Deblina
    Susstrunk, Sabine
    Salzmann, Mathieu
    [J]. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 18969 - 18980
  • [23] Online Continual Learning with Contrastive Vision Transformer
    Wang, Zhen
    Liu, Liu
    Kong, Yajing
    Guo, Jiaxian
    Tao, Dacheng
    [J]. COMPUTER VISION, ECCV 2022, PT XX, 2022, 13680 : 631 - 650
  • [24] Binary representation learning in computer vision
    Shen, Fumin
    Yang, Yang
    Zhang, Hanwang
    [J]. NEUROCOMPUTING, 2016, 213 : 1 - 4
  • [25] A Multitask Learning-Based Vision Transformer for Plant Disease Localization and Classification
    Hemalatha, S.
    Jayachandran, Jai Jaganath Babu
    [J]. INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS, 2024, 17 (01)
  • [26] UGTransformer: Unsupervised Graph Transformer Representation Learning
    Xu, Lixiang
    Liu, Haifeng
    Cui, Qingzhe
    Luo, Bin
    Li, Ning
    Chen, Yan
    Tang, Yuanyan
[J]. 2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023
  • [27] Graph Propagation Transformer for Graph Representation Learning
    Chen, Zhe
    Tan, Hao
    Wang, Tao
    Shen, Tianrun
    Lu, Tong
    Peng, Qiuying
    Cheng, Cheng
    Qi, Yue
    [J]. PROCEEDINGS OF THE THIRTY-SECOND INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2023, 2023, : 3559 - 3567
  • [28] A novel selective learning based transformer encoder architecture with enhanced word representation
    Ansar, Wazib
    Goswami, Saptarsi
    Chakrabarti, Amlan
    Chakraborty, Basabi
    [J]. APPLIED INTELLIGENCE, 2023, 53 (08) : 9424 - 9443
  • [30] Molecular representation learning based on Transformer with fixed-length padding method
    Wu, Yichu
    Yang, Yang
    Zhang, Ruimeng
    Chen, Zijian
    Jin, Meichen
    Zou, Yi
    Wang, Zhonghua
    Wu, Fanhong
    [J]. JOURNAL OF MOLECULAR STRUCTURE, 2025, 1319