From Patch to Pixel: A Transformer-Based Hierarchical Framework for Compressive Image Sensing

Cited by: 12

Authors:

Gan, Hongping [1,2]

Shen, Minghe [1,2]

Hua, Yi [3]

Ma, Chunyan [1]

Zhang, Tao [4]

Affiliations:

[1] Northwestern Polytech Univ, Sch Software, Taicang 215400, Peoples R China

[2] Northwestern Polytech Univ, Yangtze River Delta Res Inst, Taicang 215400, Peoples R China

[3] Northwestern Polytech Univ, Sch Aeronaut, Xian 710072, Peoples R China

[4] Shanghai Jiao Tong Univ, Shanghai Key Lab Intelligent Sensing & Recognit, Shanghai 200240, Peoples R China

Source:

IEEE TRANSACTIONS ON COMPUTATIONAL IMAGING | 2023 / Vol. 9

Funding:

National Natural Science Foundation of China;

Keywords:

Image reconstruction; Transformers; Imaging; Decoding; Computer architecture; Sensors; Image coding; Compressive sensing; image reconstruction; patch-to-pixel; transformer; RECONSTRUCTION; NETWORK;

DOI:

10.1109/TCI.2023.3244396

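Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification codes: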

0808 ; 0809 ;

Abstract:

Convolutional neural network (CNN)-based reconstruction methods have dominated compressive sensing (CS) in recent years. However, existing CNN-based approaches show potential limitations in capturing the non-local similarity of images because of the intrinsic characteristics of convolutional layers, i.e., locality and weight sharing. In parallel, the emerging Transformer architecture shows a fine capacity for modeling long-distance correlations among embedded tokens for language and images. Yet the vanilla Transformer does not considerably exceed CNN-based networks, showing only roughly comparable performance; a likely culprit is the lack of a sophisticated inductive bias regarding local image structures. In this article, to overcome the restrictions of both paradigms, we propose a Transformer-based hierarchical framework, dubbed TCS-Net, for compressive image sensing (or image compressive sensing) in a patch-to-pixel manner. Concretely, the proposed TCS-Net consists of an image acquisition module and a reconstruction module, which includes two key decoding phases: a patch-wise decoding phase and a pixel-wise decoding phase. The acquisition module implements data-driven image sampling by learning jointly with the decoding phases. By adapting the Transformer architecture to the patch-to-pixel multi-stage pattern, our reconstruction module gradually decodes the CS measurements from patch-wise outlines to pixel-wise textures, thereby building a high-precision mapping for image reconstruction. Extensive experiments on several datasets verify that the proposed TCS-Net outperforms existing state-of-the-art image CS methods by considerable margins.

Pages: 133-146

Page count: 14
