From Patch to Pixel: A Transformer-Based Hierarchical Framework for Compressive Image Sensing

Cited by: 12
Authors
Gan, Hongping [1, 2]
Shen, Minghe [1, 2]
Hua, Yi [3]
Ma, Chunyan [1]
Zhang, Tao [4]
Affiliations
[1] Northwestern Polytech Univ, Sch Software, Taicang 215400, Peoples R China
[2] Northwestern Polytech Univ, Yangtze River Delta Res Inst, Taicang 215400, Peoples R China
[3] Northwestern Polytech Univ, Sch Aeronaut, Xian 710072, Peoples R China
[4] Shanghai Jiao Tong Univ, Shanghai Key Lab Intelligent Sensing & Recognit, Shanghai 200240, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Image reconstruction; Transformers; Imaging; Decoding; Computer architecture; Sensors; Image coding; Compressive sensing; image reconstruction; patch-to-pixel; transformer; RECONSTRUCTION; NETWORK;
DOI
10.1109/TCI.2023.3244396
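Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract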
Convolutional neural network (CNN)-based reconstruction methods have dominated compressive sensing (CS) in recent years. However, existing CNN-based approaches are limited in capturing the non-local similarity of images because of the intrinsic characteristics of convolutional layers, i.e., locality and weight sharing. In parallel, the emerging Transformer architecture shows a strong capacity for modeling long-distance correlations among embedded tokens for language and images. Yet the vanilla Transformer does not considerably exceed CNN-based networks and instead shows roughly comparable performance; a likely culprit is the lack of a sophisticated inductive bias regarding local image structures. In this article, to overcome the restrictions of both paradigms, we propose a Transformer-based hierarchical framework, dubbed TCS-Net, for compressive image sensing (or image compressive sensing) in a patch-to-pixel manner. Concretely, the proposed TCS-Net consists of an image acquisition module and a reconstruction module, the latter comprising two key decoding phases: a patch-wise decoding phase and a pixel-wise decoding phase. The acquisition module implements data-driven image sampling by learning jointly with the decoding phases. By adapting the Transformer architecture to the patch-to-pixel multi-stage pattern, our reconstruction module gradually decodes the CS measurements from patch-wise outlines to pixel-wise textures, thereby building a high-precision mapping for image reconstruction. Extensive experiments on several datasets verify that the proposed TCS-Net outperforms existing state-of-the-art image CS methods by considerable margins.
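As a rough illustration of the acquisition and patch-wise stages described in the abstract, the following is a minimal NumPy sketch of block-based compressive sampling followed by a linear patch-wise estimate. It is not the TCS-Net implementation: the random Gaussian matrix `phi` stands in for the learned sampling operator, the pseudo-inverse stands in for the Transformer-based decoding phases, and the function names, the 32-pixel block size, and the roughly 10% sampling ratio are all assumptions made for this example.

```python
import numpy as np


def sample_blocks(image, phi, block=32):
    """Block-wise CS sampling: y = phi @ x for each non-overlapping block.

    Hypothetical helper; `phi` has shape (m, block * block).
    """
    h, w = image.shape
    assert h % block == 0 and w % block == 0, "image must tile evenly"
    # Split into non-overlapping blocks and flatten each to a row vector.
    patches = (
        image.reshape(h // block, block, w // block, block)
        .transpose(0, 2, 1, 3)
        .reshape(-1, block * block)
    )
    return patches @ phi.T  # shape: (n_blocks, m)


def initial_reconstruction(measurements, phi, shape, block=32):
    """Patch-wise linear estimate via the pseudo-inverse of phi.

    Assumption: a stand-in for a learned patch-wise decoder.
    """
    h, w = shape
    patches = measurements @ np.linalg.pinv(phi).T  # (n_blocks, block * block)
    # Reassemble the flattened blocks into the full image grid.
    return (
        patches.reshape(h // block, w // block, block, block)
        .transpose(0, 2, 1, 3)
        .reshape(h, w)
    )


rng = np.random.default_rng(0)
img = rng.random((64, 64))                 # toy grayscale image
phi = rng.standard_normal((102, 32 * 32))  # about 10% sampling ratio
y = sample_blocks(img, phi)
x0 = initial_reconstruction(y, phi, img.shape)
```

The estimate `x0` is consistent with the measurements (resampling it reproduces `y`) but is blocky and lacks fine texture, which is the kind of coarse patch-wise outline that a subsequent pixel-wise refinement stage would be expected to improve.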
Pages: 133-146
Page count: 14