Enhancing Learned Image Compression via Cross Window-Based Attention

Cited by: 0
Authors
Mudgal, Priyanka [1]
Liu, Feng [1]
Affiliations
[1] Portland State Univ, Portland, OR 97124 USA
Keywords
learned image compression; end-to-end image compression
DOI
10.1007/978-3-031-77389-1_32
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In recent years, learned image compression methods have demonstrated superior rate-distortion performance compared to traditional image compression methods. Recent methods utilize convolutional neural networks (CNN), variational autoencoders (VAE), invertible neural networks (INN), and transformers. Despite their significant contributions, a main drawback of these models is their poor performance in capturing local redundancy. Therefore, to leverage global features along with local redundancy, we propose a CNN-based solution integrated with a feature encoding module. The feature encoding module encodes important features before feeding them to the CNN and then utilizes cross-scale window-based attention, which further captures local redundancy. Cross-scale window-based attention is inspired by the attention mechanism in transformers and effectively enlarges the receptive field. Both the feature encoding module and the cross-scale window-based attention module in our architecture are flexible and can be incorporated into any other network architecture. We evaluate our method on the Kodak and CLIC datasets and demonstrate that our approach is effective and on par with state-of-the-art methods.
Pages: 410-423
Page count: 14
Related papers
50 in total
  • [31] Window-based transformer generative adversarial network for autonomous underwater image enhancement
    Ummar, Mehnaz
    Dharejo, Fayaz Ali
    Alawode, Basit
    Mahbub, Taslim
    Piran, Md. Jalil
    Javed, Sajid
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 126
  • [32] A novel configurable VLSI architecture design of window-based image processing method
    Zhao, Hui
    Sang, Hongshi
    Shen, Xubang
    MIPPR 2017: PARALLEL PROCESSING OF IMAGES AND OPTIMIZATION TECHNIQUES; AND MEDICAL IMAGING, 2018, 10610
  • [33] Learned image compression via multiscale prior for machine recognition
    Shi, Yuan
    Shen, Liquan
    Wang, Qiang
    JOURNAL OF ELECTRONIC IMAGING, 2023, 32 (01)
  • [34] Efficient Decoder for Learned Image Compression via Structured Pruning
    Liao, Liewen
    Li, Shaohui
    Luo, Jixiang
    Dai, Wenrui
    Li, Chenglin
    Zou, Junni
    Xiong, Hongkai
    DCC 2022: 2022 DATA COMPRESSION CONFERENCE (DCC), 2022, : 464 - 464
  • [35] Enhanced Window-Based Self-Attention with Global and Multi-Scale Representations for Remote Sensing Image Super-Resolution
    Lu, Yuting
    Wang, Shunzhou
    Wang, Binglu
    Zhang, Xin
    Wang, Xiaoxu
    Zhao, Yongqiang
    REMOTE SENSING, 2024, 16 (15)
  • [36] Multi-resolution Patch and Window-Based Priority for Digital Image Inpainting Problem
    Dang Thanh Trung
    Larabi, Chaker
    Beghdadi, Azeddine
    2012 3RD INTERNATIONAL CONFERENCE ON IMAGE PROCESSING THEORY, TOOLS AND APPLICATIONS, 2012, : 280 - 284
  • [37] A Real-time Window-based Image Processing Architecture using a Mapping Table
    Seok, Min-Shik
    Song, Il-Seuk
    Jin, Seunghun
    Jeon, Jae Wook
    INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND SYSTEMS (ICCAS 2010), 2010, : 1678 - 1681
  • [38] Multi-modal image fusion using window-based ICA and fractal dimension
    Han, Lu
    Kadambe, Shubha
    Krim, Hamid
    2010 CONFERENCE RECORD OF THE FORTY FOURTH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS AND COMPUTERS (ASILOMAR), 2010, : 214 - 218
  • [39] Relevant Window-Based Bitmap Compression in P2P Systems: Framework and Solution
    Li, Chunxi
    Zhang, Baoxian
    Chen, Changjia
    Chiu, Dah Ming
    IEEE TRANSACTIONS ON MULTIMEDIA, 2014, 16 (07) : 1821 - 1833
  • [40] A window-based image processing technique for quantitative and qualitative analysis of road traffic parameters
    Fathy, M
    Siyal, MY
    IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 1998, 47 (04) : 1342 - 1349