An Image Patch is a Wave: Phase-Aware Vision MLP

Cited by: 67
Authors
Tang, Yehui [1, 2]
Han, Kai [2]
Guo, Jianyuan [2, 3]
Xu, Chang [3]
Li, Yanxi [2, 3]
Xu, Chao [1]
Wang, Yunhe [2]
Affiliations
[1] Peking University, School of Artificial Intelligence, Beijing, People's Republic of China
[2] Huawei Noah's Ark Lab, Hong Kong, People's Republic of China
[3] University of Sydney, School of Computer Science, Sydney, NSW, Australia
Funding
National Natural Science Foundation of China; Australian Research Council
DOI
10.1109/CVPR52688.2022.01066
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In computer vision, recent work shows that a pure MLP architecture, built mainly by stacking fully-connected layers, can achieve performance competitive with CNNs and transformers. A vision MLP usually splits an input image into multiple tokens (patches), but existing MLP models aggregate these tokens directly with fixed weights, neglecting the fact that the semantic information of tokens varies from image to image. To aggregate tokens dynamically, we propose to represent each token as a wave function with two parts, amplitude and phase. The amplitude is the original feature, and the phase term is a complex value that changes according to the semantic content of the input image. Introducing the phase term dynamically modulates the relationship between tokens and the fixed weights in the MLP. Based on this wave-like token representation, we establish a novel Wave-MLP architecture for vision tasks. Extensive experiments demonstrate that the proposed Wave-MLP is superior to state-of-the-art MLP architectures on various vision tasks such as image classification, object detection, and semantic segmentation. The source code is available at https://github.com/huawei-noah/CV-Backbones/tree/master/wavemlp_pytorch and https://gitee.com/mindspore/models/tree/master/research/cv/wave_mlp.
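The following is a minimal sketch of the wave-like token idea described in the abstract, not the authors' implementation. It assumes scalar token features and hand-picked phases standing in for the learned, input-dependent phases; the function names `wave_token` and `aggregate` are hypothetical. It shows how the same amplitudes and the same fixed weights can produce different aggregated outputs once the phases differ.

```python
import cmath
import math


def wave_token(amplitude, phase):
    """Represent one scalar token feature as a complex wave: amplitude * e^(i * phase)."""
    return amplitude * cmath.exp(1j * phase)


def aggregate(amplitudes, phases, weights):
    """Sum wave tokens with fixed real weights and return the magnitude of the result."""
    total = sum(w * wave_token(a, p) for a, p, w in zip(amplitudes, phases, weights))
    return abs(total)


# Two tokens with identical amplitudes and identical fixed weights (illustrative values).
amplitudes = [1.0, 1.0]
weights = [1.0, 1.0]

# Only the phases change between the three cases.
in_phase = aggregate(amplitudes, [0.0, 0.0], weights)          # tokens reinforce each other
opposite = aggregate(amplitudes, [0.0, math.pi], weights)      # tokens cancel each other
quarter = aggregate(amplitudes, [0.0, math.pi / 2], weights)   # partial reinforcement

print(in_phase, opposite, quarter)
```

With equal phases the two tokens add up fully, with opposite phases they nearly cancel, and with a quarter-turn difference the result lies in between. In this toy setting the phase alone changes how strongly tokens combine, even though the weights stay fixed.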
Pages: 10925-10934
Page count: 10