The Role of ViT Design and Training in Robustness to Common Corruptions

Cited by: 0
Authors
Tian, Rui [1 ,2 ]
Wu, Zuxuan [1 ,2 ]
Dai, Qi [3 ]
Goldblum, Micah [4 ]
Hu, Han
Jiang, Yu-Gang [1 ,2 ]
Affiliations
[1] Fudan Univ, Sch Comp Sci, Shanghai Key Lab Intelligent Informat Proc, Shanghai 201203, Peoples R China
[2] Shanghai Collaborat Innovat Ctr Intelligent Visual, Shanghai 201203, Peoples R China
[3] Microsoft Res Asia, Beijing 100080, Peoples R China
[4] NYU, Ctr Data Sci, New York, NY 10012 USA
Keywords
Robustness; Training; Transformers; Data augmentation; Benchmark testing; Accuracy; Noise; Computer vision; Standards; Resilience; Common corruptions; robustness; vision transformer;
DOI
10.1109/TMM.2024.3521721
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Vision transformer (ViT) variants have made rapid advances on a variety of computer vision tasks. However, their performance on corrupted inputs, which are inevitable in realistic use cases due to variations in lighting and weather, has not been explored comprehensively. In this paper, we probe the robustness gap among ViT variants and ask how these modern architectural developments affect performance under common types of corruption. Through extensive and rigorous benchmarking, we demonstrate that simple architectural designs such as overlapping patch embedding and convolutional feed-forward networks can promote the robustness of ViTs. Moreover, since the de facto training of ViTs relies heavily on data augmentation, exactly which augmentation strategies make ViTs more robust is worth investigating. We survey the efficacy of previous methods and verify that adversarial noise training is powerful. In addition, we introduce a novel conditional method for generating dynamic augmentation parameters conditioned on input images, which offers state-of-the-art robustness to common corruptions.
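As a rough illustration of the "overlapping patch embedding" design named in the abstract, the sketch below compares which pixel ranges each patch covers along one image axis when the patch size equals the stride (standard ViT patchify) and when the patch size exceeds the stride (overlapping). The helper names and the kernel/stride/padding values are assumptions chosen for illustration; the abstract does not give the paper's actual settings.

```python
def patch_windows(size, kernel, stride, padding=0):
    """Return (start, end) pixel ranges of each patch along one image axis.

    Uses the usual strided-window count: floor((size + 2*padding - kernel) / stride) + 1.
    Start indices can be negative when padding > 0 (they fall in the padded border).
    """
    n = (size + 2 * padding - kernel) // stride + 1
    return [(i * stride - padding, i * stride - padding + kernel) for i in range(n)]


def shared_pixels(windows):
    """Number of pixels each patch shares with its right-hand neighbour."""
    return [max(0, a_end - b_start)
            for (_, a_end), (b_start, _) in zip(windows, windows[1:])]


# Standard patchify: kernel == stride, so neighbouring patches share nothing.
plain = patch_windows(size=224, kernel=16, stride=16)

# Overlapping patchify: kernel > stride (illustrative values, not from the paper).
overlap = patch_windows(size=224, kernel=7, stride=4, padding=3)

print(len(plain), set(shared_pixels(plain)))
print(len(overlap), set(shared_pixels(overlap)))
```

With these assumed settings, each overlapping patch shares pixels with its neighbour, so local structure at patch borders is seen by more than one token, whereas the non-overlapping split shares none.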
Pages: 1374-1385
Page count: 12