MULTIFLOW: Shifting Towards Task-Agnostic Vision-Language Pruning

Cited by: 0
Authors
Farina, Matteo [1]
Mancini, Massimiliano [1]
Cunegatti, Elia [1]
Liu, Gaowen [2]
Iacca, Giovanni [1]
Ricci, Elisa [1, 3]
Affiliations
[1] University of Trento, Trento, Italy
[2] Cisco Research, Research Triangle Park, NC, USA
[3] Fondazione Bruno Kessler, Povo, Italy
DOI
10.1109/CVPR52733.2024.01532
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
While excellent in transfer learning, Vision-Language models (VLMs) come with high computational costs due to their large number of parameters. To address this issue, removing parameters via model pruning is a viable solution. However, existing techniques for VLMs are task-specific, and thus require pruning the network from scratch for each new task of interest. In this work, we explore a new direction: Task-Agnostic Vision-Language Pruning (TA-VLP). Given a pretrained VLM, the goal is to find a unique pruned counterpart transferable to multiple unknown downstream tasks. In this challenging setting, the transferable representations already encoded in the pretrained model are a key aspect to preserve. Thus, we propose Multimodal Flow Pruning (MULTIFLOW), a first, gradient-free, pruning framework for TA-VLP where: (i) the importance of a parameter is expressed in terms of its magnitude and its information flow, by incorporating the saliency of the neurons it connects; and (ii) pruning is driven by the emergent (multimodal) distribution of the VLM parameters after pretraining. We benchmark eight state-of-the-art pruning algorithms in the context of TA-VLP, experimenting with two VLMs, three vision-language tasks, and three pruning ratios. Our experimental results show that MULTIFLOW outperforms recent sophisticated, combinatorial competitors in the vast majority of the cases, paving the way towards addressing TA-VLP. The code is publicly available at https://github.com/FarinaMatteo/multiflow.
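The two ideas in the abstract, (i) scoring a parameter by its magnitude together with the saliency of the neurons it connects and (ii) letting the pretrained parameter distribution decide how much to prune in each part of the model, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper or its repository: the function names `flow_scores`, `layer_budgets`, and `prune` are hypothetical, neuron saliency is assumed to be the mean absolute weight attached to a neuron, and per-layer budgets are assumed to come from a global magnitude ranking. The actual method may define these quantities differently.

```python
from typing import Dict, List

Matrix = List[List[float]]  # rows: output neurons, columns: input neurons


def flow_scores(w: Matrix) -> Matrix:
    """Score each weight as out_saliency * |weight| * in_saliency.

    ASSUMPTION: a neuron's saliency is the mean absolute value of the
    weights attached to it (row mean for outputs, column mean for inputs).
    """
    n_out, n_in = len(w), len(w[0])
    out_sal = [sum(abs(x) for x in row) / n_in for row in w]
    in_sal = [sum(abs(w[i][j]) for i in range(n_out)) / n_out for j in range(n_in)]
    return [[out_sal[i] * abs(w[i][j]) * in_sal[j] for j in range(n_in)]
            for i in range(n_out)]


def layer_budgets(layers: Dict[str, Matrix], sparsity: float) -> Dict[str, int]:
    """Decide how many weights each layer loses.

    ASSUMPTION: the budget follows the pretrained distribution by ranking
    all weights globally by magnitude and counting, per layer, how many
    fall among the smallest `sparsity` fraction.
    """
    entries = sorted((abs(x), name)
                     for name, w in layers.items() for row in w for x in row)
    budgets = {name: 0 for name in layers}
    for _, name in entries[:int(sparsity * len(entries))]:
        budgets[name] += 1
    return budgets


def prune(layers: Dict[str, Matrix], sparsity: float) -> Dict[str, Matrix]:
    """Zero out the lowest-scoring weights in each layer, within its budget."""
    budgets = layer_budgets(layers, sparsity)
    pruned = {}
    for name, w in layers.items():
        scores = flow_scores(w)
        n_in = len(w[0])
        # Rank flattened positions by score, lowest first.
        order = sorted(range(len(w) * n_in),
                       key=lambda k: scores[k // n_in][k % n_in])
        drop = set(order[:budgets[name]])
        pruned[name] = [[0.0 if i * n_in + j in drop else w[i][j]
                         for j in range(n_in)] for i in range(len(w))]
    return pruned
```

Calling `prune({"vision": [[0.1, 2.0], [3.0, 0.2]], "text": [[1.0, 1.5], [0.05, 4.0]]}, 0.5)` zeroes two weights in each toy layer. In the `text` layer the weight `1.0` is dropped even though it is not among that layer's smallest by a wide margin, because it connects low-saliency neurons. No gradients are computed anywhere, which is in keeping with the gradient-free framing of the abstract.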
Pages: 16185 - 16195
Number of pages: 11
Related papers
49 in total
  • [11] Multi-Robot Scene Completion: Towards Task-Agnostic Collaborative Perception
    Li, Yiming
    Zhang, Juexiao
    Ma, Dekun
    Wang, Yue
    Feng, Chen
    CONFERENCE ON ROBOT LEARNING, VOL 205, 2022, 205 : 2062 - 2072
  • [12] Towards Understanding Task-agnostic Debiasing Through the Lenses of Intrinsic Bias and Forgetfulness
    Liu, Guangliang
    Afshari, Milad
    Zhang, Xitong
    Xue, Zhiyu
    Ghosh, Avrajit
    Bashyal, Bidhan
    Wang, Rongrong
    Johnson, Kristen Marie
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024: 1843 - 1856
  • [13] Towards Better Vision-Inspired Vision-Language Models
    Cao, Yun-Hao
    Ji, Kaixiang
    Huang, Ziyuan
    Zheng, Chuanyang
    Liu, Jiajia
    Wang, Jian
    Chen, Jingdong
    Yang, Ming
    2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2024: 13537 - 13547
  • [14] SuperTickets: Drawing Task-Agnostic Lottery Tickets from Supernets via Jointly Architecture Searching and Parameter Pruning
    You, Haoran
    Li, Baopu
    Sun, Zhanyi
    Ouyang, Xu
    Lin, Yingyan
    COMPUTER VISION, ECCV 2022, PT XI, 2022, 13671 : 674 - 690
  • [15] Towards Learning Generalizable Code Embeddings Using Task-agnostic Graph Convolutional Networks
    Ding, Zishuo
    Li, Heng
    Shang, Weiyi
    Chen, Tse-Hsun
    ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY, 2023, 32 (02)
  • [16] Towards an Exhaustive Evaluation of Vision-Language Foundation Models
    Salin, Emmanuelle
    Ayache, Stephane
    Favre, Benoit
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS, ICCVW, 2023: 339 - 352
  • [17] Few-shot Task-agnostic Neural Architecture Search for Distilling Large Language Models
    Xu, Dongkuan
    Mukherjee, Subhabrata
    Liu, Xiaodong
    Dey, Debadeepta
    Wang, Wenhui
    Zhang, Xiang
    Awadallah, Ahmed Hassan
    Gao, Jianfeng
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
  • [18] Pivotal Role of Language Modeling in Recommender Systems: Enriching Task-specific and Task-agnostic Representation Learning
    Shin, Kyuyong
    Kwak, Hanock
    Kim, Wonjae
    Jeong, Jisu
    Jung, Seungjae
    Kim, Kyung-Min
    Ha, Jung-Woo
    Lee, Sang-Woo
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 1, 2023: 1146 - 1161
  • [19] VLM: Task-agnostic Video-Language Model Pre-training for Video Understanding
    Xu, Hu
    Ghosh, Gargi
    Huang, Po-Yao
    Arora, Prahal
    Aminzadeh, Masoumeh
    Feichtenhofer, Christoph
    Metze, Florian
    Zettlemoyer, Luke
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021: 4227 - 4239
  • [20] Multi-task Learning of Hierarchical Vision-Language Representation
    Nguyen, Duy-Kien
    Okatani, Takayuki
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 10484 - 10493