Reconsidering learnable fine-grained text prompts for few-shot anomaly detection in visual-language models

Cited by: 0
Authors
Han, Delong [1, 2]
Xu, Luo [1, 2]
Zhou, Mingle [1, 2]
Wan, Jin [1, 2]
Li, Min [1, 2]
Li, Gang [1, 2]
Affiliations
[1] Qilu Univ Technol, Key Lab Comp Power Network & Informat Secur, Minist Educ, Shandong Comp Sci Ctr, Natl Supercomp Ctr Jinan, Shandong Acad Sci, Jinan 250353, Shandong, Peoples R China
[2] Shandong Fundamental Res Ctr Comp Sci, Shandong Prov Key Lab Comp Power Internet & Serv C, Jinan 250014, Peoples R China
Keywords
Industrial anomaly detection; Fine-grained text prompts; Few-Shot Anomaly Detection; Pre-trained visual-language models;
DOI
10.1016/j.neunet.2024.106906
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104; 0812; 0835; 1405;
Abstract
Few-Shot Anomaly Detection (FSAD) in industrial images aims to identify abnormalities using only a few normal images, which is crucial for industrial scenarios where training samples are limited. Recent advances in large-scale pre-trained visual-language models have brought significant improvements to FSAD, which typically requires hundreds of text prompts to be manually crafted through prompt engineering. However, manually designed text prompts cannot accurately match the informative features of different categories across diverse images, and the domain gap between training and test datasets can severely impair the generalization capability of text prompts. To address these issues, we propose a visual-language model based on fine-grained learnable text prompts as a unified general framework for industrial FSAD. First, we design a Fine-grained Text Prompts Adapter (FTPA) and an associated registration loss to enhance the efficiency of text prompts. The manually designed text prompts are refined by capturing normal and abnormal semantic information in the image, so that they describe the image semantics at a finer granularity. In addition, we introduce a Dynamic Modulation Mechanism (DMM) to avoid potential errors in the trained text prompts that arise because the target data are unknown during cross-dataset detection. This is achieved by explicitly modulating the branch guided by few-shot images and the branch guided by fine-grained text prompts. Extensive experiments demonstrate that the proposed method achieves state-of-the-art few-shot industrial anomaly detection and segmentation performance. In the 4-shot setting, the AUROC for anomaly classification and anomaly segmentation reaches 98.3% and 96.3% on MVTec-AD and 93.8% and 97.9% on VisA, respectively.
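The two-branch idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the softmax temperature, and the fixed fusion weight `alpha` are all assumptions made for illustration, and the fixed weight is only a stand-in for the learned Dynamic Modulation Mechanism, whose details the abstract does not give. The sketch assumes patch and text embeddings already produced by some visual-language encoder.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def text_branch_scores(patches, normal_text, abnormal_text, temperature=0.07):
    """Per-patch probability of 'abnormal' (hypothetical text-guided branch).

    A softmax is taken over the cosine similarities between each patch
    embedding and the two text embeddings (normal, abnormal).
    """
    patches = l2_normalize(patches)
    text = l2_normalize(np.stack([normal_text, abnormal_text]))
    logits = patches @ text.T / temperature           # shape (N, 2)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs = probs / probs.sum(axis=1, keepdims=True)
    return probs[:, 1]


def few_shot_branch_scores(patches, memory):
    """Per-patch dissimilarity to the closest normal reference patch.

    `memory` holds patch embeddings from the few normal images
    (hypothetical few-shot branch). Scores lie in [0, 1].
    """
    sims = l2_normalize(patches) @ l2_normalize(memory).T  # shape (N, M)
    return (1.0 - sims.max(axis=1)) / 2.0


def fuse(text_scores, shot_scores, alpha=0.5):
    """Blend both branches; a fixed alpha is an assumption, not the paper's DMM."""
    return alpha * text_scores + (1.0 - alpha) * shot_scores
```

With a patch aligned to the normal text embedding and another aligned to the abnormal one, the fused score of the second patch should be the higher of the two; an image-level score could then be taken as the maximum over patches.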
Pages: 12
Related papers
50 in total
  • [1] Fine-Grained Prototypes Distillation for Few-Shot Object Detection
    Wang, Zichen
    Yang, Bo
    Yue, Haonan
    Ma, Zhenghao
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 6, 2024, : 5859 - 5866
  • [2] Few-shot Visual Learning with Contextual Memory and Fine-grained Calibration
    Ma, Yuqing
    Liu, Wei
    Bai, Shihao
    Zhang, Qingyu
    Liu, Aishan
    Chen, Weimin
    Liu, Xianglong
    PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, : 811 - 817
  • [3] A few-shot fine-grained image recognition method
    Wang, Jianwei
    Chen, Deyun
    BULLETIN OF THE POLISH ACADEMY OF SCIENCES-TECHNICAL SCIENCES, 2023, 71 (01)
  • [4] Feature alignment via mutual mapping for few-shot fine-grained visual classification
    Wu, Qin
    Song, Tingting
    Fan, Shengnan
    Chen, Zeda
    Jin, Kelei
    Zhou, Haojie
    IMAGE AND VISION COMPUTING, 2024, 147
  • [5] Robust Saliency-Aware Distillation for Few-Shot Fine-Grained Visual Recognition
    Liu, Haiqi
    Chen, C. L. Philip
    Gong, Xinrong
    Zhang, Tong
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 7529 - 7542
  • [6] Variational Feature Imitation Conditioned on Visual Descriptions for Few-Shot Fine-Grained Recognition
    Lu, Xin
    Pan, Yixuan
    Cao, Yichao
    Zhou, Xin
    Lu, Xiaobo
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2025, 35 (03) : 2215 - 2229
  • [7] Few-Shot Fine-Grained Image Classification via GNN
    Zhou, Xiangyu
    Zhang, Yuhui
    Wei, Qianru
    SENSORS, 2022, 22 (19)
  • [8] Variational Feature Disentangling for Fine-Grained Few-Shot Classification
    Xu, Jingyi
    Le, Hieu
    Huang, Mingzhen
    Athar, ShahRukh
    Samaras, Dimitris
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 8792 - 8801
  • [9] Few-Shot Fine-Grained Image Classification: A Comprehensive Review
    Ren, Jie
    Li, Changmiao
    An, Yaohui
    Zhang, Weichuan
    Sun, Changming
    AI, 2024, 5 (01) : 405 - 425
  • [10] Structural Subspace Learning for Few-shot Fine-grained Recognition
    Li, Linjia
    Deng, Jin
    Huang, Ying
    Chen, Yanyan
    Luo, Wei
    2024 16TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND COMPUTING, ICMLC 2024, 2024, : 693 - 699