Reconsidering learnable fine-grained text prompts for few-shot anomaly detection in visual-language models

Cited by: 0
Authors
Han, Delong [1, 2]
Xu, Luo [1, 2]
Zhou, Mingle [1, 2]
Wan, Jin [1, 2]
Li, Min [1, 2]
Li, Gang [1, 2]
Affiliations
[1] Qilu Univ Technol, Key Lab Comp Power Network & Informat Secur, Minist Educ, Shandong Comp Sci Ctr, Natl Supercomp Ctr Jinan, Shandong Acad Sci, Jinan 250353, Shandong, Peoples R China
[2] Shandong Fundamental Res Ctr Comp Sci, Shandong Prov Key Lab Comp Power Internet & Serv C, Jinan 250014, Peoples R China
Keywords
Industrial anomaly detection; Fine-grained text prompts; Few-shot anomaly detection; Pre-trained visual-language models
DOI
10.1016/j.neunet.2024.106906
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Few-Shot Anomaly Detection (FSAD) in industrial images aims to identify abnormalities using only a few normal images, which is crucial for industrial scenarios where training samples are limited. Recent advances in large-scale pre-trained visual-language models have significantly improved FSAD, although these approaches typically require hundreds of text prompts to be manually crafted through prompt engineering. Manually designed text prompts, however, cannot accurately match the informative features of different categories across diverse images, and the domain gap between training and test datasets can severely degrade the generalization capability of the text prompts. To address these issues, we propose a visual-language model based on fine-grained learnable text prompts as a unified general framework for industrial FSAD. First, we design a Fine-grained Text Prompts Adapter (FTPA) and an associated registration loss to enhance the efficiency of text prompts. The manually designed text prompts are refined by capturing normal and abnormal semantic information in the image, so that the prompts describe image semantics at a finer granularity. In addition, we introduce a Dynamic Modulation Mechanism (DMM) to avoid potential errors in the trained text prompts that arise because the target data are unknown during cross-dataset detection. This is achieved by explicitly modulating the branch guided by few-shot images and the branch guided by fine-grained text prompts. Extensive experiments demonstrate that the proposed method achieves state-of-the-art few-shot industrial anomaly detection and segmentation performance. In the 4-shot setting, the AUROC for anomaly classification and anomaly segmentation reaches 98.3% and 96.3% on MVTec-AD, and 93.8% and 97.9% on VisA, respectively.
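The abstract describes two scoring branches, one guided by text prompts and one guided by few-shot normal images, that are modulated against each other. The following is only a minimal sketch of that general two-branch idea, not the paper's FTPA or DMM: the function names, the softmax temperature, the cosine-similarity memory bank, and the fixed blending weight `alpha` are all assumptions introduced here for illustration, and the features are plain NumPy arrays rather than outputs of a real visual-language model.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def text_branch_scores(patches, normal_text, abnormal_text, temperature=0.07):
    """Hypothetical text-guided branch.

    Returns, for each patch feature, the softmax probability assigned to the
    "abnormal" text embedding versus the "normal" one.
    """
    p = l2_normalize(np.asarray(patches, dtype=float))
    t = l2_normalize(np.stack([normal_text, abnormal_text]).astype(float))
    logits = p @ t.T / temperature                  # shape (num_patches, 2)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs = probs / probs.sum(axis=1, keepdims=True)
    return probs[:, 1]


def few_shot_branch_scores(patches, memory):
    """Hypothetical few-shot branch.

    Scores each patch by its distance to the closest normal reference patch:
    (1 - max cosine similarity) / 2, which lies in [0, 1].
    """
    sims = l2_normalize(np.asarray(patches, dtype=float)) @ l2_normalize(
        np.asarray(memory, dtype=float)
    ).T
    return (1.0 - sims.max(axis=1)) / 2.0


def fuse(text_scores, shot_scores, alpha=0.5):
    """Blend the two branches with a fixed weight (an assumed, non-learned rule)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return alpha * np.asarray(text_scores) + (1.0 - alpha) * np.asarray(shot_scores)
```

In this sketch `alpha` is a constant chosen by the user; the abstract indicates that the actual mechanism modulates the branches dynamically, and the details of that are not given in this record.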
Pages: 12