Latent Embeddings for Zero-shot Classification

Cited by: 473
Authors
Xian, Yongqin [1]
Akata, Zeynep [1]
Sharma, Gaurav [1, 2, 4]
Nguyen, Quynh [3]
Hein, Matthias [3]
Schiele, Bernt [1]
Affiliations
[1] MPI Informat, Saarbrucken, Germany
[2] IIT Kanpur, Kanpur, Uttar Pradesh, India
[3] Saarland Univ, Saarbrucken, Germany
[4] Indian Inst Technol Kanpur, CSE, Kanpur, Uttar Pradesh, India
DOI
10.1109/CVPR.2016.15
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We present a novel latent embedding model for learning a compatibility function between image and class embeddings, in the context of zero-shot classification. The proposed method augments the state-of-the-art bilinear compatibility model by incorporating latent variables. Instead of learning a single bilinear map, it learns a collection of maps, with the choice of which map to use being a latent variable for the current image-class pair. We train the model with a ranking-based objective function that penalizes incorrect rankings of the true class for a given image. We empirically demonstrate that our model consistently improves on the state of the art for various class embeddings on three challenging, publicly available datasets in the zero-shot setting. Moreover, our method leads to visually highly interpretable results, with clear clusters of different fine-grained object properties that correspond to different latent variable maps.
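The model described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state: the latent map is chosen by taking the maximum score over the K bilinear maps, the ranking objective is a plain hinge loss with a fixed margin, and all function and parameter names (`latent_compatibility`, `predict`, `ranking_loss`, `margin`) are hypothetical.

```python
import numpy as np


def latent_compatibility(x, y, maps):
    """Score an image/class pair with a collection of bilinear maps.

    x:    (d_img,) image embedding
    y:    (d_cls,) class embedding
    maps: (K, d_img, d_cls) stack of K bilinear maps
    Assumption: the latent variable is resolved by picking the map
    with the highest score. Returns (score, index of the chosen map).
    """
    scores = np.einsum("i,kij,j->k", x, maps, y)  # x^T W_k y for every k
    best = int(np.argmax(scores))
    return float(scores[best]), best


def predict(x, class_embeddings, maps):
    """Return the index of the class with the highest compatibility."""
    scores = [latent_compatibility(x, y, maps)[0] for y in class_embeddings]
    return int(np.argmax(scores))


def ranking_loss(x, true_idx, class_embeddings, maps, margin=1.0):
    """Hinge-style ranking loss (an assumed form of the objective).

    Every wrong class whose score comes within `margin` of the true
    class's score contributes a penalty.
    """
    scores = np.array(
        [latent_compatibility(x, y, maps)[0] for y in class_embeddings]
    )
    violations = np.maximum(0.0, margin + scores - scores[true_idx])
    violations[true_idx] = 0.0  # the true class never penalizes itself
    return float(violations.sum())
```

In the zero-shot setting, `class_embeddings` at test time would hold embeddings of classes unseen during training, so `predict` needs no labeled images from those classes; only the maps learned on the seen classes are reused.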
Pages: 69-77
Page count: 9
Related papers
50 in total
  • [1] Gaze Embeddings for Zero-Shot Image Classification
    Karessli, Nour
    Akata, Zeynep
    Schiele, Bernt
    Bulling, Andreas
    [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 6412 - 6421
  • [2] Zero-Shot Audio Classification Via Semantic Embeddings
    Xie, Huang
    Virtanen, Tuomas
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, 29 : 1233 - 1242
  • [3] Zero-Shot Audio Classification using Image Embeddings
    Dogan, Duygu
    Xie, Huang
    Heittola, Toni
    Virtanen, Tuomas
    [J]. 2022 30TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2022), 2022, : 1 - 5
  • [4] ZERO-SHOT AUDIO CLASSIFICATION BASED ON CLASS LABEL EMBEDDINGS
    Xie, Huang
    Virtanen, Tuomas
    [J]. 2019 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS (WASPAA), 2019, : 264 - 267
  • [5] Learning Discriminative Latent Attributes for Zero-Shot Classification
    Jiang, Huajie
    Wang, Ruiping
    Shan, Shiguang
    Yang, Yi
    Chen, Xilin
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, : 4233 - 4242
  • [6] Discriminative Latent Visual Space For Zero-Shot Object Classification
    Roy, Abhinaba
    Banerjee, Biplab
    Murino, Vittorio
    [J]. 2018 24TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2018, : 2552 - 2557
  • [7] Visual Context Embeddings for Zero-Shot Recognition
    Cho, Gunhee
    Choi, Yong Suk
    [J]. 37TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, 2022, : 1039 - 1047
  • [8] Spatial-Aware Object Embeddings for Zero-Shot Localization and Classification of Actions
    Mettes, Pascal
    Snoek, Cees G. M.
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, : 4453 - 4462
  • [9] Learning Cross-Aligned Latent Embeddings for Zero-Shot Cross-Modal Retrieval
    Lin, Kaiyi
    Xu, Xing
    Gao, Lianli
    Wang, Zheng
    Shen, Heng Tao
    [J]. THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 11515 - 11522
  • [10] Semantic embeddings of generic objects for zero-shot learning
    Hascoet, Tristan
    Ariki, Yasuo
    Takiguchi, Tetsuya
    [J]. EURASIP JOURNAL ON IMAGE AND VIDEO PROCESSING, 2019, 2019 (1)