Hyperbolic Vision Transformers: Combining Improvements in Metric Learning

Cited by: 27
Authors
Ermolov, Aleksandr [1]
Mirvakhabova, Leyla [2]
Khrulkov, Valentin [2, 3]
Sebe, Nicu [1]
Oseledets, Ivan [2, 4]
Affiliations
[1] Univ Trento, Trento, Italy
[2] Skolkovo Inst Sci & Technol, Moscow, Russia
[3] Yandex, Moscow, Russia
[4] AIRI, Moscow, Russia
DOI
10.1109/CVPR52688.2022.00726
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Metric learning aims to learn a highly discriminative model encouraging the embeddings of similar classes to be close in the chosen metrics and pushed apart for dissimilar ones. The common recipe is to use an encoder to extract embeddings and a distance-based loss function to match the representations - usually, the Euclidean distance is utilized. An emerging interest in learning hyperbolic data embeddings suggests that hyperbolic geometry can be beneficial for natural data. Following this line of work, we propose a new hyperbolic-based model for metric learning. At the core of our method is a vision transformer with output embeddings mapped to hyperbolic space. These embeddings are directly optimized using modified pairwise cross-entropy loss. We evaluate the proposed model with six different formulations on four datasets achieving the new state-of-the-art performance. The source code is available at https://github.com/htdt/hyp_metric.
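The abstract's central step, mapping encoder outputs into hyperbolic space and comparing them by hyperbolic distance, can be illustrated with a minimal sketch. The abstract does not say which model of hyperbolic space is used, so the code below assumes the Poincaré ball with curvature -c and the exponential map at the origin. The function names `expmap0` and `poincare_distance` and the parameter `c` are hypothetical and are not taken from the authors' repository.

```python
import math


def expmap0(v, c=1.0):
    """Map a Euclidean vector v into the Poincare ball of curvature -c.

    Assumption: exponential map at the origin,
    exp_0(v) = tanh(sqrt(c) * ||v||) * v / (sqrt(c) * ||v||).
    """
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return list(v)  # the origin maps to itself
    scale = math.tanh(math.sqrt(c) * norm) / (math.sqrt(c) * norm)
    return [scale * x for x in v]


def poincare_distance(x, y, c=1.0):
    """Geodesic distance between two points inside the Poincare ball."""
    def sq_norm(u):
        return sum(a * a for a in u)

    diff = [a - b for a, b in zip(x, y)]
    arg = 1.0 + 2.0 * c * sq_norm(diff) / (
        (1.0 - c * sq_norm(x)) * (1.0 - c * sq_norm(y))
    )
    return math.acosh(arg) / math.sqrt(c)


# Usage: a stand-in for an encoder output, mapped into the ball.
embedding = expmap0([0.3, 0.4])
origin = [0.0, 0.0]
d = poincare_distance(origin, embedding)
```

In a metric learning setup, a distance such as this would take the place of the Euclidean distance inside the loss; the pairwise cross-entropy loss itself is not sketched here because the abstract gives no details of its modification.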
Pages: 7399-7409
Number of pages: 11