Improving metric-based few-shot learning with dynamically scaled softmax loss

Cited by: 4
Authors
Zhang, Yu [1 ]
Zuo, Xin [1 ]
Zheng, Xuxu [2 ]
Gao, Xiaoyong [1 ]
Wang, Bo [3 ,4 ,7 ]
Hu, Weiming [3 ,5 ,6 ]
Affiliations
[1] China Univ Petr, Beijing 102249, Peoples R China
[2] Chinese Acad Sci, Data Intelligence Syst Res Ctr, Inst Comp Technol, Beijing 100190, Peoples R China
[3] Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100190, Peoples R China
[4] Peking Univ, Sch Software & Microelect, Beijing 100871, Peoples R China
[5] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100190, Peoples R China
[6] ShanghaiTech Univ, Sch Informat Sci & Technol, Shanghai 201210, Peoples R China
[7] Chinese Acad Sci, Natl Lab Pattern Recognit, Inst Automat, 95 Zhongguancun East Rd, Beijing 100190, Peoples R China
Funding
Beijing Natural Science Foundation;
Keywords
Few-shot learning; Metric-based learning framework; Softmax loss improvement; Alignment;
DOI
10.1016/j.imavis.2023.104860
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The metric-based learning framework has been widely used in data-scarce few-shot visual classification. However, the current loss function limits the effectiveness of metric learning. One issue is that the nearest-neighbor classification scheme greatly narrows the range of similarity values between queries and class prototypes, which limits the guiding ability of the loss function. The other is that the episode-based training setting randomizes the class combination in each iteration, which weakens the ability of traditional softmax losses to learn effectively from episodes with varying data distributions. To solve these problems, we first review several variants of the softmax loss from a unified perspective, and then propose a novel Dynamically Scaled Softmax Loss (DSSL). By adding a probability regulator (for scaling probabilities) and a loss regulator (for scaling losses), the loss function adaptively adjusts the prediction distribution and the training weights of the samples, forcing the model to focus on more informative samples. Finally, we show that the proposed DSSL strategy enables few-shot classifiers to achieve competitive results on four generic benchmarks and one fine-grained benchmark, demonstrating its effectiveness in improving the distinguishability (for base classes) and generalizability (for novel classes) of the learned feature space.
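The abstract names the two regulators but does not give their exact forms. As a rough, hypothetical sketch (not the paper's actual DSSL implementation), the PyTorch snippet below assumes the probability regulator acts as a temperature `tau` applied to cosine similarities before the softmax, and the loss regulator as a focal-style weight with exponent `gamma` that down-weights easy queries; the function name, parameters, and defaults are all illustrative.

```python
import torch
import torch.nn.functional as F

def scaled_softmax_loss(query_embs, prototypes, targets, tau=10.0, gamma=2.0):
    """Hypothetical sketch of a dynamically scaled softmax loss for
    metric-based few-shot episodes (illustrative, not the paper's DSSL).

    query_embs: (Q, D) query embeddings
    prototypes: (C, D) episode class prototypes
    targets:    (Q,)   ground-truth class indices within the episode
    """
    # Cosine similarities between queries and prototypes lie in [-1, 1],
    # the narrow value range the abstract identifies as weakening the loss.
    sims = F.normalize(query_embs, dim=-1) @ F.normalize(prototypes, dim=-1).T

    # Probability regulator (assumed here to be a temperature): rescales
    # logits so the prediction distribution is sharpened before softmax.
    log_probs = F.log_softmax(tau * sims, dim=-1)
    log_pt = log_probs.gather(1, targets.unsqueeze(1)).squeeze(1)

    # Loss regulator (assumed here to be a focal-style weight): scales each
    # query's loss so harder, more informative samples dominate training.
    weights = (1.0 - log_pt.exp()) ** gamma

    return -(weights * log_pt).mean()
```

For instance, in a 5-way episode with 15 queries per class, `prototypes` would be the per-class mean of the support embeddings and `targets` the episode-local labels 0 through 4.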
Pages: 15