Human motion similarity evaluation based on deep metric learning

Cited by: 0

Authors:

Yidan Zhang [1]

Lei Nie [1]

Affiliations:

[1] Beihua University, College of Sports

Source:

Scientific Reports, Vol. 14, No. 1

Keywords:

Deep metric learning; Human motion similarity evaluation; Automatic encoder-decoder network; Human skeleton structure information; Dynamic time warping algorithm

DOI:

10.1038/s41598-024-81762-8

Abstract:

To eliminate the influence of camera viewpoint and human skeleton differences on action similarity evaluation, and to address the evaluation of human action similarity across different viewpoints, this article proposes a method based on deep metric learning. The method trains an automatic encoder-decoder (autoencoder) deep neural network on a self-built synthetic dataset. The network maps 2D human skeletal keypoint sequences extracted from motion videos into three low-dimensional dense latent spaces. Action feature vectors that are independent of camera viewpoint and skeleton structure are extracted in these spaces, and motion similarity is measured on those features, which removes the effect of camera viewpoint and skeleton size differences on the evaluation. Specifically, when the encoder-decoder network extracts the action feature vectors, a sliding window divides the keypoint sequence of each limb part into sequence patches, so that viewpoint- and skeleton-independent action features are extracted over smaller time units and the resulting similarity evaluation is finer grained. In addition, the dynamic time warping (DTW) algorithm temporally aligns the sequences of action feature vectors, which resolves the time-axis discrepancies that arise when similarity is measured on those vectors. The loss function combines three components (cross-reconstruction loss, reconstruction loss, and triplet loss), which yields more accurate and reliable similarity evaluation results. Finally, the algorithm is evaluated on a self-built dataset. The experimental results show that the method effectively eliminates the influence of differences in camera viewpoint and skeleton size on action similarity evaluation, and that it produces similarity evaluations that are more reliable and closer to human subjective perception for actions captured from different viewpoints or performed by subjects with different skeleton sizes.
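
The abstract names three generic building blocks: sliding-window patching, DTW alignment of feature-vector sequences, and a triplet loss. The sketch below illustrates each in plain NumPy. It is not the authors' implementation: the function names, the window size and stride, the margin value, and the use of cosine distance between feature vectors are all assumptions made for illustration, and the learned encoder is left out entirely.

```python
import numpy as np


def sliding_windows(sequence, size, stride):
    """Split a sequence into overlapping patches (size and stride are assumed values)."""
    return [sequence[i:i + size] for i in range(0, len(sequence) - size + 1, stride)]


def cosine_distance(a, b):
    """1 - cosine similarity; chosen here as an assumption, the abstract does not name a distance."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        return 1.0
    return 1.0 - float(np.dot(a, b) / denom)


def dtw_distance(xs, ys):
    """Classic dynamic time warping cost between two sequences of feature vectors."""
    n, m = len(xs), len(ys)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = cosine_distance(xs[i - 1], ys[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])


def triplet_loss(d_pos, d_neg, margin=0.3):
    """Hinge-style triplet loss on anchor-positive and anchor-negative distances."""
    return max(0.0, d_pos - d_neg + margin)


if __name__ == "__main__":
    # Toy feature sequences: `slow` repeats frames of `fast`, so DTW should align them at ~0 cost.
    fast = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    slow = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 1.0]])
    print(dtw_distance(fast, slow))
    print(len(sliding_windows(list(range(10)), size=4, stride=2)))
```

In this toy case the two sequences differ only in timing, which is the situation the abstract says DTW is meant to handle; in the described method the vectors being aligned would be the per-patch action features produced by the encoder rather than hand-written arrays.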


