Human motion similarity evaluation based on deep metric learning

Cited by: 0

Authors:

Yidan Zhang [1]

Lei Nie [1]

Affiliation:

[1] Beihua University, College of Sports

Source:

Scientific Reports, Vol. 14, Issue 1

Keywords:

Deep metric learning; Human motion similarity evaluation; Automatic encoder-decoder network; Human skeleton structure information; Dynamic time warping algorithm;

DOI:

10.1038/s41598-024-81762-8

Abstract:

To eliminate the influence of camera viewpoint and human skeleton differences on action similarity evaluation, and to address the evaluation of human action similarity across different viewpoints, this article proposes a method based on deep metric learning. The method trains an automatic encoder-decoder deep neural network model on a self-built synthetic dataset; the model maps 2D human skeletal key point sequences extracted from motion videos into three low-dimensional dense latent spaces. Action feature vectors that are independent of camera viewpoint and human skeleton structure are extracted in these spaces, and motion similarity is measured on these features, which removes the effects of camera viewpoint and skeleton size differences on the evaluation. Specifically, when extracting action feature vectors with the encoder-decoder model, a sliding window divides the key point sequence of each limb part into sequence patches, so that viewpoint- and skeleton-independent feature vectors are extracted over smaller time units and yield a finer-grained similarity evaluation. In addition, the dynamic time warping (DTW) algorithm temporally aligns the sequences of action feature vectors, which resolves the time-axis discrepancies that arise when similarity is measured on these vectors. The loss function combines three components (cross-reconstruction loss, reconstruction loss, and triplet loss), which yields more accurate and reliable evaluation results. Finally, the algorithm is evaluated on a self-built dataset. The experimental results show that the method effectively eliminates the influence of camera viewpoint and skeleton size differences on action similarity evaluation, and that for human actions captured from different viewpoints or with different skeleton sizes it produces similarity evaluations that are more reliable and closer to human subjective perception.
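
The windowing and alignment steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, window size, stride, and the cosine distance are assumptions made for illustration, and `embed_patch` is a hypothetical stand-in for the trained encoder: it only flattens frame-to-frame displacements and is not viewpoint- or skeleton-invariant.

```python
import math


def sliding_windows(frames, size, stride):
    """Split a frame sequence into overlapping patches of `size` frames."""
    return [frames[i:i + size] for i in range(0, len(frames) - size + 1, stride)]


def embed_patch(patch):
    """Hypothetical stand-in for the trained encoder (assumption):
    flattened frame-to-frame displacements of the key point coordinates."""
    return [b - a for prev, cur in zip(patch, patch[1:]) for a, b in zip(prev, cur)]


def cosine_distance(u, v):
    """1 - cosine similarity; treats a zero vector as maximally distant."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def dtw(xs, ys, dist=cosine_distance):
    """Classic O(n*m) dynamic time warping.
    Returns (accumulated cost, warping path as index pairs)."""
    n, m = len(xs), len(ys)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(xs[i - 1], ys[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the end of both sequences to recover the alignment.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


def motion_similarity(seq_a, seq_b, size=4, stride=2):
    """Average similarity of DTW-aligned patch features.
    Assumes both sequences have at least `size` frames."""
    feats_a = [embed_patch(p) for p in sliding_windows(seq_a, size, stride)]
    feats_b = [embed_patch(p) for p in sliding_windows(seq_b, size, stride)]
    total, path = dtw(feats_a, feats_b)
    return 1.0 - total / len(path)
```

Here each frame is a flat list of 2D key point coordinates, for example `[x0, y0, x1, y1, ...]`. Calling `motion_similarity(seq_a, seq_b)` returns a score that approaches 1.0 when the aligned patch features point in the same direction. In the method described above, the features would come from the trained network rather than from `embed_patch`.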
