Human motion similarity evaluation based on deep metric learning

Authors
Yidan Zhang [1]
Lei Nie [1]
Affiliations
[1] Beihua University, College of Sports
Keywords
Deep metric learning; Human motion similarity evaluation; Automatic encoder-decoder network; Human skeleton structure information; Dynamic time warping algorithm
DOI
10.1038/s41598-024-81762-8
Abstract
This article proposes a deep metric learning method for evaluating human action similarity across viewpoints, designed to remove the influence of camera viewpoint and human skeleton differences on the evaluation. The method trains an automatic encoder-decoder deep neural network on a homemade synthetic dataset; the network maps 2D human skeletal key point sequences extracted from motion videos into three latent low-dimensional dense spaces. Action feature vectors independent of camera viewpoint and skeleton structure are extracted in these spaces, and motion similarity is measured on those features, which removes the effect of camera viewpoint and skeleton size differences on the evaluation. Specifically, when extracting the action feature vectors with the encoder-decoder network, a sliding window divides the key point sequence of each limb part into sequence patches, so that viewpoint- and skeleton-independent action feature vectors are extracted over smaller time units and yield a finer-grained similarity evaluation. In addition, the dynamic time warping (DTW) algorithm temporally aligns the sequences of action feature vectors, resolving the time-axis discrepancies that arise when similarity is measured on those vectors. The network is trained with a loss function of three components, namely cross-reconstruction loss, reconstruction loss and triplet loss, which gives more accurate and reliable similarity evaluation results. Finally, the algorithm is evaluated on a homemade dataset, and the experimental results show that the method effectively eliminates the influence of differences in camera viewpoint and skeleton size on action similarity evaluation and produces results that are more reliable and closer to human subjective perception for actions captured from different viewpoints or with varying skeleton sizes.
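The abstract names three building blocks that can be illustrated without the trained network: sliding-window patching, dynamic time warping over feature-vector sequences, and a triplet loss. The following is a minimal sketch under stated assumptions, not the authors' implementation. The function names, the cosine distance, the window size and stride, and the margin value are all hypothetical choices; the abstract does not specify them, and the feature vectors here are toy stand-ins for encoder outputs.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def sliding_windows(frames: Sequence, size: int, stride: int) -> List[list]:
    """Split a frame sequence into patches of `size` frames, `stride` apart."""
    if size <= 0 or stride <= 0:
        raise ValueError("size and stride must be positive")
    return [list(frames[i:i + size])
            for i in range(0, len(frames) - size + 1, stride)]


def cosine_distance(a: Vector, b: Vector) -> float:
    """1 - cosine similarity (assumed metric; the abstract names none)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # treat a zero vector as maximally dissimilar
    return 1.0 - dot / (norm_a * norm_b)


def dtw_distance(seq_a: Sequence[Vector], seq_b: Sequence[Vector],
                 dist: Callable[[Vector, Vector], float] = cosine_distance) -> float:
    """Classic O(n*m) dynamic time warping cost between two vector sequences."""
    n, m = len(seq_a), len(seq_b)
    if n == 0 or m == 0:
        raise ValueError("sequences must be non-empty")
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = dist(seq_a[i - 1], seq_b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = step + min(cost[i - 1][j],      # repeat element of seq_b
                                    cost[i][j - 1],      # repeat element of seq_a
                                    cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def triplet_loss(anchor: Vector, positive: Vector, negative: Vector,
                 margin: float = 0.3,
                 dist: Callable[[Vector, Vector], float] = cosine_distance) -> float:
    """Hinge-style triplet loss; the margin of 0.3 is an illustrative value."""
    return max(0.0, dist(anchor, positive) - dist(anchor, negative) + margin)


if __name__ == "__main__":
    # Toy feature sequences: the second is a time-stretched copy of the first.
    fast = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    slow = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print("patches:", sliding_windows(list(range(5)), size=3, stride=1))
    print("DTW cost:", dtw_distance(fast, slow))
    print("triplet loss:", triplet_loss([1.0, 0.0], [1.0, 0.0], [0.0, 1.0]))
```

In this toy case the DTW cost is close to zero because warping absorbs the repeated frame, which is the time-axis discrepancy the abstract says DTW is used to resolve. In the described method the per-patch vectors would come from the encoder rather than being written by hand.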