Learning shared embedding representation of motion and text using contrastive learning

Authors
Junpei Horie
Wataru Noguchi
Hiroyuki Iizuka
Masahito Yamamoto
Affiliations
[1] Hokkaido University, Graduate School of Information Science and Technology
[2] Hokkaido University, Faculty of Information Science and Technology
[3] Hokkaido University, Center for Human Nature, Artificial Intelligence, and Neuroscience
Source
Keywords
Multi-modal learning; Contrastive learning; Skeleton-based action recognition; Motion retrieval
DOI
Not available
Chinese Library Classification number
Subject classification code
Abstract
Multimodal learning of motion and text seeks correspondences between skeletal time-series data acquired by motion capture and the text that describes the motion. Good associations in this field enable both motion-to-text and text-to-motion applications. However, previous methods failed to associate motion with text in a way that accounts for details of the descriptions, for example, whether the left or the right arm moves. In this paper, we propose a motion-text contrastive learning method that establishes correspondences between motion and text in a shared embedding space. We show that our model outperforms previous studies on the task of action recognition. We also show qualitatively that, by using a pre-trained text encoder, our model can perform motion retrieval with detailed correspondences between motion and text.
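The abstract does not state the exact loss the authors use. As an illustration only, the sketch below shows a symmetric InfoNCE-style objective of the kind commonly used to align two modalities in a shared embedding space, together with nearest-neighbour retrieval. The function names, the temperature of 0.07, and the toy 2-D vectors standing in for motion-encoder and text-encoder outputs are all assumptions, not details taken from the paper.

```python
import math

def l2_normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def similarity_matrix(motion_emb, text_emb, temperature=0.07):
    """Cosine similarities between every motion and every text, divided by temperature."""
    m = [l2_normalize(v) for v in motion_emb]
    t = [l2_normalize(v) for v in text_emb]
    return [[sum(a * b for a, b in zip(mi, tj)) / temperature for tj in t] for mi in m]

def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct column for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        mx = max(row)  # subtract the max for numerical stability
        log_sum = mx + math.log(sum(math.exp(x - mx) for x in row))
        total += log_sum - row[i]
    return total / len(logits)

def symmetric_contrastive_loss(motion_emb, text_emb, temperature=0.07):
    """Average of the motion-to-text and text-to-motion losses (assumed form)."""
    logits = similarity_matrix(motion_emb, text_emb, temperature)
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(transposed))

def retrieve(query_emb, candidate_embs):
    """Index of the candidate with the highest cosine similarity to the query."""
    q = l2_normalize(query_emb)
    scores = [sum(a * b for a, b in zip(q, l2_normalize(c))) for c in candidate_embs]
    return max(range(len(scores)), key=scores.__getitem__)

# Toy batch: pair i of motion and text are meant to match (hypothetical values).
motion = [[1.0, 0.0], [0.0, 1.0]]
text = [[0.9, 0.1], [0.1, 0.9]]

aligned_loss = symmetric_contrastive_loss(motion, text)
mismatched_loss = symmetric_contrastive_loss(motion, text[::-1])
best_motion_for_text_1 = retrieve(text[1], motion)
```

In this toy batch the correctly paired embeddings give a lower loss than the swapped ones, and the text query retrieves the motion it was paired with. A real system would replace the hand-written vectors with the outputs of trained motion and text encoders.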
Pages: 148-157
Number of pages: 9