Diffusion-Based Unsupervised Pre-training for Automated Recognition of Vitality Forms

Authors
Canovi, Noemi [1]
Montagna, Federico [1]
Niewiadomski, Radoslaw [2]
Sciutti, Alessandra [3]
Di Cesare, Giuseppe [3, 4]
Beyan, Cigdem [5]
Affiliations
[1] Univ Trento, Dep Informat Engn & Comp Sci, Trento, Italy
[2] Univ Genoa, Dept Informat Bioengn Robot & Syst Engn, Genoa, Italy
[3] Ist Italiano Tecnol, CONTACT Unit, Genoa, Italy
[4] Univ Parma, Dept Med & Surg, Parma, Italy
[5] Univ Verona, Dept Comp Sci, Verona, Italy
Funding
European Research Council
Keywords
Vitality forms; nonverbal communication; unsupervised pre-training; diffusion models; autoencoders; gestures; actions; trajectory; EXPRESSION
DOI
10.1145/3656650.3656689
Chinese Library Classification (CLC)
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Social communication involves interpreting nonverbal behaviors, detecting and anticipating others' actions and intentions. Actions convey not only the goal and motor intention but also the form, i.e., variations in action execution. These variations, termed vitality forms, communicate attitudes during interactions, such as being gentle, calm, vigorous, and rude. Automatic vitality form recognition may have several applications in social robotics, social skills training, and therapy, yet it remains a rarely studied topic. This paper introduces an unsupervised pre-training approach that utilizes 2D-body key point trajectories as input and employs diffusion models to derive more effective features for representing these trajectories. The features learned from the diffusion model's encoder are utilized to train a multilayer perceptron for vitality form recognition. Experimental analysis showcases the superior performance of the proposed method not only across various videos but also for action classes not encountered during training.
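The abstract does not give the diffusion formulation, so the following is only a minimal sketch of the generic DDPM-style forward noising step applied to a 2D keypoint trajectory, not the authors' implementation. The linear noise schedule, the step count, the array shape (frames, keypoints, 2), and the function names `linear_beta_schedule` and `noise_trajectory` are all assumptions made for illustration.

```python
import numpy as np


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Assumed linear noise schedule (a common default, not taken from the paper)."""
    return np.linspace(beta_start, beta_end, num_steps)


def noise_trajectory(x0, t, alpha_bar, rng):
    """Sample x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps."""
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return x_t, eps


rng = np.random.default_rng(0)

# Hypothetical input: 30 frames, 17 body keypoints, (x, y) coordinates in [0, 1].
num_frames, num_keypoints = 30, 17
trajectory = rng.uniform(0.0, 1.0, size=(num_frames, num_keypoints, 2))

betas = linear_beta_schedule(1000)
alpha_bar = np.cumprod(1.0 - betas)  # cumulative product of (1 - beta)

# Noise the clean trajectory at an intermediate diffusion step.
noisy, eps = noise_trajectory(trajectory, t=500, alpha_bar=alpha_bar, rng=rng)

# In unsupervised pre-training, a denoising network would be trained to
# predict `eps` from `noisy` and `t`; no vitality-form labels are needed.
print(noisy.shape)
```

In a pipeline like the one described, the encoder of the trained denoising network would then supply the features on which a multilayer perceptron classifier is trained with the labeled data.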
Pages: 9