Aligning Actions and Walking to LLM-Generated Textual Descriptions

Cited by: 0
Authors
Chivereanu, Radu [1]
Cosma, Adrian [1]
Catruna, Andy [1]
Rughinis, Razvan [1]
Radoi, Emilian [1]
Affiliations
[1] National University of Science and Technology Politehnica Bucharest, Bucharest, Romania
DOI
10.1109/FG59268.2024.10581994
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Large Language Models (LLMs) have demonstrated remarkable capabilities in various domains, including data augmentation and synthetic data generation. This work explores the use of LLMs to generate rich textual descriptions for motion sequences, encompassing both actions and walking patterns. We leverage the expressive power of LLMs to align motion representations with high-level linguistic cues, addressing two distinct tasks: action recognition and retrieval of walking sequences based on appearance attributes. For action recognition, we employ LLMs to generate textual descriptions of actions in the BABEL-60 dataset, facilitating the alignment of motion sequences with linguistic representations. In the domain of gait analysis, we investigate the impact of appearance attributes on walking patterns by generating textual descriptions of motion sequences from the DenseGait dataset using LLMs. These descriptions capture subtle variations in walking styles influenced by factors such as clothing choices and footwear. Our approach demonstrates the potential of LLMs in augmenting structured motion attributes and aligning multi-modal representations. The findings contribute to the advancement of comprehensive motion understanding and open up new avenues for leveraging LLMs in multi-modal alignment and data augmentation for motion analysis. We make the code publicly available at https://github.com/Radu1999/WalkAndText
Pages: 7