Aligning Actions and Walking to LLM-Generated Textual Descriptions

Cited by: 0

Authors:

Chivereanu, Radu [1]

Cosma, Adrian [1]

Catruna, Andy [1]

Rughinis, Razvan [1]

Radoi, Emilian [1]

Affiliations:

[1] Natl Univ Sci & Technol Politehn Bucharest, Bucharest, Romania

Source:

2024 IEEE 18TH INTERNATIONAL CONFERENCE ON AUTOMATIC FACE AND GESTURE RECOGNITION, FG 2024 | 2024

Keywords:

DOI:

10.1109/FG59268.2024.10581994

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Large Language Models (LLMs) have demonstrated remarkable capabilities in various domains, including data augmentation and synthetic data generation. This work explores the use of LLMs to generate rich textual descriptions for motion sequences, encompassing both actions and walking patterns. We leverage the expressive power of LLMs to align motion representations with high-level linguistic cues, addressing two distinct tasks: action recognition and retrieval of walking sequences based on appearance attributes. For action recognition, we employ LLMs to generate textual descriptions of actions in the BABEL-60 dataset, facilitating the alignment of motion sequences with linguistic representations. In the domain of gait analysis, we investigate the impact of appearance attributes on walking patterns by generating textual descriptions of motion sequences from the DenseGait dataset using LLMs. These descriptions capture subtle variations in walking styles influenced by factors such as clothing choices and footwear. Our approach demonstrates the potential of LLMs in augmenting structured motion attributes and aligning multi-modal representations. The findings contribute to the advancement of comprehensive motion understanding and open up new avenues for leveraging LLMs in multi-modal alignment and data augmentation for motion analysis. We make the code publicly available at https://github.com/Radu1999/WalkAndText
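
The abstract does not say how motion sequences and textual descriptions are aligned. As a rough illustration only, the sketch below assumes a CLIP-style symmetric contrastive objective over paired motion and text embeddings, followed by text-to-motion retrieval by cosine similarity. The function names, the temperature value, and the toy 2-D embeddings are all hypothetical and are not taken from the paper or its repository.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def _diagonal_cross_entropy(logits):
    """Mean cross-entropy where row i's correct class is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_z = row_max + math.log(sum(math.exp(v - row_max) for v in row))
        total += log_z - row[i]
    return total / len(logits)


def symmetric_contrastive_loss(motion_embs, text_embs, temperature=0.07):
    """Assumed CLIP-style loss: motion_embs[i] is paired with text_embs[i]."""
    logits = [
        [cosine_similarity(m, t) / temperature for t in text_embs]
        for m in motion_embs
    ]
    transposed = [list(col) for col in zip(*logits)]
    motion_to_text = _diagonal_cross_entropy(logits)
    text_to_motion = _diagonal_cross_entropy(transposed)
    return 0.5 * (motion_to_text + text_to_motion)


def retrieve_motion(text_emb, motion_embs):
    """Index of the motion embedding most similar to the text query."""
    scores = [cosine_similarity(text_emb, m) for m in motion_embs]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy embeddings (hypothetical): two motions and their two descriptions.
motions = [[1.0, 0.0], [0.0, 1.0]]
texts = [[0.9, 0.1], [0.1, 0.9]]

aligned_loss = symmetric_contrastive_loss(motions, texts)
shuffled_loss = symmetric_contrastive_loss(motions, texts[::-1])
best_match = retrieve_motion(texts[1], motions)
```

With these toy vectors the correctly paired batch yields a lower loss than the batch with the descriptions swapped, and the second description retrieves the second motion. In a real system the embeddings would come from trained motion and text encoders rather than hand-written lists.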

Pages: 7
