CPoser: An Optimization-after-Parsing Approach for Text-to-Pose Generation Using Large Language Models

Cited by: 0
Authors
Li, Yumeng [1]
Chen, Bohong [1]
Ren, Zhong [1]
Ding, Yao-xiang [1]
Liu, Libin [2]
Shao, Tianjia [1]
Zhou, Kun [1]
Affiliations
[1] Zhejiang Univ, State Key Lab CAD&CG, Hangzhou, Peoples R China
[2] Peking Univ, State Key Lab Gen AI, Beijing, Peoples R China
Source
ACM TRANSACTIONS ON GRAPHICS | 2024 / Vol. 43 / Issue 06
Keywords
Human posture; text-to-pose generation; zero-shot learning; pose priors; large language models;
DOI
10.1145/3687932
Chinese Library Classification (CLC) number
TP31 [Computer Software];
Subject classification code
081202; 0835;
Abstract
Text-to-pose generation is challenging due to the complexity of natural language and human posture semantics. Utilizing large language models (LLMs) for text-to-pose generation is appealing due to their strong capabilities in text understanding and reasoning. However, as LLMs are designed for general-purpose language processing and not specifically trained for pose generation, it remains nontrivial to generate precise articulation targets for the full body using LLMs directly. To this end, we propose CPoser, a novel approach to harness the power of LLMs for text-to-pose generation, featuring a prompt parsing stage and a pose optimization stage. The parsing stage utilizes LLMs to turn text prompts into pose intermediate representations (Pose-IRs) through a set of predefined structured queries. These Pose-IRs explicitly describe specific pose conditions, such as squatting depth and knee bending angle, naturally forming an objective function that a target pose should satisfy. The optimization stage solves for expressive poses and hand gestures based on the Pose-IR objective function via robust optimization in a quantized pose prior space. The results are further refined to enhance naturalness and incorporate facial expressions. Experiments show that our approach effectively understands diverse text prompts for pose generation, surpassing existing text-to-pose methods.
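The idea that explicit pose conditions form an objective function, minimized over a quantized pose space, can be illustrated with a toy sketch. This is not the authors' implementation: the `PoseCondition` record, the feature names (`knee_bend_deg`, `squat_depth`), the Huber loss as the robust penalty, and the three-entry `codebook` standing in for a quantized pose prior are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical pose representation: a few named scalar features.
Pose = Dict[str, float]


@dataclass
class PoseCondition:
    """One illustrative Pose-IR entry: a feature should be near a target value."""
    feature: str
    target: float
    weight: float = 1.0
    scale: float = 1.0  # divides the residual so different units are comparable


def huber(residual: float, delta: float = 1.0) -> float:
    """Huber loss: quadratic near zero, linear for large residuals."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * a * a
    return delta * (a - 0.5 * delta)


def objective(pose: Pose, conditions: List[PoseCondition]) -> float:
    """Weighted sum of robust penalties over all conditions."""
    return sum(
        c.weight * huber((pose[c.feature] - c.target) / c.scale)
        for c in conditions
    )


def best_pose(codebook: List[Pose], conditions: List[PoseCondition]) -> Pose:
    """Exhaustive search over a discrete set of candidate poses."""
    return min(codebook, key=lambda p: objective(p, conditions))


# Assumed toy codebook standing in for a quantized pose prior.
codebook = [
    {"knee_bend_deg": 5.0, "squat_depth": 0.0},    # standing
    {"knee_bend_deg": 60.0, "squat_depth": 0.4},   # half squat
    {"knee_bend_deg": 120.0, "squat_depth": 0.9},  # deep squat
]

# Assumed conditions, as a parsing stage might emit for "squat down low".
conditions = [
    PoseCondition("knee_bend_deg", target=110.0, scale=30.0),
    PoseCondition("squat_depth", target=0.8, scale=0.2),
]

result = best_pose(codebook, conditions)
print(result)
```

With these made-up numbers the deep-squat entry has the lowest objective value and is selected. A real system would search a much larger learned pose space rather than enumerate three hand-written candidates.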
Pages: 13