CPoser: An Optimization-after-Parsing Approach for Text-to-Pose Generation Using Large Language Models

Authors
Li, Yumeng [1]
Chen, Bohong [1]
Ren, Zhong [1]
Ding, Yao-xiang [1]
Liu, Libin [2]
Shao, Tianjia [1]
Zhou, Kun [1]
Affiliations
[1] Zhejiang Univ, State Key Lab CAD&CG, Hangzhou, Peoples R China
[2] Peking Univ, State Key Lab Gen AI, Beijing, Peoples R China
Source
ACM TRANSACTIONS ON GRAPHICS | 2024 / Vol. 43 / No. 06
Keywords
Human posture; text-to-pose generation; zero-shot learning; pose priors; large language models;
DOI
10.1145/3687932
Chinese Library Classification
TP31 [Computer Software];
Discipline Classification Code
081202 ; 0835 ;
Abstract
Text-to-pose generation is challenging due to the complexity of natural language and human posture semantics. Utilizing large language models (LLMs) for text-to-pose generation is appealing due to their strong capabilities in text understanding and reasoning. However, as LLMs are designed for general-purpose language processing and not specifically trained for pose generation, it remains nontrivial to generate precise articulation targets for the full body using LLMs directly. To this end, we propose CPoser, a novel approach to harness the power of LLMs for text-to-pose generation, featuring a prompt parsing stage and a pose optimization stage. The parsing stage utilizes LLMs to turn text prompts into pose intermediate representations (Pose-IRs) through a set of predefined structured queries. These Pose-IRs explicitly describe specific pose conditions, such as squatting depth and knee bending angle, naturally forming an objective function that a target pose should satisfy. The optimization stage solves for expressive poses and hand gestures based on the Pose-IR objective function via robust optimization in a quantized pose prior space. The results are further refined to enhance naturalness and incorporate facial expressions. Experiments show that our approach effectively understands diverse text prompts for pose generation, surpassing existing text-to-pose methods.
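The two-stage idea in the abstract (pose conditions forming an objective, then a search over a quantized pose space) can be illustrated with a toy sketch. It is not the authors' implementation: the `PoseCondition` type, the feature names, the weights, the three-entry codebook, and the exhaustive search are all assumptions made for illustration, and the conditions are hand-written rather than produced by an LLM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PoseCondition:
    """One hypothetical Pose-IR entry: a named pose feature and its target value."""
    feature: str
    target: float
    weight: float = 1.0


def objective(pose: dict[str, float], conditions: list[PoseCondition]) -> float:
    """Weighted squared error between a candidate pose and the conditions."""
    return sum(c.weight * (pose[c.feature] - c.target) ** 2 for c in conditions)


def solve(codebook: list[dict[str, float]],
          conditions: list[PoseCondition]) -> dict[str, float]:
    """Return the codebook entry with the lowest objective (exhaustive search)."""
    return min(codebook, key=lambda pose: objective(pose, conditions))


# Toy codebook standing in for a quantized pose prior space (values are made up).
CODEBOOK = [
    {"knee_bend_deg": 0.0, "squat_depth": 0.0},    # standing
    {"knee_bend_deg": 45.0, "squat_depth": 0.3},   # half squat
    {"knee_bend_deg": 110.0, "squat_depth": 0.8},  # deep squat
]

# Conditions a parser might emit for "squat deeply"; hand-written here, no LLM call.
conditions = [
    PoseCondition("knee_bend_deg", 100.0, weight=0.01),
    PoseCondition("squat_depth", 0.9),
]

best = solve(CODEBOOK, conditions)
print(best)
```

With these made-up numbers the search selects the deep-squat entry, since it is closest to both targets. A real system would replace the exhaustive search with an optimizer over a learned prior; the sketch only shows how explicit conditions define a function to minimize.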
Pages: 13