{"paper":{"title":"Stylized Text-to-Motion Generation via Hypernetwork-Driven Low-Rank Adaptation","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"A hypernetwork maps style embeddings from reference motions to LoRA parameters that modulate a pretrained text-to-motion diffusion model at every denoising step.","cross_cats":["cs.AI","cs.GR","cs.LG"],"primary_cat":"cs.CV","authors_text":"Junhyuk Jeon, Junyong Noh, Seokhyeon Hong","submitted_at":"2026-05-13T10:51:54Z","abstract_excerpt":"Text-driven motion diffusion models are capable of generating realistic human motions, but text alone often struggles to express fine-level nuances of motion, commonly referred to as style. Recent approaches have tackled this challenge by attaching a style injection mechanism to a pretrained text-driven diffusion model. Existing stylization methods, however, either require style-specific fine-tuning of existing models or rely on heavy ControlNet-based architectures, limiting efficiency and generalization to unseen styles. We propose a lightweight style conditioning framework that dynamically m"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"We propose a lightweight style conditioning framework that dynamically modulates a pretrained diffusion model through hypernetwork-generated LoRA parameters.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"That a hypernetwork can map global style embeddings to effective low-rank updates applied at every denoising step while preserving text alignment and generalizing to unseen styles without post-hoc tuning or loss of motion quality.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"A hypernetwork maps style motion embeddings to LoRA updates that stylize text-driven motion diffusion models with improved generalization to unseen styles via contrastive structuring of the style space.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"A hypernetwork maps style embeddings from reference motions to LoRA parameters that modulate a pretrained text-to-motion diffusion model at every denoising step.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"84acc205979ade4b928145a9bf873fff7a5a80e44a2897c6d0ec4a4ee05960be"},"source":{"id":"2605.13333","kind":"arxiv","version":1},"verdict":{"id":"7de1bac9-0259-41a8-ac16-a7193fa3b2cf","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-14T19:27:46.503807Z","strongest_claim":"We propose a lightweight style conditioning framework that dynamically modulates a pretrained diffusion model through hypernetwork-generated LoRA parameters.","one_line_summary":"A hypernetwork maps style motion embeddings to LoRA updates that stylize text-driven motion diffusion models with improved generalization to unseen styles via contrastive structuring of the style space.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"That a hypernetwork can map global style embeddings to effective low-rank updates applied at every denoising step while preserving text alignment and generalizing to unseen styles without post-hoc tuning or loss of motion quality.","pith_extraction_headline":"A hypernetwork maps style embeddings from reference motions to LoRA parameters that modulate a pretrained text-to-motion diffusion model at every denoising step."},"references":{"count":46,"sample":[{"doi":"","year":2024,"title":"European Conference on Computer Vision , pages=","work_id":"f7b44aaa-5b10-4d0f-8dda-36b185413ee9","ref_index":1,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2025,"title":"Computer Graphics Forum , pages=","work_id":"78593635-0a31-472b-baaa-01a190356eef","ref_index":2,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":null,"title":"Proceedings of the Special Interest Group on Computer Graphics and Interactive Techniques Conference Conference Papers , pages=","work_id":"1cb0a99b-73fb-4b1d-8910-c9469630b10f","ref_index":3,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":null,"title":"Proceedings of the IEEE/CVF international conference on computer vision , pages=","work_id":"09b6cf05-a740-4772-9a0b-9e112a202776","ref_index":4,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":null,"title":"Lora: Low-rank adaptation of large language models. , author=. ICLR , volume=","work_id":"78d5b8e1-1042-4222-9246-42cfa74bf034","ref_index":5,"cited_arxiv_id":"","is_internal_anchor":false}],"resolved_work":46,"snapshot_sha256":"f9585331adfda1fa46cbf8d8a0a80f84b7ab75adc3304d0bab27a8d31470f014","internal_anchors":5},"formal_canon":{"evidence_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}