arxiv: 2511.21029 · v3 · pith:S4CH4BEMnew · submitted 2025-11-26 · 💻 cs.CV

FlowerDance: MeanFlow for Efficient and Refined 3D Dance Generation

classification 💻 cs.CV
keywords flowerdance, generation, motion, dance, efficiency, efficient, achieves, applications

Music-to-dance generation aims to translate auditory signals into expressive human motion, with broad applications in virtual reality, choreography, and digital entertainment. Despite promising progress, the limited generation efficiency of existing methods leaves insufficient computational headroom for high-fidelity 3D rendering, which constrains the expressiveness of 3D characters in real-world applications. We therefore propose FlowerDance, which not only generates refined motion with physical plausibility and artistic expressiveness, but also substantially improves generation efficiency in both inference speed and memory utilization. Specifically, FlowerDance combines MeanFlow with Physical Consistency Constraints, which enables high-quality motion generation with only a few sampling steps. Moreover, FlowerDance uses a simple but efficient model architecture with a BiMamba-based backbone and Channel-Level Cross-Modal Fusion, which generates dance in an efficient non-autoregressive manner. FlowerDance also supports motion editing, enabling users to interactively refine dance sequences. Extensive experiments on AIST++ and FineDance show that FlowerDance achieves state-of-the-art results in both motion quality and generation efficiency. Code will be released upon acceptance.
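
The abstract's claim of few-step generation rests on the MeanFlow idea of predicting an average velocity u(z_t, r, t) over an interval [r, t], so that a sample can jump from time t to time r via z_r = z_t - (t - r) * u(z_t, r, t). The sketch below illustrates only that sampling rule and is not the FlowerDance implementation: `toy_mean_velocity` and `sample_few_steps` are hypothetical names, and the closed-form velocity is a stand-in for a trained network, valid only under the assumption that all data sit at a single point `target` and the path is the linear interpolation z_t = (1 - t) * target + t * noise.

```python
import random

def toy_mean_velocity(z_t, r, t, target):
    """Hypothetical stand-in for a trained network u(z_t, r, t).

    Assumes the data distribution is a single point `target` and the path is
    z_t = (1 - t) * target + t * noise. The average velocity over [r, t] is
    then (z_t - target) / t, which does not depend on r.
    """
    return [(z - x) / t for z, x in zip(z_t, target)]

def sample_few_steps(u, noise, num_steps=1):
    """Move from t=1 (noise) to t=0 (data) in `num_steps` jumps,
    each applying z_r = z_t - (t - r) * u(z_t, r, t)."""
    times = [1.0 - k / num_steps for k in range(num_steps + 1)]
    z = list(noise)
    for t, r in zip(times[:-1], times[1:]):
        velocity = u(z, r, t)
        z = [zi - (t - r) * vi for zi, vi in zip(z, velocity)]
    return z

random.seed(0)
target = [0.5, -1.0, 2.0]                      # assumed toy "data" point
noise = [random.gauss(0.0, 1.0) for _ in target]

def u(z, r, t):
    return toy_mean_velocity(z, r, t, target)

one_step = sample_few_steps(u, noise, num_steps=1)
two_step = sample_few_steps(u, noise, num_steps=2)
```

In this toy setting the average velocity is exact, so one jump and two jumps both land on `target` up to floating-point error. With a learned network the prediction is only approximate, which is presumably why a few steps rather than a single step may be used in practice.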



Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Interactive Multi-Turn Retrieval for Health Videos

    cs.IR 2026-05 unverdicted novelty 6.0

    DATR combines coarse CLIP-based retrieval with multi-turn query fusion and cross-encoder re-ranking to improve health video retrieval, supported by the new MHVRC corpus.

  2. CustomDancer: Customized Dance Recommendation by Text-Dance Retrieval

    cs.MM 2026-05 unverdicted novelty 6.0

    CustomDancer achieves state-of-the-art text-to-dance retrieval with 10.23% Recall@1 on the new TD-Data dataset by aligning text, music, and motion features through a CLIP-based framework.

  3. MG-Former: A Transformer-Based Framework for Music-Driven 3D Conducting Gesture Generation

    cs.SD 2026-05 unverdicted novelty 5.0

    TransConductor generates 3D conducting gestures from music via a Trans-Temporal Music Encoder and Gesture Decoder, outperforming baselines on retrieval-based alignment metrics with a new ConductorMotion dataset.