SignSparK: Efficient Multilingual Sign Language Production via Sparse Keyframe Learning
1 Pith paper cites this work. Polarity classification is still indexing.
abstract
Sign Language Production (SLP) faces a fundamental trade-off: direct text-to-pose models suffer from regression-to-the-mean effects, while dictionary-retrieval methods produce disjointed transitions. To resolve this, we propose a novel training paradigm that leverages sparse keyframes to capture the underlying kinematic distribution of human signing. By generating dense motion from discrete anchors, our approach mitigates regression-to-the-mean while ensuring fluid articulation. To achieve this at scale, we introduce FAST, an ultra-efficient sign segmentation model that automatically mines precise temporal boundaries. We then present SignSparK, a Conditional Flow Matching (CFM) framework that utilizes these temporal anchors to synthesize 3D signing sequences. This keyframe-driven formulation also unlocks Keyframe-to-Pose (KF2P) generation, making precise spatiotemporal editing of signing sequences possible. Furthermore, SignSparK scales across four distinct sign languages, constituting the largest multilingual SLP framework to date, and integrates 3D Gaussian Splatting for photorealistic rendering. Extensive evaluations demonstrate that SignSparK achieves state-of-the-art results across diverse SLP tasks and multilingual benchmarks. Our code is available at https://github.com/JianHe0628/SignSparK.
fields: cs.CV (1)
years: 2026 (1)
verdicts: UNVERDICTED (1)
representative citing papers
- BackTranslation2.0 -- A Linguistically Motivated Metric to Assess Sign Language Production
BackTranslation2.0 is a linguistically motivated evaluation metric for sign language production that uses an agentic tool pipeline and LLM cross-referencing to score four dimensions and shows strong human correlation on a BSL dataset.