ST-BiBench reveals a coordination paradox in which MLLMs show strong high-level strategic reasoning yet fail at fine-grained 16-dimensional bimanual action synthesis and multi-stream fusion.
Photodoodle: Learning artistic image editing from few-shot pairwise data
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
years
2026 2verdicts
UNVERDICTED 2representative citing papers
OmniHumanoid factorizes transferable motion learning from embodiment-specific adaptation to enable scalable cross-embodiment video generation without paired data for new humanoids.
citing papers explorer
-
ST-BiBench: Benchmarking Multi-Stream Multimodal Coordination in Bimanual Embodied Tasks for MLLMs
ST-BiBench reveals a coordination paradox in which MLLMs show strong high-level strategic reasoning yet fail at fine-grained 16-dimensional bimanual action synthesis and multi-stream fusion.
-
OmniHumanoid: Streaming Cross-Embodiment Video Generation with Paired-Free Adaptation
OmniHumanoid factorizes transferable motion learning from embodiment-specific adaptation to enable scalable cross-embodiment video generation without paired data for new humanoids.