Zero-shot conditioning of score-based diffusion models by neuro-symbolic constraints
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Citing papers explorer
- Incantation: Natural Language as the Action Interface for Multi-Entity Video World Models
Incantation is the first video world model to use per-frame natural language conditioning for simultaneous multi-entity control and concept-level cross-entity transfer in interactive video generation.
- Guiding Neuro-Symbolic Scenario Generation with Spatio-Temporal Logic
STRELGen combines a multi-agent diffusion model with differentiable STREL specifications to optimize latent space for generating plausible yet safety-critical driving scenarios.