Retrieval-augmented generation for knowledge-intensive NLP tasks
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Citing papers
- Incantation: Natural Language as the Action Interface for Multi-Entity Video World Models
Incantation is the first video world model to use per-frame natural language conditioning for simultaneous multi-entity control and concept-level cross-entity transfer in interactive video generation.
- EmoMind: Decoding Affective Captions from Human Brain fMRI
EmoMind is the first end-to-end pipeline that decodes continuous affective captions from fMRI by combining brain-decoded visual features with a 34D emotion vector and classifier-free guidance to balance semantic fidelity and affective expressivity.