PRISM benchmark of over 10k pairs shows LLMs have a 41% average drop from code execution success to spatial correctness in programmatic video generation.
arXiv preprint arXiv:2510.01174 (2025)
7 Pith papers cite this work.
years: 2026 (7)
verdicts: UNVERDICTED (7)
roles: background (2)
polarities: background (2)
citing papers explorer
- PRISM: A Benchmark for Programmatic Spatial-Temporal Reasoning
  PRISM benchmark of over 10k pairs shows LLMs have a 41% average drop from code execution success to spatial correctness in programmatic video generation.
- Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond
  Proposes a levels x laws taxonomy for world models in AI agents, defining L1-L3 capabilities across physical, digital, social, and scientific regimes while reviewing over 400 works to outline a roadmap for advanced agentic modeling.
- EvoDiagram: Agentic Editable Diagram Creation via Design Expertise Evolution
  EvoDiagram uses a coordinated multi-agent system and design knowledge evolution to generate editable diagrams via canvas schema, with a new CanvasBench benchmark showing strong performance over baselines.
- See Before You Code: Learning Visual Priors for Spatially Aware Educational Animation Generation
  OmniManim improves render quality in educational animation code generation by using a Vision Agent with coarse-to-fine bounding-box denoising and interpolation-aware optimization on new datasets.
- Escaping the Diversity Trap in Robotic Manipulation via Anchor-Centric Adaptation
  Anchor-Centric Adaptation escapes the diversity trap by prioritizing repeated demonstrations at core anchors over broad coverage, yielding higher success rates under fixed data budgets in robotic manipulation.
- ANVIL: Analogies and Videos for Lecturers
  ANVIL automates analogy-based instructional animations for computer science by chaining LLM analogy generation, screenplay structuring, manim code production with repair, and mixed human-automated evaluations.
- LLM2Manim: Pedagogy-Aware AI Generation of STEM Animations
  LLM2Manim pipeline generates pedagogy-aware Manim animations for STEM, producing slightly better student post-test scores (83% vs 78%), learning gains (d=0.67), and engagement than PowerPoint in a controlled study.