InteractScience is a new benchmark that evaluates LLMs on generating interactive scientific demonstration code, using programmatic functional tests and visually-grounded qualitative checks.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.SE (1)
Years: 2025 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
InteractScience: Programmatic and Visually-Grounded Evaluation of Interactive Scientific Demonstration Code Generation