VideoPhy benchmark shows state-of-the-art text-to-video models follow physical commonsense and text prompts in only 39.6% of cases for the best model.
Title resolution pending
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
citation-role summary
background 1
citation-polarity summary
years
2024 2roles
background 1polarities
unclear 1representative citing papers
LIVE-GS uses an LLM to predict physical parameters from static Gaussian assets in 10 seconds for physics-aware VR interactions, validated by interviews, baseline comparisons, and user studies.
citing papers explorer
-
VideoPhy: Evaluating Physical Commonsense for Video Generation
VideoPhy benchmark shows state-of-the-art text-to-video models follow physical commonsense and text prompts in only 39.6% of cases for the best model.
-
LIVE-GS: LLM Powers Interactive VR Experience with Physics-Aware Gaussian Splatting
LIVE-GS uses an LLM to predict physical parameters from static Gaussian assets in 10 seconds for physics-aware VR interactions, validated by interviews, baseline comparisons, and user studies.