SceneParser introduces hierarchical scene parsing as object-part-affordance chains, a VLM trained with pseudo labels and curriculum learning, and SceneParser-Bench with 1.74M affordance annotations, showing better structure-aware results than existing MLLMs.
arXiv preprint arXiv:2512.14442 (2025)
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role and citation-polarity summary
years: 2026 (2)
verdicts: UNVERDICTED (2)
roles: background (1)
polarities: background (1)
representative citing papers
Affordance Agent Harness is a verification-gated orchestration system that unifies skills via an evidence store, episodic memory priors, an adaptive router, and a self-consistency verifier to improve accuracy-cost tradeoffs in open-world affordance grounding.
citing papers explorer
- SceneParser: Hierarchical Scene Parsing for Visual Semantics Understanding
- Affordance Agent Harness: Verification-Gated Skill Orchestration