Flame3D enables zero-shot compositional 3D scene reasoning by representing scenes as editable visual-textual memories exposed to agentic MLLMs through composable and synthesizable spatial tools.
Ar surgical navigation with surface tracing: Comparing in-situ visualization with tool-tracking guidance for neurosurgical applications
3 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years
2026 3roles
background 1polarities
background 1representative citing papers
A dual-tower 4D embodied world model called RoboStereo reduces geometric hallucinations and delivers over 97% relative improvement on manipulation tasks via test-time augmentation, imitative learning, and open exploration.
User study with 30 novices establishes performance baselines for freehand 5D AR trajectory following and shows orientation constraints create cognitive-motor trade-offs that some visual UIs can mitigate.
citing papers explorer
-
Flame3D: Zero-shot Compositional Reasoning of 3D Scenes with Agentic Language Models
Flame3D enables zero-shot compositional 3D scene reasoning by representing scenes as editable visual-textual memories exposed to agentic MLLMs through composable and synthesizable spatial tools.
-
RoboStereo: Dual-Tower 4D Embodied World Models for Unified Policy Optimization
A dual-tower 4D embodied world model called RoboStereo reduces geometric hallucinations and delivers over 97% relative improvement on manipulation tasks via test-time augmentation, imitative learning, and open exploration.
-
Hot Wire 5D+: Evaluating Cognitive and Motor Trade-offs of Visual Feedback for 5D Augmented Reality Trajectories
User study with 30 novices establishes performance baselines for freehand 5D AR trajectory following and shows orientation constraints create cognitive-motor trade-offs that some visual UIs can mitigate.