Zero-1-to-3: Zero-shot One Image to 3D Object
3 Pith papers cite this work. Polarity classification is still indexing.
fields
cs.CV 3
verdicts
UNVERDICTED 3
citing papers explorer
- AGILE: Hand-Object Interaction Reconstruction from Video via Agentic Generation
  AGILE generates complete object meshes via VLM-guided synthesis and tracks poses with anchor-and-track plus contact-aware optimization to achieve robust hand-object reconstruction from video.
- HAD: Hallucination-Aware Diffusion Priors for 3D Reconstruction
  HAD uses multi-view reasoning from a pre-trained feedforward NVS network to estimate and mask hallucination scores in diffusion priors, reducing artifacts and achieving SOTA novel view synthesis in sparse-view 3D reconstruction.
- From Orbit to Ground: Generative City Photogrammetry from Extreme Off-Nadir Satellite Images
  A technique reconstructs large urban areas from sparse extreme off-nadir satellite images by modeling geometry as a Z-monotonic 2.5D height map SDF and applying a generative network to restore plausible textures on the resulting mesh.