hub Mixed citations

VGGT: Visual Geometry Grounded Transformer

Mixed citation behavior. Most common role is method (57%).

23 Pith papers citing it
150 external citations · external index
Method 57% of classified citations

citation-role summary

method 4 · background 2 · baseline 1

years

2026 23

representative citing papers

PIXLRelight: Controllable Relighting via Intrinsic Conditioning

cs.CV · 2026-05-18 · unverdicted · novelty 6.0

A transformer-based neural renderer that transfers arbitrary PBR lighting to single images via shared intrinsic conditioning extracted from both multi-illumination photos and path-traced coarse 3D renders.

Unlocking Dense Metric Depth Estimation in VLMs

cs.CV · 2026-05-15 · unverdicted · novelty 6.0 · 2 refs

DepthVLM converts a standard VLM into a dense metric depth predictor by attaching a lightweight head and training under unified vision-text supervision, outperforming prior VLMs and some pure vision models on a new indoor-outdoor benchmark.

CoWorld-VLA: Thinking in a Multi-Expert World Model for Autonomous Driving

cs.CV · 2026-05-11 · unverdicted · novelty 6.0 · 2 refs

CoWorld-VLA extracts semantic, geometric, dynamic, and trajectory expert tokens from multi-source supervision and feeds them into a diffusion-based hierarchical planner, achieving competitive collision avoidance and trajectory accuracy on the NAVSIM v1 benchmark.

Robust 4D Visual Geometry Transformer with Uncertainty-Aware Priors

cs.CV · 2026-04-10 · unverdicted · novelty 6.0

The Robust 4D Visual Geometry Transformer with Uncertainty-Aware Priors outperforms prior methods on dynamic benchmarks, cutting Mean Accuracy error by 13.43% and raising segmentation F-measure by 10.49% through three uncertainty mechanisms while keeping feed-forward speed.

HD-VGGT: High-Resolution Visual Geometry Transformer

cs.CV · 2026-03-28 · unverdicted · novelty 6.0

HD-VGGT achieves state-of-the-art high-resolution 3D reconstruction from image collections via a dual-branch architecture that predicts coarse geometry at low resolution and refines details at high resolution while modulating unreliable features.

Context Unrolling in Omni Models

cs.CV · 2026-04-23 · unverdicted · novelty 5.0

Omni is a multimodal model whose native training on diverse data types enables context unrolling, allowing explicit reasoning across modalities to better approximate shared knowledge and improve downstream performance.
