hub Canonical reference

GeoVLA: Empowering 3D representations in vision-language-action models

Canonical reference. 89% of citing Pith papers cite this work as background.

15 Pith papers citing it
Background 89% of classified citations

citation-role summary

background: 8 · baseline: 1

fields

cs.RO: 12 · cs.CV: 3

years

2026: 13 · 2025: 2

representative citing papers

A Pragmatic VLA Foundation Model

cs.RO · 2026-01-26 · unverdicted · novelty 6.0

LingBot-VLA is a VLA foundation model trained on massive real robot data that shows superior generalization across tasks and platforms with fast training throughput.

R3D: Revisiting 3D Policy Learning

cs.CV · 2026-04-16 · unverdicted · novelty 5.0

A transformer 3D encoder plus diffusion decoder architecture, with 3D-specific augmentations, outperforms prior 3D policy methods on manipulation benchmarks by improving training stability.

CoEnv: Driving Embodied Multi-Agent Collaboration via Compositional Environment

cs.RO · 2026-04-07 · unverdicted · novelty 5.0

CoEnv introduces a compositional environment that integrates real and simulated spaces for multi-agent robotic collaboration, using real-to-sim reconstruction, VLM action synthesis, and validated sim-to-real transfer to achieve high success rates on multi-arm manipulation tasks.

citing papers explorer

Showing 15 of 15 citing papers.