Orient anything: Learning robust object orientation estimation from rendering 3d mod- els

Zehan Wang, Ziang Zhang, Tianyu Pang, Chao Du, Hengshuang Zhao, Zhou Zhao · 2024 · arXiv 2412.18605

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

representative citing papers

ChopGrad: Pixel-Wise Losses for Latent Video Diffusion via Truncated Backpropagation

cs.CV · 2026-03-18 · unverdicted · novelty 7.0

ChopGrad truncates backpropagation to local frame windows in video diffusion models, reducing memory from linear in frame count to constant while enabling pixel-wise loss fine-tuning.

GenHSI: Controllable Generation of Human-Scene Interaction Videos

cs.CV · 2025-06-24 · unverdicted · novelty 7.0

GenHSI is a training-free three-stage pipeline that turns a scene image, character image, and complex HSI prompt into long videos with plausible chained interactions by generating atomic actions, 3D keyframes via 2D inpainting plus optimization, and then feeding them to pre-trained video diffusion.

Unposed-to-3D: Learning Simulation-Ready Vehicles from Real-World Images

cs.CV · 2026-04-21 · unverdicted · novelty 5.0

Unposed-to-3D learns simulation-ready 3D vehicle models from unposed real images by predicting camera parameters for photometric self-supervision, then adding scale prediction and harmonization.

citing papers explorer

Showing 3 of 3 citing papers.

ChopGrad: Pixel-Wise Losses for Latent Video Diffusion via Truncated Backpropagation cs.CV · 2026-03-18 · unverdicted · none · ref 59
ChopGrad truncates backpropagation to local frame windows in video diffusion models, reducing memory from linear in frame count to constant while enabling pixel-wise loss fine-tuning.
GenHSI: Controllable Generation of Human-Scene Interaction Videos cs.CV · 2025-06-24 · unverdicted · none · ref 87
GenHSI is a training-free three-stage pipeline that turns a scene image, character image, and complex HSI prompt into long videos with plausible chained interactions by generating atomic actions, 3D keyframes via 2D inpainting plus optimization, and then feeding them to pre-trained video diffusion.
Unposed-to-3D: Learning Simulation-Ready Vehicles from Real-World Images cs.CV · 2026-04-21 · unverdicted · none · ref 53
Unposed-to-3D learns simulation-ready 3D vehicle models from unposed real images by predicting camera parameters for photometric self-supervision, then adding scale prediction and harmonization.

Orient anything: Learning robust object orientation estimation from rendering 3d mod- els

fields

years

verdicts

representative citing papers

citing papers explorer