TripoSR: Fast 3D Object Reconstruction from a Single Image

Dmitry Tochilkin, David Pankratz, Zexiang Liu, Zixuan Huang, Adam Letts, Yangguang Li · 2024 · cs.CV · arXiv 2403.02151

Canonical reference. 78% of citing Pith papers cite this work as background.

24 Pith papers citing it

abstract

This technical report introduces TripoSR, a 3D reconstruction model leveraging transformer architecture for fast feed-forward 3D generation, producing 3D mesh from a single image in under 0.5 seconds. Building upon the LRM network architecture, TripoSR integrates substantial improvements in data processing, model design, and training techniques. Evaluations on public datasets show that TripoSR exhibits superior performance, both quantitatively and qualitatively, compared to other open-source alternatives. Released under the MIT license, TripoSR is intended to empower researchers, developers, and creatives with the latest advancements in 3D generative AI.

citation-role summary

background: 8 · baseline: 1

citation-polarity summary

background: 7 · baseline: 1 · unclear: 1

representative citing papers

R-DMesh: Video-Guided 3D Animation via Rectified Dynamic Mesh Flow

cs.CV · 2026-05-13 · unverdicted · novelty 7.0 · 2 refs

R-DMesh generates high-fidelity 4D meshes aligned to video by disentangling base mesh, motion, and a learned rectification jump offset inside a VAE, then using Triflow Attention and rectified-flow diffusion.

Img2CADSeq: Image-to-CAD Generation via Sequence-Based Diffusion

cs.CV · 2026-05-13 · unverdicted · novelty 7.0

Img2CADSeq generates standard CAD sequences from images via a multi-stage pipeline with three-level hierarchical codebook encoding, importance-guided compression, and contrastive point-cloud conditioning of a VQ-Diffusion model, outperforming prior methods on new CAD-220K and PrintCAD datasets.

Towards Realistic and Consistent Orbital Video Generation via 3D Foundation Priors

cs.CV · 2026-04-14 · unverdicted · novelty 7.0

A video generation approach conditions a base model with multi-scale 3D latent features and a cross-attention adapter to produce geometrically realistic and consistent orbital videos from one image.

AniGen: Unified $S^3$ Fields for Animatable 3D Asset Generation

cs.GR · 2026-04-09 · unverdicted · novelty 7.0

AniGen directly generates animatable 3D assets with consistent shape, skeleton, and skinning from single images using unified $S^3$ fields and a two-stage flow-matching pipeline.

Benchmarking Vision-Language Models under Contradictory Virtual Content Attacks in Augmented Reality

cs.CV · 2026-04-07 · unverdicted · novelty 7.0

ContrAR benchmark reveals that current VLMs show reasonable understanding of contradictory virtual content in AR but need improvement in detection, reasoning, and balancing accuracy with latency.

CARI4D: Category Agnostic 4D Reconstruction of Human-Object Interaction

cs.CV · 2025-12-12 · unverdicted · novelty 7.0

CARI4D is the first category-agnostic pipeline that produces metric-scale, spatially and temporally consistent 4D reconstructions of human-object interactions from monocular RGB videos via foundation-model hypothesis selection, render-and-compare refinement, and contact reasoning.

SVG360: Editable Multiview Vector Graphics from a Single SVG

cs.CV · 2025-11-20 · unverdicted · novelty 7.0

SVG360 lifts a single SVG to a view-conditioned representation, uses spatial memory to propagate consistent parts across views, and applies structure-aware vectorization to produce editable multiview SVGs.

Structured 3D Latents for Scalable and Versatile 3D Generation

cs.CV · 2024-12-02 · unverdicted · novelty 7.0

SLAT provides a unified 3D latent representation enabling versatile high-quality generation across multiple output formats from text or image inputs.

GeoFlow: Enforcing Implicit Geometric Consistency in Video Generation

cs.CV · 2026-05-18 · unverdicted · novelty 6.0

GeoFlow adds a geometry-consistency reward based on rigid camera flow and object appearance preservation, integrated via reinforcement fine-tuning to improve geometric coherence in video generation.

PanoWorld: A Generative Spatial World Model for Consistent Whole-House Panorama Synthesis

cs.CV · 2026-05-18 · unverdicted · novelty 6.0

PanoWorld autoregressively generates consistent multi-room 360-degree panoramas for whole-house VR using a floorplan-derived 3D shell as geometric proxy and a dynamic 3DGS cache for spatial memory.

Sat3DGen: Comprehensive Street-Level 3D Scene Generation from Single Satellite Image

cs.CV · 2026-05-14 · unverdicted · novelty 6.0

Sat3DGen improves geometric RMSE from 6.76 m to 5.20 m and FID from ~40 to 19 for street-level 3D generation from satellite images via geometry-centric constraints and perspective training.

Toward Visually Realistic Simulation: A Benchmark for Evaluating Robot Manipulation in Simulation

cs.RO · 2026-05-07 · unverdicted · novelty 6.0

VISER is a new visually realistic simulation benchmark for robot manipulation tasks that uses PBR materials and MLLM-assisted asset generation, achieving 0.92 Pearson correlation with real-world policy performance.

Prop-Chromeleon: Adaptive Haptic Props in Mixed Reality through Generative Artificial Intelligence

cs.HC · 2026-05-01 · unverdicted · novelty 6.0

A generative-AI pipeline dynamically generates and anchors virtual assets to match the shape of physical props, enabling adaptive passive haptics in MR that users rate higher in realism, immersion, and enjoyment than static baselines.

Feed-Forward 3D Scene Modeling: A Problem-Driven Perspective

cs.CV · 2026-04-15 · unverdicted · novelty 6.0

The paper proposes a problem-driven taxonomy for feed-forward 3D scene modeling that groups methods by five core challenges: feature enhancement, geometry awareness, model efficiency, augmentation strategies, and temporal-aware modeling.

Lyra 2.0: Explorable Generative 3D Worlds

cs.CV · 2026-04-14 · unverdicted · novelty 6.0

Lyra 2.0 produces persistent 3D-consistent video sequences for large explorable worlds by using per-frame geometry for information routing and self-augmented training to correct temporal drift.

A Semi-Automated Framework for 3D Reconstruction of Medieval Manuscript Miniatures

cs.CV · 2026-04-08 · conditional · novelty 6.0

A pipeline using SAM segmentation and Hi3DGen mesh generation, evaluated on 69 medieval figures, produces usable 3D models for XR and tactile applications with Hi3DGen as the best starting point.

TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models

cs.CV · 2025-02-10 · unverdicted · novelty 6.0

TripoSG generates high-fidelity 3D meshes from input images via a large-scale rectified flow transformer and hybrid-trained 3D VAE on a custom 2-million-sample dataset, claiming state-of-the-art fidelity and generalization.

InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models

cs.CV · 2024-04-10 · unverdicted · novelty 6.0

InstantMesh produces diverse, high-quality 3D meshes from single images in seconds by combining a multi-view diffusion model with a sparse-view large reconstruction model and optimizing directly on meshes.

From Visual Synthesis to Interactive Worlds: Toward Production-Ready 3D Asset Generation

cs.GR · 2026-04-26 · unverdicted · novelty 5.0 · 2 refs

The paper surveys 3D asset generation methods and organizes them around the full production pipeline to assess which outputs meet engine-level requirements for interactive applications.

AmaraSpatial-10K: A Spatially and Semantically Aligned 3D Dataset for Spatial Computing and Embodied AI

cs.CV · 2026-04-24 · unverdicted · novelty 5.0 · 2 refs

AmaraSpatial-10K supplies 10K deployment-ready 3D assets with metric scaling and metadata, delivering 3.4x higher CLIP Recall@5 than Objaverse and 99.1% physics stability in Habitat-Sim.

UniMesh: Unifying 3D Mesh Understanding and Generation

cs.CV · 2026-04-19 · unverdicted · novelty 5.0

UniMesh unifies 3D mesh generation and understanding in one model via a Mesh Head interface, Chain of Mesh iterative editing, and an Actor-Evaluator self-reflection loop.

FUNCanon: Learning Pose-Aware Action Primitives via Functional Object Canonicalization for Generalizable Robotic Manipulation

cs.RO · 2025-09-23 · unverdicted · novelty 5.0

FunCanon introduces functional object canonicalization with VLM affordances to create pose-aware action primitives for generalizable imitation learning in robotic manipulation.

Stream3D: Sequential Multi-View 3D Generation via Evidential Memory

cs.CV · 2026-05-20

OpenWorldLib: A Unified Codebase and Definition of Advanced World Models

cs.CV · 2026-04-06

citing papers explorer

R-DMesh: Video-Guided 3D Animation via Rectified Dynamic Mesh Flow cs.CV · 2026-05-13 · unverdicted · none · ref 206 · 2 links · internal anchor
Img2CADSeq: Image-to-CAD Generation via Sequence-Based Diffusion cs.CV · 2026-05-13 · unverdicted · none · ref 20 · internal anchor
Towards Realistic and Consistent Orbital Video Generation via 3D Foundation Priors cs.CV · 2026-04-14 · unverdicted · none · ref 24 · internal anchor
AniGen: Unified $S^3$ Fields for Animatable 3D Asset Generation cs.GR · 2026-04-09 · unverdicted · none · ref 27 · internal anchor
Benchmarking Vision-Language Models under Contradictory Virtual Content Attacks in Augmented Reality cs.CV · 2026-04-07 · unverdicted · none · ref 36 · internal anchor
CARI4D: Category Agnostic 4D Reconstruction of Human-Object Interaction cs.CV · 2025-12-12 · unverdicted · none · ref 41 · internal anchor
SVG360: Editable Multiview Vector Graphics from a Single SVG cs.CV · 2025-11-20 · unverdicted · none · ref 37 · internal anchor
Structured 3D Latents for Scalable and Versatile 3D Generation cs.CV · 2024-12-02 · unverdicted · none · ref 87 · internal anchor
GeoFlow: Enforcing Implicit Geometric Consistency in Video Generation cs.CV · 2026-05-18 · unverdicted · none · ref 68 · internal anchor
PanoWorld: A Generative Spatial World Model for Consistent Whole-House Panorama Synthesis cs.CV · 2026-05-18 · unverdicted · none · ref 37 · internal anchor
Sat3DGen: Comprehensive Street-Level 3D Scene Generation from Single Satellite Image cs.CV · 2026-05-14 · unverdicted · none · ref 99 · internal anchor
Toward Visually Realistic Simulation: A Benchmark for Evaluating Robot Manipulation in Simulation cs.RO · 2026-05-07 · unverdicted · none · ref 36 · internal anchor
Prop-Chromeleon: Adaptive Haptic Props in Mixed Reality through Generative Artificial Intelligence cs.HC · 2026-05-01 · unverdicted · none · ref 80 · internal anchor
Feed-Forward 3D Scene Modeling: A Problem-Driven Perspective cs.CV · 2026-04-15 · unverdicted · none · ref 91 · internal anchor
Lyra 2.0: Explorable Generative 3D Worlds cs.CV · 2026-04-14 · unverdicted · none · ref 105 · internal anchor
A Semi-Automated Framework for 3D Reconstruction of Medieval Manuscript Miniatures cs.CV · 2026-04-08 · conditional · none · ref 26 · internal anchor
TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models cs.CV · 2025-02-10 · unverdicted · none · ref 120 · internal anchor
InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models cs.CV · 2024-04-10 · unverdicted · none · ref 45 · internal anchor
From Visual Synthesis to Interactive Worlds: Toward Production-Ready 3D Asset Generation cs.GR · 2026-04-26 · unverdicted · none · ref 22 · 2 links · internal anchor
AmaraSpatial-10K: A Spatially and Semantically Aligned 3D Dataset for Spatial Computing and Embodied AI cs.CV · 2026-04-24 · unverdicted · none · ref 12 · 2 links · internal anchor
UniMesh: Unifying 3D Mesh Understanding and Generation cs.CV · 2026-04-19 · unverdicted · none · ref 49 · internal anchor
FUNCanon: Learning Pose-Aware Action Primitives via Functional Object Canonicalization for Generalizable Robotic Manipulation cs.RO · 2025-09-23 · unverdicted · none · ref 22 · internal anchor
Stream3D: Sequential Multi-View 3D Generation via Evidential Memory cs.CV · 2026-05-20 · unreviewed · ref 64 · internal anchor
OpenWorldLib: A Unified Codebase and Definition of Advanced World Models cs.CV · 2026-04-06 · unreviewed · ref 118 · internal anchor
