pith. machine review for the scientific record.

arxiv: 1512.03012 · v1 · submitted 2015-12-09 · 💻 cs.GR · cs.AI · cs.CG · cs.CV · cs.RO


ShapeNet: An Information-Rich 3D Model Repository


Pith reviewed 2026-05-11 16:02 UTC · model grok-4.3

classification 💻 cs.GR · cs.AI · cs.CG · cs.CV · cs.RO
keywords ShapeNet · 3D CAD models · WordNet taxonomy · semantic annotations · computer graphics · computer vision · benchmark dataset · shape analysis

The pith

ShapeNet indexes over three million 3D CAD models, about 220,000 of which are classified into 3,135 WordNet categories, and equips models with alignments, parts, symmetries, sizes, and keywords.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents ShapeNet as a large repository of 3D models drawn from many semantic categories and structured according to the WordNet taxonomy. It supplies each model with multiple annotations including rigid alignments, part decompositions, bilateral symmetry planes, physical sizes, and descriptive keywords, all accessible via a public web interface. These resources are intended to support data visualization, drive geometric analysis, and furnish a common quantitative benchmark for computer graphics and vision research. A reader would care because prior progress in 2D vision relied on large labeled collections, and an analogous 3D resource could enable similar scaling of shape-based algorithms.

Core claim

ShapeNet contains 3D models from a multitude of semantic categories and organizes them under the WordNet taxonomy. It is a collection of datasets providing many semantic annotations for each 3D model such as consistent rigid alignments, parts and bilateral symmetry planes, physical sizes, keywords, as well as other planned annotations. Annotations are made available through a public web-based interface to enable data visualization of object attributes, promote data-driven geometric analysis, and provide a large-scale quantitative benchmark for research in computer graphics and vision. At the time of this technical report, ShapeNet has indexed more than 3,000,000 models, 220,000 models out of which are classified into 3,135 categories (WordNet synsets).

What carries the argument

The ShapeNet repository, which indexes 3D CAD models under WordNet synsets and attaches geometric and semantic annotations to each model for standardized access.
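
A minimal sketch of what "indexing models under synsets with attached annotations" could look like as a data structure. Every field name, the synset identifiers, and the `build_index` helper are hypothetical and chosen for illustration; none of them reflects ShapeNet's actual schema or file formats.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ModelRecord:
    """One indexed 3D model. All field names are hypothetical."""
    model_id: str
    synset: str                              # WordNet synset label (illustrative)
    up: tuple = (0.0, 0.0, 1.0)              # rigid alignment: up direction
    front: tuple = (0.0, 1.0, 0.0)           # rigid alignment: front direction
    symmetry_plane: Optional[tuple] = None   # (nx, ny, nz, d), or None if unknown
    size_m: Optional[tuple] = None           # physical extents, assumed in metres
    keywords: list = field(default_factory=list)


def build_index(records):
    """Group records by synset so each category can be looked up directly."""
    index = defaultdict(list)
    for record in records:
        index[record.synset].append(record)
    return dict(index)


records = [
    ModelRecord("m1", "chair.n.01", keywords=["office", "swivel"]),
    ModelRecord("m2", "chair.n.01", symmetry_plane=(1.0, 0.0, 0.0, 0.0)),
    ModelRecord("m3", "table.n.02", size_m=(1.2, 0.8, 0.75)),
]
index = build_index(records)
```

Making the optional annotations `None` by default mirrors the fact that only a subset of indexed models carries each annotation type.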

If this is right

  • Algorithms for 3D shape retrieval, segmentation, and symmetry detection can be evaluated on a shared, large-scale test set rather than on small private collections.
  • The taxonomy structure permits category-specific and cross-category experiments that were previously difficult to organize.
  • The web interface lets researchers inspect annotations visually before using them in experiments.
  • Planned additional annotations will further expand the range of tasks that can be benchmarked with the same data.
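
The category-specific and cross-category experiments mentioned above rest on one operation: collecting every model labelled at or below a chosen node of the taxonomy. The sketch below shows that operation on a toy hierarchy. The `PARENT` links, category names, and model ids are invented for illustration and are not taken from WordNet or ShapeNet.

```python
from collections import defaultdict

# Toy hypernym links (child -> parent). Illustrative only.
PARENT = {
    "armchair": "chair",
    "swivel_chair": "chair",
    "chair": "furniture",
    "table": "furniture",
}


def children_map(parent):
    """Invert child -> parent links into parent -> [children]."""
    kids = defaultdict(list)
    for child, par in parent.items():
        kids[par].append(child)
    return kids


def subtree(root, parent=PARENT):
    """Return the root category together with all of its descendants."""
    kids = children_map(parent)
    found, stack = set(), [root]
    while stack:
        node = stack.pop()
        found.add(node)
        stack.extend(kids.get(node, []))
    return found


def models_under(root, labels, parent=PARENT):
    """List model ids whose label falls anywhere in the subtree of root."""
    categories = subtree(root, parent)
    return sorted(m for m, cat in labels.items() if cat in categories)


LABELS = {"m1": "armchair", "m2": "swivel_chair", "m3": "table", "m4": "chair"}
chairs = models_under("chair", LABELS)
furniture = models_under("furniture", LABELS)
```

Choosing a deeper or shallower root is what switches an experiment between fine-grained and cross-category settings.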

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The scale and annotation density could support supervised learning of 3D representations at sizes comparable to those used in image classification.
  • Integration with image or text datasets might become straightforward once models carry both geometric and semantic labels.
  • The repository structure could serve as a template for similar collections in related domains such as scene understanding or robotic grasping.

Load-bearing premise

The collected CAD models are representative of real objects and the supplied annotations are accurate, consistent, and of sufficient quality to function as a reliable benchmark.

What would settle it

An independent check revealing that a large fraction of models are misclassified relative to their WordNet labels or that symmetry and part annotations disagree with human judgment on more than a small percentage of items.
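
One way such an independent check could be quantified is to re-label a random sample and report the disagreement rate with an interval around it. The sketch below uses a Wilson score interval, which is a swapped-in standard technique and not something the paper describes. The function names and the sample labels are hypothetical.

```python
import math


def wilson_interval(errors, n, z=1.96):
    """Wilson score interval for an error proportion (z=1.96 is about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = errors / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def audit(labels, reviewed):
    """Compare repository labels with independent review on shared model ids."""
    ids = sorted(set(labels) & set(reviewed))
    errors = sum(labels[i] != reviewed[i] for i in ids)
    return errors, len(ids), wilson_interval(errors, len(ids))


# Hypothetical audit sample: one of four reviewed models disagrees.
repo_labels = {"a": "chair", "b": "table", "c": "lamp", "d": "sofa"}
review_labels = {"a": "chair", "b": "desk", "c": "lamp", "d": "sofa"}
errors, n, (low, high) = audit(repo_labels, review_labels)
```

With a sample this small the interval is wide, which is the point: the claim is only settled once the sample is large enough for the upper bound to fall below, or the lower bound to rise above, whatever error rate is deemed acceptable.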

Original abstract

We present ShapeNet: a richly-annotated, large-scale repository of shapes represented by 3D CAD models of objects. ShapeNet contains 3D models from a multitude of semantic categories and organizes them under the WordNet taxonomy. It is a collection of datasets providing many semantic annotations for each 3D model such as consistent rigid alignments, parts and bilateral symmetry planes, physical sizes, keywords, as well as other planned annotations. Annotations are made available through a public web-based interface to enable data visualization of object attributes, promote data-driven geometric analysis, and provide a large-scale quantitative benchmark for research in computer graphics and vision. At the time of this technical report, ShapeNet has indexed more than 3,000,000 models, 220,000 models out of which are classified into 3,135 categories (WordNet synsets). In this report we describe the ShapeNet effort as a whole, provide details for all currently available datasets, and summarize future plans.
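
The two headline counts in the abstract describe different populations, which a short calculation makes concrete. The sketch below uses only the figures quoted above. Because "more than 3,000,000" is a lower bound on the indexed total, the resulting share is an upper bound. The variable names are illustrative.

```python
indexed = 3_000_000     # "more than 3,000,000" indexed models (a lower bound)
classified = 220_000    # models classified into WordNet synsets
synsets = 3_135         # number of categories

# Share of indexed models that carry a synset label (at most this value).
classified_share = classified / indexed

# Mean number of classified models per synset. The abstract says nothing
# about how evenly models are spread across categories.
mean_per_synset = classified / synsets

print(f"classified share: at most {classified_share:.1%}")
print(f"mean models per synset: about {mean_per_synset:.0f}")
```

So the annotated, benchmark-relevant portion is a small fraction of the indexed collection, and the per-category average is an aggregate, not a guarantee for any single category.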

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The manuscript presents ShapeNet as a large-scale repository of 3D CAD models, with more than 3 million models indexed and 220,000 classified into 3,135 WordNet categories. It details the provision of semantic annotations including consistent rigid alignments, parts, bilateral symmetry planes, physical sizes, and keywords, accessible via a public web-based interface intended to promote data-driven geometric analysis and serve as a quantitative benchmark for computer graphics and vision research.

Significance. If the annotations prove accurate and consistent, ShapeNet would be a highly significant resource, providing unprecedented scale and semantic richness for 3D shape research. The use of WordNet taxonomy for organization and the variety of annotations (alignments, parts, symmetry) address key needs in the field for standardized data, potentially enabling new data-driven methods similar to those facilitated by large 2D datasets.

major comments (1)
  1. [Abstract] The positioning of ShapeNet as a 'large-scale quantitative benchmark' (Abstract) is undermined by the absence of any description of the annotation methodology, quality assurance processes, or validation metrics (such as accuracy or consistency measures) for the provided annotations like part labels and symmetry planes.
minor comments (1)
  1. The phrasing in the abstract '220,000 models out of which are classified into 3,135 categories' is slightly awkward and could be clarified for readability.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the positive evaluation of ShapeNet's potential impact and for the constructive feedback. We address the single major comment below and will revise the manuscript to strengthen the presentation of annotation details.

Point-by-point responses
  1. Referee: [Abstract] The positioning of ShapeNet as a 'large-scale quantitative benchmark' (Abstract) is undermined by the absence of any description of the annotation methodology, quality assurance processes, or validation metrics (such as accuracy or consistency measures) for the provided annotations like part labels and symmetry planes.

    Authors: We agree that the abstract's reference to a 'large-scale quantitative benchmark' would be better supported by explicit discussion of how the annotations were produced and validated. The current manuscript describes the types of annotations provided (alignments, parts, symmetry planes, etc.) and their intended uses but does not detail the underlying pipelines, crowdsourcing protocols, or any quantitative quality metrics. In the revised version we will add a new section (or subsection) that outlines the annotation methodology for each major attribute, including the tools and semi-automatic procedures employed, the quality-assurance steps taken, and any consistency or accuracy checks that were performed during data collection. We will also clarify that comprehensive per-annotation validation numbers remain an ongoing effort and will be reported as they become available.

    revision: yes

Circularity Check

0 steps flagged

No circularity: purely descriptive dataset report with no derivations or predictions

Full rationale

This technical report describes the curation, taxonomy organization, and annotation of the ShapeNet repository without any mathematical derivations, equations, predictions, fitted parameters, or first-principles results. All claims are factual statements about data collection scale, WordNet synset classification, and annotation types (alignments, parts, symmetry planes, sizes). No load-bearing step reduces by construction to self-definition, self-citation, or renaming; the paper contains no derivation chain to inspect. It is self-contained as a data-release document.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a dataset presentation paper with no mathematical derivations, so it introduces no free parameters, axioms, or invented entities.

pith-pipeline@v0.9.0 · 5524 in / 1242 out tokens · 93007 ms · 2026-05-11T16:02:46.819367+00:00 · methodology


Forward citations

Cited by 58 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Towards Realistic 3D Emission Materials: Dataset, Baseline, and Evaluation for Emission Texture Generation

    cs.CV 2026-04 unverdicted novelty 8.0

    The work creates the first dataset and baseline for generating emission textures on 3D objects to reproduce glowing materials from input images.

  2. Min Generalized Sliced Gromov Wasserstein: A Scalable Path to Gromov Wasserstein

    cs.LG 2026-05 unverdicted novelty 7.0

    min-GSGW learns coupled nonlinear slicers to produce a rigid-motion-invariant, scalable approximation to the Gromov-Wasserstein distance and its transport plans.

  3. Img2CADSeq: Image-to-CAD Generation via Sequence-Based Diffusion

    cs.CV 2026-05 unverdicted novelty 7.0

    Img2CADSeq generates standard CAD sequences from images via a multi-stage pipeline with three-level hierarchical codebook encoding, importance-guided compression, and contrastive point-cloud conditioning of a VQ-Diffu...

  4. Count Anything at Any Granularity

    cs.CV 2026-05 unverdicted novelty 7.0

    Multi-grained counting is introduced with five granularity levels, supported by the new KubriCount dataset generated via 3D synthesis and editing, and HieraCount model that combines text and visual exemplars for impro...

  5. The Wittgensteinian Representation Hypothesis: Is Language the Attractor of Multimodal Convergence?

    cs.AI 2026-05 unverdicted novelty 7.0

    Language representations serve as the asymptotic attractor for convergence in independently trained multimodal neural networks due to feature density asymmetry.

  6. MeshFIM: Local Low-Poly Mesh Editing via Fill-in-the-Middle Autoregressive Generation

    cs.GR 2026-05 unverdicted novelty 7.0

    MeshFIM enables local low-poly mesh editing by autoregressively filling target regions conditioned on context, using boundary markers, positional embeddings, and a gated geometry encoder to enforce attachment, topolog...

  7. Rollback-Free Stable Brick Structures Generation

    cs.LG 2026-05 unverdicted novelty 7.0

    Reinforcement learning internalizes physical stability rules for brick structures, enabling the first rollback-free generation with orders-of-magnitude faster inference.

  8. Two Steps Are All You Need: Efficient 3D Point Cloud Anomaly Detection with Consistency Models

    cs.CV 2026-05 unverdicted novelty 7.0

    Consistency learning reformulates 3D point cloud anomaly detection to predict clean geometry directly in one or two steps, yielding up to 80 times faster inference while matching state-of-the-art accuracy.

  9. ADS: Random Sampling of Occupancy Functions using Adaptive Delaunay Scaffolding

    cs.GR 2026-05 unverdicted novelty 7.0

    ADS adaptively refines a Delaunay scaffold to produce unbiased random samples on occupancy function surfaces together with a connecting mesh, using far fewer evaluations than existing approaches.

  10. Generative Modeling with Orbit-Space Particle Flow Matching

    cs.GR 2026-05 unverdicted novelty 7.0

    OGPP is a particle flow-matching method using orbit-space canonicalization and geometric paths that achieves lower error and fewer steps than prior approaches on 3D benchmarks.

  11. AirZoo: A Unified Large-Scale Dataset for Grounding Aerial Geometric 3D Vision

    cs.CV 2026-04 conditional novelty 7.0

    AirZoo is a new large-scale synthetic dataset for aerial 3D vision that improves state-of-the-art models on image retrieval, cross-view matching, and 3D reconstruction when used for fine-tuning.

  12. 3D Generation for Embodied AI and Robotic Simulation: A Survey

    cs.RO 2026-04 accept novelty 7.0

    3D generation for embodied AI is shifting from visual realism toward interaction readiness, organized into data generation, simulation environments, and sim-to-real bridging roles.

  13. AmaraSpatial-10K: A Spatially and Semantically Aligned 3D Dataset for Spatial Computing and Embodied AI

    cs.CV 2026-04 unverdicted novelty 7.0

    AmaraSpatial-10K is a new dataset of over 10,000 metric-scaled and semantically anchored 3D assets that achieves 3.4 times higher text retrieval precision than Objaverse for embodied AI and spatial computing.

  14. Topo-ADV: Generating Topology-Driven Imperceptible Adversarial Point Clouds

    cs.CV 2026-04 unverdicted novelty 7.0

    Topo-ADV uses differentiable persistent homology to create topology-altering perturbations that achieve up to 100% attack success on point cloud classifiers like PointNet while remaining geometrically imperceptible.

  15. Training-free Spatially Grounded Geometric Shape Encoding (Technical Report)

    cs.CV 2026-04 unverdicted novelty 7.0

    XShapeEnc encodes arbitrary 2D spatially grounded shapes into compact invertible representations by decomposing them into unit-disk geometry and harmonic pose fields then applying Zernike bases with frequency propagation.

  16. 3D-Fixer: Coarse-to-Fine In-place Completion for 3D Scenes from a Single Image

    cs.CV 2026-04 unverdicted novelty 7.0

    3D-Fixer performs in-place 3D asset completion from single-view partial point clouds via coarse-to-fine generation with ORFA conditioning, plus a new ARSG-110K dataset, to achieve higher geometric accuracy than MIDI a...

  17. Deformation-based In-Context Learning for Point Cloud Understanding

    cs.CV 2026-04 unverdicted novelty 7.0

    DeformPIC deforms query point clouds under prompt guidance for in-context learning, outperforming prior methods with lower Chamfer Distance on reconstruction, denoising, and registration tasks.

  18. Fast Graph Representation Learning with PyTorch Geometric

    cs.LG 2019-03 accept novelty 7.0

    PyTorch Geometric is a PyTorch library that delivers fast graph neural network training through sparse GPU kernels and variable-size mini-batching.

  19. Sat3DGen: Comprehensive Street-Level 3D Scene Generation from Single Satellite Image

    cs.CV 2026-05 unverdicted novelty 6.0

    Sat3DGen improves geometric RMSE from 6.76m to 5.20m and FID from ~40 to 19 for street-level 3D generation from satellite images via geometry-centric constraints and perspective training.

  20. ObjView-Bench: Rethinking Difficulty and Deployment for Object-Centric View Planning

    cs.RO 2026-05 unverdicted novelty 6.0

    ObjView-Bench disentangles omnidirectional self-occlusion, saturation difficulty, and set-cover planning difficulty, then shows that budget regimes and reachable-view constraints change planner rankings and failure mo...

  21. GenMed: A Pairwise Generative Reformulation of Medical Diagnostic Tasks

    cs.CV 2026-05 unverdicted novelty 6.0

    GenMed uses diffusion models to capture P(X,Y) for medical tasks and performs inference via gradient-based test-time optimization, supporting arbitrary observation combinations without retraining.

  22. Beyond Spatial Compression: Interface-Centric Generative States for Open-World 3D Structure

    cs.LG 2026-05 unverdicted novelty 6.0

    C2LT-3D factorizes 3D tokenization into canonical local geometry, partition-conditioned context, and relational seam variables to make latent states operational for assembly-level validation and repair in open-world m...

  23. Minimax Optimal Estimation of Transport-Growth Pairs in Unbalanced Optimal Transport

    math.ST 2026-05 unverdicted novelty 6.0

    Estimators for transport-growth pairs in unbalanced OT achieve minimax optimal rates, supported by a value-based stability reduction through a UOT gap condition.

  24. Toward Visually Realistic Simulation: A Benchmark for Evaluating Robot Manipulation in Simulation

    cs.RO 2026-05 unverdicted novelty 6.0

    VISER is a new visually realistic simulation benchmark for robot manipulation tasks that uses PBR materials and MLLM-assisted asset generation, achieving 0.92 Pearson correlation with real-world policy performance.

  25. Prop-Chromeleon: Adaptive Haptic Props in Mixed Reality through Generative Artificial Intelligence

    cs.HC 2026-05 unverdicted novelty 6.0

    A generative-AI pipeline dynamically generates and anchors virtual assets to match the shape of physical props, enabling adaptive passive haptics in MR that users rate higher in realism, immersion, and enjoyment than ...

  26. TAFA-GSGC: Group-wise Scalable Point Cloud Geometry Compression with Progressive Residual Refinement

    cs.CV 2026-04 unverdicted novelty 6.0

    TAFA-GSGC delivers scalable point cloud geometry compression supporting up to nine monotonic quality levels from a single trained model and bitstream while matching or slightly exceeding PCGCv2 rate-distortion performance.

  27. TAFA-GSGC: Group-wise Scalable Point Cloud Geometry Compression with Progressive Residual Refinement

    cs.CV 2026-04 unverdicted novelty 6.0

    TAFA-GSGC is a scalable point cloud geometry compression codec using progressive residual refinement and group-wise entropy coding that achieves average BD-rate reductions of 4.99% (D1-PSNR) and 5.92% (D2-PSNR) over P...

  28. ShapeY: A Principled Framework for Measuring Shape Recognition Capacity via Nearest-Neighbor Matching

    cs.CV 2026-04 unverdicted novelty 6.0

    ShapeY is a benchmark dataset and nearest-neighbor protocol that measures shape-based recognition in vision models, revealing that even state-of-the-art networks fail to generalize consistently across 3D viewpoints an...

  29. Point-MF: One-step Point Cloud Generation from a Single Image via Mean Flows

    cs.CV 2026-04 unverdicted novelty 6.0

    Point-MF performs one-step point cloud reconstruction from single images by learning a mean velocity field in point space with a tailored Diffusion Transformer and a new auxiliary loss.

  30. Text-Guided Multimodal Unified Industrial Anomaly Detection

    cs.CV 2026-04 unverdicted novelty 6.0

    A text-semantics-guided multimodal framework with geometry-aware mapping and object-conditioned text adaptation achieves state-of-the-art unsupervised anomaly detection and localization on RGB-3D industrial datasets w...

  31. FILTR: Extracting Topological Features from Pretrained 3D Models

    cs.CV 2026-04 unverdicted novelty 6.0

    FILTR predicts persistence diagrams from pretrained 3D encoders on the new DONUT benchmark, showing limited topological signals in encoders but successful approximation via learnable feed-forward.

  32. FurnSet: Exploiting Repeats for 3D Scene Reconstruction

    cs.CV 2026-04 unverdicted novelty 6.0

    FurnSet improves single-view 3D scene reconstruction by using per-object CLS tokens and set-aware self-attention to group and jointly reconstruct repeated object instances, with added scene-object conditioning and lay...

  33. Volume Transformer: Revisiting Vanilla Transformers for 3D Scene Understanding

    cs.CV 2026-04 unverdicted novelty 6.0

    A minimally modified vanilla Transformer called Volt achieves state-of-the-art 3D semantic and instance segmentation by using volumetric tokens, 3D rotary embeddings, and a data-efficient training recipe that scales b...

  34. One-Shot Cross-Geometry Skill Transfer through Part Decomposition

    cs.RO 2026-04 unverdicted novelty 6.0

    Part decomposition with generative shape models allows one-shot robot skill transfer across unfamiliar object geometries in simulation and real settings.

  35. Feed-Forward 3D Scene Modeling: A Problem-Driven Perspective

    cs.CV 2026-04 unverdicted novelty 6.0

    The paper proposes a problem-driven taxonomy for feed-forward 3D scene modeling that groups methods by five core challenges: feature enhancement, geometry awareness, model efficiency, augmentation strategies, and temp...

  36. ReplicateAnyScene: Zero-Shot Video-to-3D Composition via Textual-Visual-Spatial Alignment

    cs.CV 2026-04 unverdicted novelty 6.0

    ReplicateAnyScene performs fully automated zero-shot video-to-compositional-3D reconstruction by cascading alignments of generic priors from vision foundation models across textual, visual, and spatial dimensions.

  37. L-PCN: A Point Cloud Accelerator Exploiting Spatial Locality through Octree-based Islandization

    cs.AR 2026-04 unverdicted novelty 6.0

    L-PCN exploits spatial locality in point cloud networks via octree partitioning into islands and intra-island hub scheduling, delivering 55-94% less feature fetching, 45-81% less computation, and 1.2-3.2x additional s...

  38. TouchAnything: Diffusion-Guided 3D Reconstruction from Sparse Robot Touches

    cs.CV 2026-04 unverdicted novelty 6.0

    TouchAnything reconstructs accurate 3D object geometries from only a few tactile contacts by optimizing for consistency with a pretrained visual diffusion prior.

  39. Training-free Spatially Grounded Geometric Shape Encoding (Technical Report)

    cs.CV 2026-04 unverdicted novelty 6.0

    XShapeEnc decomposes 2D shapes into unit-disk geometry and harmonic pose, encodes both with orthogonal Zernike bases, and applies frequency propagation to produce invertible, adaptive, frequency-rich representations.

  40. Part-Level 3D Gaussian Vehicle Generation with Joint and Hinge Axis Estimation

    cs.AI 2026-04 unverdicted novelty 6.0

    A new framework generates part-level animatable 3D Gaussian vehicles from images by adding modules for exclusive part ownership and kinematic joint/axis prediction.

  41. FusionBERT: Multi-View Image-3D Retrieval via Cross-Attention Visual Fusion and Normal-Aware 3D Encoder

    cs.CV 2026-04 unverdicted novelty 6.0

    FusionBERT uses cross-attention to fuse multi-view images and a normal-aware encoder for 3D models, achieving higher image-3D retrieval accuracy than prior multimodal models in both single- and multi-view settings.

  42. SAM 3D: 3Dfy Anything in Images

    cs.CV 2025-11 unverdicted novelty 6.0

    SAM 3D reconstructs 3D objects from single images with geometry, texture, and pose using human-model annotated data at scale and synthetic-to-real training, achieving 5:1 human preference wins.

  43. ViewCrafter: Taming Video Diffusion Models for High-fidelity Novel View Synthesis

    cs.CV 2024-09 unverdicted novelty 6.0

    ViewCrafter tames video diffusion models with point-based 3D guidance and iterative trajectory planning to produce high-fidelity novel views from single or sparse images.

  44. DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset

    cs.RO 2024-03 accept novelty 6.0

    DROID is a new 76k-trajectory in-the-wild robot manipulation dataset spanning 564 scenes and 84 tasks that improves policy performance and generalization when used for training.

  45. EvObj: Learning Evolving Object-centric Representations for 3D Instance Segmentation without Scene Supervision

    cs.CV 2026-05 unverdicted novelty 5.0

    EvObj learns evolving object-centric representations for unsupervised 3D instance segmentation by dynamically refining object candidates and completing partial geometries to bridge the synthetic-to-real domain gap, ou...

  46. Syn4D: A Multiview Synthetic 4D Dataset

    cs.CV 2026-05 unverdicted novelty 5.0

    Syn4D is a new multiview synthetic 4D dataset supplying dense ground-truth annotations for dynamic scene reconstruction, tracking, and human pose estimation.

  47. Channel-Level Relation to Attentive Aggregation with Neighborhood-Homogeneity Constraint for Point Cloud Analysis

    cs.CV 2026-05 unverdicted novelty 5.0

    PointCRA reduces information loss in deep point cloud networks by treating temporal trend variation as an extra evaluation dimension alongside spatial and channel attention, guided by a neighborhood homogeneity constraint.

  48. From Visual Synthesis to Interactive Worlds: Toward Production-Ready 3D Asset Generation

    cs.GR 2026-04 unverdicted novelty 5.0

    The paper surveys 3D asset generation methods and organizes them around the full production pipeline to assess which outputs meet engine-level requirements for interactive applications.

  49. AmaraSpatial-10K: A Spatially and Semantically Aligned 3D Dataset for Spatial Computing and Embodied AI

    cs.CV 2026-04 unverdicted novelty 5.0

    AmaraSpatial-10K supplies 10K deployment-ready 3D assets with metric scaling and metadata, delivering 3.4x higher CLIP Recall@5 than Objaverse and 99.1% physics stability in Habitat-Sim.

  50. Unposed-to-3D: Learning Simulation-Ready Vehicles from Real-World Images

    cs.CV 2026-04 unverdicted novelty 5.0

    Unposed-to-3D learns simulation-ready 3D vehicle models from unposed real images by predicting camera parameters for photometric self-supervision, then adding scale prediction and harmonization.

  51. Neural Distribution Prior for LiDAR Out-of-Distribution Detection

    cs.CV 2026-04 unverdicted novelty 5.0

    NDP models prediction distributions and uses Perlin noise OOD synthesis to reach 61.31% point-level AP on STU LiDAR benchmark, over 10x prior best.

  52. Channel-Level Relation to Attentive Aggregation with Neighborhood-Homogeneity Constraint for Point Cloud Analysis

    cs.CV 2026-05 unverdicted novelty 4.0

    PointCRA improves point cloud feature aggregation by using channel-level metrics with temporal trend variation and neighborhood-homogeneity calibration to enhance discriminability and reduce weight collapse in deep networks.

  53. RETO: A Rotary-Enhanced Transformer Operator for High-Fidelity Prediction of Automotive Aerodynamics

    eess.IV 2026-04 unverdicted novelty 4.0

    RETO achieves relative L2 errors of 0.063 on ShapeNet and 0.089/0.097 on DrivAerML surface pressure/velocity, outperforming Transolver and other baselines.

  54. From Visual Synthesis to Interactive Worlds: Toward Production-Ready 3D Asset Generation

    cs.GR 2026-04 unverdicted novelty 4.0

    The paper surveys 3D content generation literature using a taxonomy of asset types and production stages to evaluate progress toward engine-ready assets.

  55. Reinforcing 3D Understanding in Point-VLMs via Geometric Reward Credit Assignment

    cs.CV 2026-04 unverdicted novelty 4.0

    Geometric Reward Credit Assignment disentangles rewards to geometric tokens and adds reprojection consistency to boost 3D keypoint accuracy from 0.64 to 0.93 and bounding box IoU to 0.686 on a ShapeNetCore benchmark w...

  56. 3D Generation for Embodied AI and Robotic Simulation: A Survey

    cs.RO 2026-04 unverdicted novelty 3.0

    The survey organizes 3D generation for embodied AI into data generators for assets, simulation environments for interaction, and sim-to-real bridges, noting a shift toward interaction readiness and listing bottlenecks...

  57. Attention Is not Everything: Efficient Alternatives for Vision

    cs.CV 2026-04 unverdicted novelty 3.0

    A survey that taxonomizes non-Transformer vision models and evaluates their practical trade-offs across efficiency, scalability, and robustness.

  58. 3D Generation for Embodied AI and Robotic Simulation: A Survey

    cs.RO 2026-04 unverdicted novelty 2.0

    The paper surveys 3D generation techniques for embodied AI and robotics, categorizing them into data generation, simulation environments, and sim-to-real bridging while identifying bottlenecks in physical validity and...

Reference graph

Works this paper leans on

37 extracted references · 37 canonical work pages · cited by 51 Pith papers

  1. [1]

    The protein data bank

    Helen M Berman, John Westbrook, Zukang Feng, Gary Gilliland, TN Bhat, Helge Weissig, Ilya N Shindyalov, and Philip E Bourne. The protein data bank. Nucleic Acids Res, 28:235–242, 2000.

  2. [2]

    A benchmark for 3D mesh segmentation

    Xiaobai Chen, Aleksey Golovinskiy, and Thomas Funkhouser. A benchmark for 3D mesh segmentation. ACM TOG, 28(3):73:1–73:12, July 2009.

  3. [3]

    Schelling points on 3D surface meshes

    Xiaobai Chen, Abulhair Saparov, Bill Pang, and Thomas Funkhouser. Schelling points on 3D surface meshes. ACM TOG, August 2012.

  4. [4]

    ImageNet: A large-scale hierarchical image database

    Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. ImageNet: A large-scale hierarchical image database. In CVPR, 2009.

  5. [5]

    Aim@shape

    Bianca Falcidieno. Aim@shape. http://www.aimatshape.net/ontologies/shapes/, 2005.

  6. [6]

    Example-based synthesis of 3D object arrangements

    Matthew Fisher, Daniel Ritchie, Manolis Savva, Thomas Funkhouser, and Pat Hanrahan. Example-based synthesis of 3D object arrangements. ACM TOG, 31(6):135, 2012.

  7. [7]

    Paul-Louis George. Gamma. http://www.rocq.inria.fr/gamma/download/download.php,

  8. [8]

    Fine-grained semi-supervised labeling of large shape collections

    Qixing Huang, Hao Su, and Leonidas Guibas. Fine-grained semi-supervised labeling of large shape collections. ACM TOG, 32:190:1–190:10, 2013.

  9. [9]

    Developing an engineering shape benchmark for CAD models

    Subramaniam Jayanti, Yagnanarayanan Kalyanaraman, Natraj Iyer, and Karthik Ramani. Developing an engineering shape benchmark for CAD models. Computer-Aided Design,

  10. [10]

    A probabilistic model for component-based shape synthesis

    Evangelos Kalogerakis, Siddhartha Chaudhuri, Daphne Koller, and Vladlen Koltun. A probabilistic model for component-based shape synthesis. ACM TOG, 31:55, 2012.

  11. [11]

    Mobius transformations for global intrinsic symmetry analysis

    Vladimir Kim, Yaron Lipman, Xiaobai Chen, and Thomas Funkhouser. Mobius transformations for global intrinsic symmetry analysis. Symposium on Geometry Processing, July 2010.

  12. [12]

    Learning part-based templates from large collections of 3D shapes

    Vladimir G. Kim, Wilmot Li, Niloy J. Mitra, Siddhartha Chaudhuri, Stephen DiVerdi, and Thomas Funkhouser. Learning part-based templates from large collections of 3D shapes. ACM TOG, 32(4):70:1–70:12, July 2013.

  13. [13]

    Exploring collections of 3D models using fuzzy correspondences

    Vladimir G. Kim, Wilmot Li, Niloy J. Mitra, Stephen DiVerdi, and Thomas Funkhouser. Exploring collections of 3D models using fuzzy correspondences. ACM TOG, 31(4):54:1–54:11, July 2012.

  14. [14]

    3D object representations for fine-grained categorization

    Jonathan Krause, Michael Stark, Jia Deng, and Li Fei-Fei. 3D object representations for fine-grained categorization. In 4th International IEEE Workshop on 3D Representation and Recognition (3dRR-13), Sydney, Australia, 2013.

  15. [15]

    PDBsum: A web-based database of summaries and analyses of all PDB structures

    Roman A Laskowski, E Gail Hutchinson, Alex D Michie, Andrew C Wallace, Martin L Jones, and Janet M Thornton. PDBsum: A web-based database of summaries and analyses of all PDB structures. Trends Biochem. Sci., 22:488–490,

  16. [16]

    SHREC’12 track: generic 3D shape retrieval

    Bo Li, Afzal Godil, Masaki Aono, X Bai, Takahiko Furuya, L Li, R López-Sastre, Henry Johan, Ryutarou Ohbuchi, Carolina Redondo-Cabrera, et al. SHREC’12 track: generic 3D shape retrieval. In 5th Eurographics Conference on 3D Object Retrieval, 2012. 3

  17. [17]

    SHREC’14 track: Large scale comprehensive 3D shape retrieval

    Bo Li, Yijuan Lu, Chunyuan Li, Afzal Godil, Tobias Schreck, Masaki Aono, Qiang Chen, Nihad Karim Chowdhury, Bin Fang, Takahiko Furuya, et al. SHREC’14 track: Large scale comprehensive 3D shape retrieval. In Eurographics Workshop on 3D Object Retrieval, 2014. 2

  18. [18]

    Multi-view object class detection with a 3D geometric model

    Joerg Liebelt and Cordelia Schmid. Multi-view object class detection with a 3D geometric model. In CVPR, pages 1688–

  19. [19]

    Creating consistent scene graphs using a probabilistic grammar

    Tianqiang Liu, Siddhartha Chaudhuri, Vladimir G. Kim, Qi-Xing Huang, Niloy J. Mitra, and Thomas Funkhouser. Creating consistent scene graphs using a probabilistic grammar. ACM TOG, December 2014. 2

  20. [20]

    Building a large annotated corpus of English: The Penn Treebank

    Mitchell P Marcus, Mary Ann Marcinkiewicz, and Beatrice Santorini. Building a large annotated corpus of English: The Penn Treebank. Computational Linguistics, 19(2):313–330,

  21. [21]

    George A. Miller. WordNet: a lexical database for English. CACM, 1995. 1, 2, 3, 4

  22. [22]

    Symmetry in 3D geometry: Extraction and applications

    Niloy J Mitra, Mark Pauly, Michael Wand, and Duygu Ceylan. Symmetry in 3D geometry: Extraction and applications. In Computer Graphics Forum, volume 32, pages 1–23, 2013. 7

  23. [23]

    Simplification and repair of polygonal models using volumetric techniques

    Fakir S. Nooruddin and Greg Turk. Simplification and repair of polygonal models using volumetric techniques. Visualization and Computer Graphics, IEEE Transactions on, 2003. 7

  24. [24]

    Building a database of 3D scenes from user annotations

    Bryan C Russell and Antonio Torralba. Building a database of 3D scenes from user annotations. In CVPR, 2009. 2

  25. [25]

    On being the right scale: Sizing large collections of 3D models

    Manolis Savva, Angel X. Chang, Gilbert Bernstein, Christopher D. Manning, and Pat Hanrahan. On being the right scale: Sizing large collections of 3D models. In SIGGRAPH Asia 2014 Workshop on Indoor Scene Understanding: Where Graphics meets Vision, 2014. 7

  26. [26]

    Semantically-Enriched 3D Models for Common-sense Knowledge

    Manolis Savva, Angel X. Chang, and Pat Hanrahan. Semantically-Enriched 3D Models for Common-sense Knowledge. CVPR 2015 Workshop on Functionality, Physics, Intentionality and Causality, 2015. 7

  27. [27]

    The Princeton shape benchmark

    Philip Shilane, Patrick Min, Michael Kazhdan, and Thomas Funkhouser. The Princeton shape benchmark. In Shape Modeling Applications. IEEE, 2004. 2, 3

  28. [28]

    Sliding shapes for 3D object detection in depth images

    Shuran Song and Jianxiong Xiao. Sliding shapes for 3D object detection in depth images. In ECCV, 2014. 1

  29. [29]

    A large-scale shape benchmark for 3D object retrieval: Toyohashi shape benchmark

    Atsushi Tatsuma, Hitoshi Koyanagi, and Masaki Aono. A large-scale shape benchmark for 3D object retrieval: Toyohashi shape benchmark. In Asia Pacific Signal and Information Processing Association, 2012. 3

  30. [30]

    LabelMe: Online image annotation and applications

    Antonio Torralba, Bryan C Russell, and Jenny Yuen. LabelMe: Online image annotation and applications. Proceedings of the IEEE, 98(8):1467–1484, 2010. 7

  31. [31]

    SHREC 2007 3D shape retrieval contest

    Remco C. Veltkamp and FB ter Haar. SHREC 2007 3D shape retrieval contest. Technical report, Utrecht University Technical Report UU-CS-2007-015, 2007. 3

  32. [32]

    3D model retrieval

    Dejan V Vranić. 3D model retrieval. University of Leipzig, Germany, PhD thesis, 2004. 3

  33. [33]

    A 3D shape benchmark for retrieval and automatic classification of architectural data

    Raoul Wessel, Ina Blümel, and Reinhard Klein. A 3D shape benchmark for retrieval and automatic classification of architectural data. In Eurographics 2009 Workshop on 3D Object Retrieval, pages 53–56. The Eurographics Association,

  34. [34]

    3D ShapeNets: A Deep Representation for Volumetric Shapes

    Zhirong Wu, Shuran Song, Aditya Khosla, Fisher Yu, Linguang Zhang, Xiaoou Tang, and Jianxiong Xiao. 3D ShapeNets: A Deep Representation for Volumetric Shapes. CVPR, 2015. 1, 2, 4

  35. [35]

    Beyond PASCAL: A benchmark for 3D object detection in the wild

    Yu Xiang, Roozbeh Mottaghi, and Silvio Savarese. Beyond PASCAL: A benchmark for 3D object detection in the wild. In WACV, 2014. 2, 7

  36. [36]

    SUN3D: A database of big spaces reconstructed using SfM and object labels

    Jianxiong Xiao, Andrew Owens, and Antonio Torralba. SUN3D: A database of big spaces reconstructed using SfM and object labels. In ICCV, pages 1625–1632, 2013. 2

  37. [37]

    Retrieving articulated 3-D models using medial surfaces and their graph spectra

    Juan Zhang, Kaleem Siddiqi, Diego Macrini, Ali Shokoufandeh, and Sven Dickinson. Retrieving articulated 3-D models using medial surfaces and their graph spectra. In Energy minimization methods in computer vision and pattern recognition, 2005. 3

A. Appendix

A.1. Hierarchical Rigid Alignment

In the following, we describe our hierarchical rigid align- ...