Uni3R: Unified 3D Reconstruction and Semantic Understanding via Generalizable Gaussian Splatting from Unposed Multi-View Images

2025 · arXiv 2508.03643

7 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 3

citation-polarity summary

background 3

representative citing papers

3AM: 3egment Anything with Geometric Consistency in Videos

cs.CV · 2026-01-13 · unverdicted · novelty 7.0

3AM integrates MUSt3R 3D features into SAM2 via a Feature Merger and FOV-aware sampling to deliver geometry-consistent video object segmentation from RGB alone, with large gains on wide-baseline datasets.

Bridging 3D Gaussians and Semantic Occupancy for Comprehensive Open-Vocabulary Scene Understanding from Unposed Images

cs.CV · 2026-07-02 · unverdicted · novelty 6.0

COVScene is a pose-free framework that lifts semantic Gaussians into a volumetric occupancy field during training to jointly support novel view synthesis, open-vocabulary segmentation, and semantic occupancy prediction.

EPS3D: End-to-End Feed-Forward 3D Panoptic Segmentation

cs.CV · 2026-06-08 · unverdicted · novelty 6.0

EPS3D is an end-to-end architecture for 3D panoptic segmentation from multi-view images that uses distillation and semantic-instance mutual enhancement to achieve higher benchmark performance and speed than prior methods.

Feed-Forward 3D Scene Modeling: A Problem-Driven Perspective

cs.CV · 2026-04-15 · unverdicted · novelty 6.0

The paper proposes a problem-driven taxonomy for feed-forward 3D scene modeling that groups methods by five core challenges: feature enhancement, geometry awareness, model efficiency, augmentation strategies, and temporal-aware modeling.

FLEG: Feed-Forward Language Embedded Gaussian Splatting from Any Views via Compact Semantic Representation

cs.CV · 2025-12-19 · unverdicted · novelty 6.0

FLEG reconstructs language-embedded 3D Gaussians from arbitrary input views using a dual-branch distillation framework and a sparse set of semantic Gaussians that requires only 5% of prior embeddings.

Learning 3D Representations for Spatial Intelligence from Unposed Multi-View Images

cs.CV · 2026-04-12 · unverdicted · novelty 5.0

UniSplat learns consistent 3D geometry, appearance, and semantics from unposed images using dual masking, progressive Gaussian splatting, and recalibration to align predictions across tasks.

FF3R: Feedforward Feature 3D Reconstruction from Unconstrained views

cs.CV · 2026-04-10 · unverdicted · novelty 5.0

FF3R unifies geometric and semantic 3D reconstruction in a single annotation-free feed-forward network trained solely via RGB and feature rendering supervision.

citing papers explorer

3AM: 3egment Anything with Geometric Consistency in Videos · cs.CV · 2026-01-13 · unverdicted · none · ref 72
Bridging 3D Gaussians and Semantic Occupancy for Comprehensive Open-Vocabulary Scene Understanding from Unposed Images · cs.CV · 2026-07-02 · unverdicted · none · ref 25
EPS3D: End-to-End Feed-Forward 3D Panoptic Segmentation · cs.CV · 2026-06-08 · unverdicted · none · ref 18
Feed-Forward 3D Scene Modeling: A Problem-Driven Perspective · cs.CV · 2026-04-15 · unverdicted · none · ref 117
FLEG: Feed-Forward Language Embedded Gaussian Splatting from Any Views via Compact Semantic Representation · cs.CV · 2025-12-19 · unverdicted · none · ref 28
Learning 3D Representations for Spatial Intelligence from Unposed Multi-View Images · cs.CV · 2026-04-12 · unverdicted · none · ref 59
FF3R: Feedforward Feature 3D Reconstruction from Unconstrained views · cs.CV · 2026-04-10 · unverdicted · none · ref 34
