CoRR abs/2105.15203 (2021), https://arxiv.org/abs/2105.15203

Xie, E · 2021 · arXiv 2105.15203

9 Pith papers cite this work. Polarity classification is still being indexed.

citation-role summary

background 3

citation-polarity summary

background 3

representative citing papers

TripVVT: A Large-Scale Triplet Dataset and a Coarse-Mask Baseline for In-the-Wild Video Virtual Try-On

cs.CV · 2026-04-30 · unverdicted · novelty 7.0

A new large-scale triplet dataset and diffusion transformer model using coarse human masks deliver improved video virtual try-on quality and generalization in challenging real-world conditions.

Railway Artificial Intelligence Learning Benchmark (RAIL-BENCH): A Benchmark Suite for Perception in the Railway Domain

cs.CV · 2026-04-24 · unverdicted · novelty 7.0

RAIL-BENCH is the first standardized benchmark suite for railway perception with five challenges, real-world datasets, and a novel LineAP metric for rail track detection.

SEM-ROVER: Semantic Voxel-Guided Diffusion for Large-Scale Driving Scene Generation

cs.CV · 2026-04-07 · unverdicted · novelty 7.0

SEM-ROVER generates large multiview-consistent 3D urban driving scenes via semantic-conditioned diffusion on Σ-Voxfield voxel grids with progressive outpainting and deferred rendering.

SegRAG: Training-Free Retrieval-Augmented Semantic Segmentation

cs.CV · 2026-05-17 · unverdicted · novelty 6.0 · 2 refs

SegRAG is a training-free retrieval-augmented framework that extracts class-specific point prompts from a filtered DINOv3 feature bank to boost SAM3 semantic segmentation performance on standard and agricultural benchmarks.

Toward Visually Realistic Simulation: A Benchmark for Evaluating Robot Manipulation in Simulation

cs.RO · 2026-05-07 · unverdicted · novelty 6.0

VISER is a new visually realistic simulation benchmark for robot manipulation tasks that uses PBR materials and MLLM-assisted asset generation, achieving 0.92 Pearson correlation with real-world policy performance.

From Boundaries to Semantics: Prompt-Guided Multi-Task Learning for Petrographic Thin-section Segmentation

cs.CV · 2026-04-16 · unverdicted · novelty 6.0

Petro-SAM adapts SAM via a Merge Block for polarized views plus multi-scale fusion and color-entropy priors to jointly achieve grain-edge and lithology segmentation in petrographic images.

Efficient 3D Content Reconstruction and Generation

cs.CV · 2026-05-18 · unverdicted · novelty 5.0

Presents Instant3D for rapid text/image-to-3D generation via multi-view diffusion plus feed-forward reconstruction, and FastMap for 10x faster structure-from-motion with comparable accuracy.

Efficient Semantic Image Communication for Traffic Monitoring at the Edge

cs.CV · 2026-04-14 · unverdicted · novelty 5.0

MMSD and SAMR achieve 99 percent and 99.1 percent average data reduction for traffic images by transmitting segmentation maps, edges, text or semantically masked JPEGs and reconstructing via diffusion or inpainting models.

Revitalizing Dense Material Segmentation: Stabilized Vision Transformers and the Generalization Paradox

cs.CV · 2026-05-22 · unverdicted · novelty 4.0

A stabilized SegFormer-B5 reaches a state-of-the-art 0.4572 mIoU on the original Apple DMS split; an 80/10/10 split reaches 0.5276 mIoU but degrades real-world out-of-distribution (OOD) performance, per qualitative review.

citing papers explorer

Showing 9 of 9 citing papers.

TripVVT: A Large-Scale Triplet Dataset and a Coarse-Mask Baseline for In-the-Wild Video Virtual Try-On

cs.CV · 2026-04-30 · unverdicted · none · ref 41

Railway Artificial Intelligence Learning Benchmark (RAIL-BENCH): A Benchmark Suite for Perception in the Railway Domain

cs.CV · 2026-04-24 · unverdicted · none · ref 18

SEM-ROVER: Semantic Voxel-Guided Diffusion for Large-Scale Driving Scene Generation

cs.CV · 2026-04-07 · unverdicted · none · ref 32

SegRAG: Training-Free Retrieval-Augmented Semantic Segmentation

cs.CV · 2026-05-17 · unverdicted · none · ref 18 · 2 links

Toward Visually Realistic Simulation: A Benchmark for Evaluating Robot Manipulation in Simulation

cs.RO · 2026-05-07 · unverdicted · none · ref 45

From Boundaries to Semantics: Prompt-Guided Multi-Task Learning for Petrographic Thin-section Segmentation

cs.CV · 2026-04-16 · unverdicted · none · ref 22

Efficient 3D Content Reconstruction and Generation

cs.CV · 2026-05-18 · unverdicted · none · ref 284

Efficient Semantic Image Communication for Traffic Monitoring at the Edge

cs.CV · 2026-04-14 · unverdicted · none · ref 33

Revitalizing Dense Material Segmentation: Stabilized Vision Transformers and the Generalization Paradox

cs.CV · 2026-05-22 · unverdicted · none · ref 12
