org/abs/2411.14347

URLhttps://arxiv · 2024 · arXiv 2411.14347

5 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Vision Harnessing Agent for Open Ad-hoc Segmentation

cs.CV · 2026-05-19 · unverdicted · novelty 7.0

VASA is a vision-guided agent for open ad-hoc segmentation that creates and validates masks through planning, tool use, and error recovery, outperforming baselines on the new PARS benchmark and RefCOCOm.

SAM 3: Segment Anything with Concepts

cs.CV · 2025-11-20 · unverdicted · novelty 7.0

SAM 3 introduces promptable concept segmentation that doubles accuracy of prior systems on images and videos while improving standard SAM segmentation performance.

DeFacto: Counterfactual Thinking with Images for Enforcing Evidence-Grounded and Faithful Reasoning

cs.AI · 2025-09-25 · unverdicted · novelty 6.0 · 2 refs

DeFacto trains multimodal models with counterfactual image variants and GRPO reinforcement learning to enforce that correct answers are supported by correct visual evidence.

See&Say: Vision Language Guided Safe Zone Detection for Autonomous Package Delivery Drones

cs.CV · 2026-04-14 · unverdicted · novelty 5.0

See&Say combines depth gradients, semantic masks, and VLM-guided refinement to generate safety maps and alternative drop zones for autonomous drone deliveries, outperforming baselines in accuracy and IoU.

Image Generators are Generalist Vision Learners

cs.CV · 2026-04-22 · 2 refs

citing papers explorer

Showing 5 of 5 citing papers.

Vision Harnessing Agent for Open Ad-hoc Segmentation
cs.CV · 2026-05-19 · unverdicted · none · ref 52
VASA is a vision-guided agent for open ad-hoc segmentation that creates and validates masks through planning, tool use, and error recovery, outperforming baselines on the new PARS benchmark and RefCOCOm.

SAM 3: Segment Anything with Concepts
cs.CV · 2025-11-20 · unverdicted · none · ref 117
SAM 3 introduces promptable concept segmentation that doubles accuracy of prior systems on images and videos while improving standard SAM segmentation performance.

DeFacto: Counterfactual Thinking with Images for Enforcing Evidence-Grounded and Faithful Reasoning
cs.AI · 2025-09-25 · unverdicted · none · ref 20 · 2 links
DeFacto trains multimodal models with counterfactual image variants and GRPO reinforcement learning to enforce that correct answers are supported by correct visual evidence.

See&Say: Vision Language Guided Safe Zone Detection for Autonomous Package Delivery Drones
cs.CV · 2026-04-14 · unverdicted · none · ref 22
See&Say combines depth gradients, semantic masks, and VLM-guided refinement to generate safety maps and alternative drop zones for autonomous drone deliveries, outperforming baselines in accuracy and IoU.

Image Generators are Generalist Vision Learners
cs.CV · 2026-04-22 · unreviewed · ref 21 · 2 links
