Fast segment anything

Zhao, X · 2023 · arXiv 2306.12156

20 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background: 2 · method: 1

citation-polarity summary

background: 2 · use method: 1

representative citing papers

OpenSGA: Efficient 3D Scene Graph Alignment in the Open World

cs.CV · 2026-05-11 · conditional · novelty 7.0

OpenSGA fuses vision-language, textual, and geometric features via a distance-gated attention encoder and minimum-cost-flow allocator to outperform prior methods on both frame-to-scan and subscan-to-subscan 3D scene graph alignment, backed by a new 700k-sample ScanNet-SG dataset.

LAGO: Language-Guided Adaptive Object-Region Focus for Zero-Shot Visual-Text Alignment

cs.CV · 2026-05-04 · unverdicted · novelty 7.0

LAGO achieves state-of-the-art zero-shot performance with fewer image regions by using class-agnostic object discovery followed by confidence-controlled language-guided refinement and dual-channel aggregation.

Seg2Change: Adapting Open-Vocabulary Semantic Segmentation Model for Remote Sensing Change Detection

cs.CV · 2026-04-13 · conditional · novelty 7.0

Seg2Change adapts open-vocabulary segmentation models to open-vocabulary change detection via a category-agnostic change head and new dataset CA-CDD, delivering +9.52 IoU on WHU-CD and +5.50 mIoU on SECOND.

Boxes2Pixels: Learning Defect Segmentation from Noisy SAM Masks

cs.CV · 2026-04-13 · accept · novelty 7.0

Boxes2Pixels distills noisy SAM pseudo-masks into a compact DINOv2-based student with auxiliary localization and one-sided self-correction, delivering +6.97 anomaly mIoU and +9.71 binary IoU gains over baselines on wind turbine data with 80% fewer parameters.

OmniOVCD: Streamlining Open-Vocabulary Change Detection with SAM 3

cs.CV · 2026-01-20 · conditional · novelty 7.0

OmniOVCD uses SAM 3's decoupled outputs and an SFID strategy to achieve state-of-the-art IoU scores of 67.2, 66.5, 24.5, and 27.1 on four OVCD benchmarks, surpassing prior methods.

P2DNav: Panorama-to-Downview Reasoning for Zero-shot Vision-and-Language Navigation

cs.CV · 2026-05-19 · unverdicted · novelty 6.0

P2DNav proposes a three-part hierarchical framework (panorama-to-downview reasoning, sliding-window dialogue memory, and reflective reorientation) that reports large success-rate gains on the R2R-CE zero-shot VLN benchmark.

SparseSAM: Structured Sparsification of Activations in Segment Anything Models

cs.CV · 2026-05-17 · unverdicted · novelty 6.0

SparseSAM achieves 2x faster inference and 2.8x memory reduction in SAM with only 0.004 mIoU loss at 0.4 density via Stripe-Sort Attention and Residual-Consistency MLP.

StateScribe: Towards Accessible Change Awareness Across Real-World Revisits

cs.HC · 2026-04-26 · unverdicted · novelty 6.0

StateScribe uses a dual-layer memory architecture for episodic scenes and object-centric changes to deliver live and historical descriptions, achieving 83.1% F1 accuracy across revisits in evaluations and user studies with BLV participants.

GRAIL: Autonomous Concept Grounding for Neuro-Symbolic Reinforcement Learning

cs.AI · 2026-04-18 · unverdicted · novelty 6.0

GRAIL autonomously grounds relational concepts in NeSy-RL by using LLM weak supervision followed by interaction-based refinement, matching or exceeding manually defined concepts on Atari games.

H-SPAM: Hierarchical Superpixel Anything Model

cs.CV · 2026-04-13 · conditional · novelty 6.0

H-SPAM produces accurate, regular, and perfectly nested hierarchical superpixels that outperform prior hierarchical methods and match recent non-hierarchical state-of-the-art.

Simulation-Driven Evolutionary Motion Parameterization for Contact-Rich Granular Scooping with a Soft Conical Robotic Hand

cs.RO · 2026-04-07 · unverdicted · novelty 6.0

A deformable soft conical hand is modeled in physics simulation and its scooping trajectories are optimized via evolutionary search, enabling effective contact-rich granular tasks validated in both simulation and physical robot experiments.

AIM-CoT: Active Information-driven Multimodal Chain-of-Thought for Vision-Language Reasoning

cs.CV · 2025-09-30 · unverdicted · novelty 6.0

AIM-CoT enhances interleaved multimodal chain-of-thought reasoning by adding context-enhanced attention generation, active visual probing via information foraging, and dynamic attention-shift triggering.

Terra: Hierarchical Terrain-Aware 3D Scene Graph for Task-Agnostic Outdoor Mapping

cs.RO · 2025-09-23 · unverdicted · novelty 6.0

Terra produces a lightweight task-agnostic metric-semantic 3D scene graph for outdoor environments using terrain-aware place nodes and hierarchically organized regions.

TinySAM 2: Extreme Memory Compression for Efficient Track Anything Model

cs.CV · 2026-05-18 · conditional · novelty 5.0

TinySAM 2 reaches 90% of SAM 2.1 performance on DAVIS and SA-V using 7% of the memory tokens and 3% of the training data via frame selection, spatial average pooling, temporal similarity-based token pruning, and a RepViT image encoder.

FUS3DMaps: Scalable and Accurate Open-Vocabulary Semantic Mapping by 3D Fusion of Voxel- and Instance-Level Layers

cs.RO · 2026-05-05 · unverdicted · novelty 5.0

FUS3DMaps fuses voxel- and instance-level open-vocabulary layers inside a shared 3D voxel map to improve both layers and enable scalable accurate semantic mapping.

A Real-time Scale-robust Network for Glottis Segmentation in Nasal Transnasal Intubation

eess.IV · 2026-04-30 · unverdicted · novelty 5.0

A scale-robust lightweight CNN for glottis segmentation achieves 92.9% mDice at over 170 FPS with a 19 MB model size on three datasets.

Weight Group-wise Post-Training Quantization for Medical Foundation Model

cs.CV · 2026-04-09 · unverdicted · novelty 5.0

Permutation-COMQ is a new post-training quantization algorithm that reorders weights within layers and uses only dot-product and rounding steps to deliver the highest reported accuracy for 2-, 4-, and 8-bit medical foundation models.

On Efficient Variants of Segment Anything Model: A Survey

cs.CV · 2024-10-07 · unverdicted · novelty 5.0

A survey that reviews efficient variants of the Segment Anything Model, categorizes acceleration strategies, and provides a unified hardware evaluation on benchmarks.

Faster Segment Anything: Towards Lightweight SAM for Mobile Applications

cs.CV · 2023-06-25 · conditional · novelty 5.0

MobileSAM is a 60x smaller distilled version of SAM that matches original performance and runs 5x faster than concurrent FastSAM while supporting CPU inference.

Semantic-Fast-SAM: Efficient Semantic Segmenter

cs.CV · 2026-04-22 · unverdicted · novelty 3.0

Semantic-Fast-SAM matches prior SAM-based semantic segmentation accuracy on Cityscapes and ADE20K while running about 20 times faster by combining FastSAM with SSA labeling and CLIP for open-vocabulary cases.

citing papers explorer

Showing 1 of 1 citing paper after filters.

LAGO: Language-Guided Adaptive Object-Region Focus for Zero-Shot Visual-Text Alignment

cs.CV · 2026-05-04 · unverdicted · none · ref 40

LAGO achieves state-of-the-art zero-shot performance with fewer image regions by using class-agnostic object discovery followed by confidence-controlled language-guided refinement and dual-channel aggregation.
