Title resolution pending
3 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CV (3)
years: 2026 (3)
verdicts: UNVERDICTED (3)
representative citing papers
Introduces LCNet using Multi-Modal Hyper-Graph Fusion and Deformable Rectangular Sparse Attention to achieve SOTA low-light crowd counting on three new benchmarks.
MambaCount uses S^4D blocks with spatial token selection and multi-granularity prototypes to reach MAE 12.23 on FSC-147 for open-vocabulary counting while keeping linear complexity.
citing papers explorer
- Test-Time Training for Robust Text-Guided Open-Vocabulary Object Counting
Introduces Robust-TOOC benchmark for corrupted images and Dual-TTT test-time training that updates only a text-guided denoising module to boost robustness in open-vocabulary counting.
- Multi-Modal Hyper-Graph Fusion for Low-Light Crowd Counting
Introduces LCNet using Multi-Modal Hyper-Graph Fusion and Deformable Rectangular Sparse Attention to achieve SOTA low-light crowd counting on three new benchmarks.
- MambaCount: Efficient Text-guided Open-vocabulary Object Counting with Spatial Sparse State Space Duality Block
MambaCount uses S^4D blocks with spatial token selection and multi-granularity prototypes to reach MAE 12.23 on FSC-147 for open-vocabulary counting while keeping linear complexity.