A Twins-SVT vision transformer backbone with a multiscale CNN decoder and a Category Focus Module auxiliary task reduces mean absolute error (MAE) by 33-64% on the VisDrone and iSAID multi-class counting benchmarks versus prior density estimators.
Counting dense object of multiple types based on feature enhancement. Frontiers in Neurorobotics, 18:1383943,
Cited by 1 Pith paper (field: cs.CV, year: 2025):
Getting the Numbers Right—Modelling Multi-Class Object Counting in Dense and Varied Scenes