Object detection in 20 years: A survey. Proceedings of the IEEE, 111(3):257–276, 2023.
2 Pith papers cite this work. Polarity classification is still indexing.
Pith papers citing it: 2
Fields: cs.CV (2)
Verdicts: UNVERDICTED (2)
Representative citing papers:
XD-MAP generates pseudo labels for LiDAR semantic segmentation from camera images using parametric maps, improving 2D and 3D segmentation performance by up to 32.3 mIoU without manual labeling.
citing papers explorer
-
Getting the Numbers Right—Modelling Multi-Class Object Counting in Dense and Varied Scenes
A Twins-SVT vision transformer backbone with multiscale CNN decoder and Category Focus Module auxiliary task reduces MAE by 33-64% on VisDrone and iSAID multi-class counting benchmarks versus prior density estimators.
-
XD-MAP: Cross-Modal Domain Adaptation via Semantic Parametric Maps for Scalable Training Data Generation
XD-MAP generates pseudo labels for LiDAR semantic segmentation from camera images using parametric maps, improving 2D and 3D segmentation performance by up to 32.3 mIoU without manual labeling.