pith. machine review for the scientific record.

arxiv: 2604.05271 · v1 · submitted 2026-04-07 · 💻 cs.CV

Recognition: no theorem link

Toward Unified Fine-Grained Vehicle Classification and Automatic License Plate Recognition

Pith reviewed 2026-05-10 18:50 UTC · model grok-4.3

classification 💻 cs.CV
keywords fine-grained vehicle classification · automatic license plate recognition · surveillance dataset · deep learning benchmark · real-world conditions · vehicle attributes · ALPR · FGVC

The pith

A new real-world dataset validates fine-grained vehicle attributes using license plate data and benchmarks their joint use with automatic recognition.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces UFPR-VeSV, a collection of 24,945 surveillance images showing 16,297 distinct vehicles under varied conditions, including nighttime infrared imaging and partial occlusions. It supplies annotations for 13 colors, 26 makes, 136 models, and 14 types, all validated using license plate information, and also provides license plate text and corner annotations. Five deep learning models are benchmarked to expose practical difficulties such as multicolored vehicles, infrared images, and models that share a common platform and therefore look alike. The work also applies two optical character recognition models to the plates and explores the joint use of fine-grained classification and plate recognition. This matters for traffic systems and investigations that need reliable vehicle identification when a single cue fails.

Core claim

We introduce UFPR-VeSV, a dataset comprising 24,945 images of 16,297 unique vehicles with annotations for 13 colors, 26 makes, 136 models, and 14 types. Collected from the Military Police of Paraná surveillance system, the dataset captures diverse real-world conditions, including partial occlusions, nighttime infrared imaging, and varying lighting. All FGVC annotations were validated using license plate information, with text and corner annotations also being provided. A benchmark using five deep learning models further validated this, revealing specific challenges such as handling multicolored vehicles, infrared images, and distinguishing between vehicle models that share a common platform.

What carries the argument

The UFPR-VeSV dataset, whose fine-grained vehicle labels were validated using license plate information and which also provides plate text and corner annotations for the same surveillance images.

If this is right

  • Deep learning models must be trained to handle infrared frames and vehicles with multiple colors on the same body.
  • Models still struggle to separate vehicle variants built on identical platforms even when given large annotated sets.
  • Outputs from fine-grained classification can be combined with plate text to resolve cases where either cue alone is insufficient.
  • The dataset supports development of systems that operate under partial occlusion and changing illumination without controlled lighting.
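
The third point above can be illustrated with a minimal sketch of a fallback rule. It assumes a plate reader that returns a string with a confidence score and a classifier that returns one label per attribute. The function name, the dictionary layout, and the 0.9 threshold are hypothetical and are not taken from the paper.

```python
from typing import Dict, Optional


def describe_vehicle(
    plate_text: Optional[str],
    plate_confidence: float,
    attributes: Dict[str, str],
    min_plate_confidence: float = 0.9,  # hypothetical threshold, not from the paper
) -> Dict[str, object]:
    """Combine a plate reading with FGVC attribute predictions.

    If the plate reading is confident, report it together with the attributes.
    Otherwise fall back to the attributes alone. "unknown" labels are dropped
    in both cases because they carry no identifying information.
    """
    known = {name: label for name, label in attributes.items() if label != "unknown"}
    plate_ok = plate_text is not None and plate_confidence >= min_plate_confidence
    return {
        "plate": plate_text if plate_ok else None,
        "attributes": known,
        "source": "plate+attributes" if plate_ok else "attributes-only",
    }


# Illustrative values only: an illegible plate on an infrared frame,
# where color cannot be determined.
result = describe_vehicle(
    plate_text=None,
    plate_confidence=0.0,
    attributes={"color": "unknown", "make": "Volkswagen", "model": "Parati"},
)
print(result["source"], result["attributes"])
```

A fixed threshold is the simplest possible design choice here; a real system would likely need calibrated confidences before such a cut-off is meaningful.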

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Cross-checking visual attributes against plate data could be applied to improve training sets for other vehicle recognition tasks where ground truth is otherwise expensive to obtain.
  • An end-to-end network that predicts both the attribute vector and the plate string in one forward pass might reduce error propagation between the two tasks.
  • The identified failure modes suggest that specialized data augmentation for nighttime and occluded views would be a direct next step for practitioners.
  • Law-enforcement pipelines could use the joint outputs to flag inconsistencies between reported vehicle details and observed plates.
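
As a rough illustration of the last extension, the sketch below compares the attributes registered for a plate with the attributes predicted from the image. The record layout and the function name are assumptions made for this example; the paper does not describe such a component.

```python
from typing import Dict, List


def find_mismatches(registered: Dict[str, str], predicted: Dict[str, str]) -> List[str]:
    """Return the attribute names whose predicted label contradicts the registry.

    Attributes predicted as "unknown" (for example, color in infrared frames)
    or missing from the prediction are skipped rather than counted as conflicts.
    """
    mismatches = []
    for name, expected in registered.items():
        observed = predicted.get(name)
        if observed is None or observed == "unknown":
            continue
        if observed.lower() != expected.lower():
            mismatches.append(name)
    return mismatches


# Illustrative records, not taken from the dataset.
registry_entry = {"color": "white", "make": "Fiat", "model": "Ducato"}
prediction = {"color": "unknown", "make": "Peugeot", "model": "Boxer"}
flags = find_mismatches(registry_entry, prediction)
print(flags)  # ['make', 'model']
```

A flag of this kind is only a prompt for human review: the benchmark reports that visually similar models such as the Boxer and the Ducato are confused by the classifiers themselves, so a mismatch may reflect a classification error rather than a plate swap.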

Load-bearing premise

License plate readings always supply correct and unambiguous ground truth for a vehicle's color, make, model, and type.

What would settle it

A collection of images in which the visible vehicle body and the information readable from its license plate systematically disagree, such as through plate swaps or misreads, would show whether the validation step holds.

Figures

Figures reproduced from arXiv: 2604.05271 by David Menotti, Eduardo Santos, Eduil Nascimento Jr, Gabriel E. Lima, Rayson Laroca, Valfride Nascimento.

Figure 2: Examples of images featuring multiple vehicles due to camera perspective. The background vehicle is highlighted with a green border, while the main vehicle is shadowed to enhance contrast. The dataset spans a wide temporal range, including both daytime and nighttime conditions. While timestamps are not available, images are categorized by the camera’s capture mode. Nighttime images, primarily captured in i…
Figure 1: Distribution of vehicles across the attributes of color (a), make (b), model (c) and type (d) in the UFPR-VeSV dataset. For better visualization, only the 30 most common vehicle models are displayed in (c), representing 63.7% of the total images. … categorized as either front or rear view based on the visibility of the LP. As a result, the dataset contains 13,842 rear-view and 11,103 frontal-view images. The…
Figure 4: Example of LPs cropped from images captured under diverse conditions, showcasing variations in resolution, perspective, and image quality. The corresponding annotated LP text is shown below each image. Regarding ALPR, the UFPR-VeSV dataset includes annotations for both LP characters and corner coordinates
Figure 6: Example of border standardization. (a) original image with a green border; (b) result after removing a 5-pixel margin from all sides. Finally, it is important to highlight that the images were not resized to uniform dimensions. As a result, the dataset retains a variety of image sizes, with widths ranging from 89 to 2,110 pixels and heights from 135 to 1,408 pixels. This approach prevents distortions that …
Figure 7: Examples of vehicles assigned to the “unknown” class for color, make, and model annotations: (a) a red Yamaha motorcycle; (b) a white Volkswagen truck; (c) a red Mercedes-Benz truck. In all cases, the main body of the vehicle is obstructed, rendering the identification of color, make, and model impossible. Classes with fewer than 25 samples were adjusted to reduce extreme class imbalance. Underrepresented …
Figure 8: Representative images from three public datasets and UFPR-VeSV. Our dataset features significantly more challenging scenarios, with vehicles captured from diverse viewpoints, environments, lighting conditions, image quality levels, and nighttime infrared imaging. … vehicles captured from multiple viewpoints and under varied real-world conditions. A common limitation among existing datasets is the lack of sce…
Figure 10: Examples of misclassified vehicle models from different manufacturers. Original make and model are displayed below each image. In (a), a rear-view Boxer was misclassified as a Ducato, while in (b) a rear-view Ducato was misclassified as a Boxer. Aside from minor brand-specific markings, the vehicles have a very similar structure, making accurate differentiation challenging. In type recognition, misclass…
Figure 9: Examples of misclassified multicolored vehicle from different viewpoints. The predicted color is shown below each image. Depending on the camera angle and illumination, a single color can appear dominant, which causes the classifier to classify the vehicle based on that one color. For make recognition, the “others” class had the lowest performance (≈ 30%). This suggests that a superclass for less common ma…
Figure 11: Grad-CAM [Selvaraju et al., 2017] attention maps for a Volkswagen Parati. (a) Original image. (b) The make recognition map, focusing on the manufacturer’s badge. (c) The model recognition map, relying on other features, such as the headlights. This analysis reveals a key limitation for practical FGVC: isolated models are insufficient, as they can produce logically inconsistent results. Methods must, the…
Figure 12: Example of misrecognized LPs. For each image, the ground-truth label is shown above the model’s prediction, with incorrectly recognized characters highlighted in red. The failure cases include severely degraded characters (a, h, i), illumination-induced obstructions (b), low-contrast characters (c, j), blurring (d, g), and physically deformed LPs (e, f). 6.2 Joint System Analysis The previous section sho…
Figure 13: Examples of ALPR failures where the LP is illegible due to (a) light glare and (b) occlusion. Despite these failures, our FGVC classifier acts as a fail-safe, correctly identifying the type, make, and model for both vehicles. Color recognition is correct for (b), while the infrared image (a) is classified as “unknown” due to the absence of color information. Finally, we acknowledge this initial analysis h…
Original abstract

Extracting vehicle information from surveillance images is essential for intelligent transportation systems, enabling applications such as traffic monitoring and criminal investigations. While Automatic License Plate Recognition (ALPR) is widely used, Fine-Grained Vehicle Classification (FGVC) offers a complementary approach by identifying vehicles based on attributes such as color, make, model, and type. Although there have been advances in this field, existing studies often assume well-controlled conditions, explore limited attributes, and overlook FGVC integration with ALPR. To address these gaps, we introduce UFPR-VeSV, a dataset comprising 24,945 images of 16,297 unique vehicles with annotations for 13 colors, 26 makes, 136 models, and 14 types. Collected from the Military Police of Paraná (Brazil) surveillance system, the dataset captures diverse real-world conditions, including partial occlusions, nighttime infrared imaging, and varying lighting. All FGVC annotations were validated using license plate information, with text and corner annotations also being provided. A qualitative and quantitative comparison with established datasets confirmed the challenging nature of our dataset. A benchmark using five deep learning models further validated this, revealing specific challenges such as handling multicolored vehicles, infrared images, and distinguishing between vehicle models that share a common platform. Additionally, we apply two optical character recognition models to license plate recognition and explore the joint use of FGVC and ALPR. The results highlight the potential of integrating these complementary tasks for real-world applications. The UFPR-VeSV dataset is publicly available at: https://github.com/Lima001/UFPR-VeSV-Dataset.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The manuscript introduces the UFPR-VeSV dataset comprising 24,945 images of 16,297 unique vehicles captured from Brazilian police surveillance cameras under diverse real-world conditions including occlusions, nighttime infrared, and varying lighting. It provides FGVC annotations for 13 colors, 26 makes, 136 models, and 14 types, plus license plate text and corner annotations, with the claim that all FGVC labels were validated using license plate information from police records. The work benchmarks five deep learning models on FGVC, identifies concrete challenges (multicolored vehicles, infrared images, platform-sharing models), compares the dataset qualitatively/quantitatively to prior collections, and explores joint FGVC-ALPR using two OCR models, with the dataset released publicly.

Significance. If the annotations are verifiably accurate, the dataset offers a useful public resource for real-world vehicle recognition research by combining scale, attribute richness, and challenging conditions not fully covered in existing collections. The explicit identification of model failure modes and the preliminary joint FGVC-ALPR experiments provide actionable insights for intelligent transportation applications.

major comments (1)
  1. Dataset description and annotation validation (abstract and corresponding methods section): the central claim that 'All FGVC annotations were validated using license plate information' is load-bearing for the reliability of the reported benchmarks and the identified challenges (e.g., distinguishing platform-sharing models). However, no quantitative validation error rate, description of the matching procedure (e.g., registry lookup under OCR failures or plate swaps), or independent visual re-annotation cross-check is supplied. This omission leaves the ground-truth quality unquantified and risks systematic label noise affecting the benchmark conclusions.
minor comments (2)
  1. Abstract: limited detail on training protocols, exact performance numbers, and statistical significance of the joint FGVC-ALPR experiments reduces the ability to assess results at a glance; adding key metrics would strengthen the summary.
  2. Comparison with established datasets: ensure the qualitative/quantitative comparison section explicitly cites prior FGVC and ALPR datasets and tabulates key differences (e.g., attribute coverage, condition diversity) for clarity.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive feedback, which helps strengthen the presentation of our dataset and its validation. We address the major comment below and will revise the manuscript to provide the requested details.

Point-by-point responses
  1. Referee: Dataset description and annotation validation (abstract and corresponding methods section): the central claim that 'All FGVC annotations were validated using license plate information' is load-bearing for the reliability of the reported benchmarks and the identified challenges (e.g., distinguishing platform-sharing models). However, no quantitative validation error rate, description of the matching procedure (e.g., registry lookup under OCR failures or plate swaps), or independent visual re-annotation cross-check is supplied. This omission leaves the ground-truth quality unquantified and risks systematic label noise affecting the benchmark conclusions.

    Authors: We agree that the current manuscript provides insufficient detail on the annotation validation process. In the revised version, we will expand the methods section with a new subsection describing the validation procedure in full. This will include: (i) the exact matching workflow between image-derived license plates and the police registry records, (ii) explicit handling of OCR failures and potential plate swaps or mismatches, and (iii) the results of an independent visual re-annotation cross-check performed on a random subset of samples. We will also report the quantitative validation error rate (number of discrepancies found and how they were resolved). These additions will allow readers to assess ground-truth reliability directly and mitigate concerns about label noise. revision: yes
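
The quantitative validation error rate promised in this response could be computed along the following lines. This is a minimal sketch that assumes two aligned annotation passes over the same random subset; the function name and the sample records are made up for illustration and do not come from the paper.

```python
from typing import Dict, List


def discrepancy_rates(
    original: List[Dict[str, str]],
    reannotated: List[Dict[str, str]],
) -> Dict[str, float]:
    """Per-attribute fraction of samples where the two annotation passes disagree."""
    if len(original) != len(reannotated):
        raise ValueError("both passes must cover the same samples in the same order")
    if not original:
        return {}
    rates = {}
    for name in original[0]:
        disagreements = sum(
            1 for first, second in zip(original, reannotated) if first[name] != second[name]
        )
        rates[name] = disagreements / len(original)
    return rates


# Four made-up samples with a single disagreement on "model".
first_pass = [
    {"make": "Fiat", "model": "Ducato"},
    {"make": "Peugeot", "model": "Boxer"},
    {"make": "Volkswagen", "model": "Parati"},
    {"make": "Fiat", "model": "Ducato"},
]
second_pass = [
    {"make": "Fiat", "model": "Ducato"},
    {"make": "Peugeot", "model": "Boxer"},
    {"make": "Volkswagen", "model": "Parati"},
    {"make": "Fiat", "model": "Boxer"},
]
rates = discrepancy_rates(first_pass, second_pass)
print(rates)  # {'make': 0.0, 'model': 0.25}
```

Reporting the rate per attribute rather than as one pooled number would show whether label noise concentrates in the attributes the benchmark finds hardest, such as model.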

Circularity Check

0 steps flagged

No circularity: dataset release and standard benchmarks are self-contained

full rationale

The paper introduces the UFPR-VeSV dataset with FGVC annotations (validated via external license plate records from police sources) and runs benchmarks on five off-the-shelf deep learning models. No equations, parameter fitting, or predictions appear in the provided text. No self-citations are invoked as load-bearing uniqueness theorems or ansatzes. The annotation validation step is a procedural claim about data collection, not a derivation that reduces to its own inputs by construction. This matches the default expectation of no significant circularity for a dataset-plus-benchmark paper.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No free parameters, axioms, or invented entities are introduced; the contribution rests on standard image collection, annotation via existing license plate data, and off-the-shelf deep learning models.

pith-pipeline@v0.9.0 · 5607 in / 1129 out tokens · 33299 ms · 2026-05-10T18:50:15.539743+00:00 · methodology


Reference graph

Works this paper leans on

79 extracted references · 71 canonical work pages · 1 internal anchor

  2. [2]

    Amirkhani, A. and Barshooi, A. H. (2023). Deepcar 5.0: Vehicle make and model recognition under challenging conditions. IEEE Transactions on Intelligent Transportation Systems , 24(1):541--553. doi:10.1109/TITS.2022.3212921 https://doi.org/10.1109/TITS.2022.3212921

  3. [3]

    Baek, N., Park, S.-M., Kim, K.-J., and Park, S.-B. (2007). Vehicle color classification based on the support vector machine method. In International Conference on Intelligent Computing, pages 1133--1139. doi:10.1007/978-3-540-74282-1_127 https://doi.org/10.1007/978-3-540-74282-1_127

  4. [4]

    Basak, S. and Suresh, S. (2024). Vehicle detection and type classification in low resolution congested traffic scenes using image super resolution. Multimedia Tools and Applications , 83(8):21825--21847. doi:10.1007/s11042-023-16337-2 https://doi.org/10.1007/s11042-023-16337-2

  5. [5]

    Bautista, D. and Atienza, R. (2022). Scene text recognition with permuted autoregressive sequence models. In European Conference on Computer Vision (ECCV), pages 178--196. doi:10.1007/978-3-031-19815-1_11 https://doi.org/10.1007/978-3-031-19815-1_11

  6. [6]

    Caruana, R. (1997). Multitask learning. Machine learning , 28:41--75. doi:10.1023/A:1007379606734 https://doi.org/10.1023/A:1007379606734

  7. [7]

    Celestino, M. (2021). 10 marcas que mais venderam carros na década. https://www.webmotors.com.br/wm1/noticias/10-marcas-que-mais-venderam-carros-na-decada. Accessed: 2025-02-19

  8. [8]

    Chen, P., Bai, X., and Liu, W. (2014). Vehicle color recognition on urban road by feature context. IEEE Transactions on Intelligent Transportation Systems , 15(5):2340--2346. doi:10.1109/TITS.2014.2308897 https://doi.org/10.1109/TITS.2014.2308897

  9. [9]

    Cubuk, E. D., Zoph, B., Shlens, J., and Le, Q. V. (2020). Randaugment: Practical automated data augmentation with a reduced search space. In 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) , pages 3008--3017. doi:10.1109/CVPRW50498.2020.00359 https://doi.org/10.1109/CVPRW50498.2020.00359

  10. [10]

    Deng, J., Guo, J., Ververas, E., Kotsia, I., and Zafeiriou, S. (2020). RetinaFace: Single-shot multi-level face localisation in the wild. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 5202--5211. doi:10.1109/CVPR42600.2020.00525 https://doi.org/10.1109/CVPR42600.2020.00525

  11. [11]

    Deng, J., Krause, J., and Fei-Fei, L. (2013). Fine-grained crowdsourcing for fine-grained recognition. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) . doi:10.1109/CVPR.2013.81 https://doi.org/10.1109/CVPR.2013.81

  12. [12]

    Dong, Z., Wu, Y., Pei, M., and Jia, Y. (2015). Vehicle type classification using a semisupervised convolutional neural network. IEEE Transactions on Intelligent Transportation Systems , 16(4):2247--2256. doi:10.1109/TITS.2015.2402438 https://doi.org/10.1109/TITS.2015.2402438

  13. [13]

    Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., Uszkoreit, J., and Houlsby, N. (2021). An image is worth 16x16 words: Transformers for image recognition at scale. In International Conference on Learning Representations (ICLR), pages 1--22

  14. [14]

    Du, Y., Chen, Z., Su, Y., Jia, C., and Jiang, Y.-G. (2025). Instruction-guided scene text recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence , pages 1--16. doi:10.1109/TPAMI.2025.3525526 https://doi.org/10.1109/TPAMI.2025.3525526

  15. [15]

    Dule, E., Gökmen, M., and Beratoğlu, M. S. (2010). A convenient feature vector construction for vehicle color recognition. In WSEAS International Conference on Neural Networks, Evolutionary Computing and Fuzzy systems, page 250–255. doi:10.5555/1863431.1863473 https://doi.org/10.5555/1863431.1863473

  16. [16]

    Fan, X. and Zhao, W. (2022). Improving robustness of license plates automatic recognition in natural scenes. IEEE Transactions on Intelligent Transportation Systems , 23(10):18845--18854. doi:10.1109/TITS.2022.3151475 https://doi.org/10.1109/TITS.2022.3151475

  17. [17]

    Farias, V. and Croquer, G. (2023). Por que o carro colorido sumiu? 67% dos veículos no Brasil são brancos, pretos ou cinzas. https://g1.globo.com/economia/noticia/2023/08/20/por-que-o-carro-colorido-sumiu-67percent-dos-veiculos-no-brasil-sao-brancos-pretos-ou-cinzas.ghtml. Accessed: 2025-02-19

  18. [18]

    Ferryman, J. M., Worrall, A. D., Sullivan, G. D., and Baker, K. D. (1995). A generic deformable model for vehicle recognition. In British Machine Vision Conference (BMVC) , page 127–136. doi:10.5555/236190.236202 https://doi.org/10.5555/236190.236202

  19. [19]

    Fu, H., Ma, H., Wang, G., Zhang, X., and Zhang, Y. (2020). MCFF-CNN: Multiscale comprehensive feature fusion convolutional neural network for vehicle color recognition based on residual learning. Neurocomputing, 395:178--187. doi:10.1016/j.neucom.2018.02.111 https://doi.org/10.1016/j.neucom.2018.02.111

  20. [20]

    Geifman, Y. and El-Yaniv, R. (2017). Selective classification for deep neural networks. In International Conference on Neural Information Processing Systems (NeurIPS) , page 4885–4894. doi:10.5555/3295222.3295241 https://doi.org/10.5555/3295222.3295241

  21. [21]

    Gonçalves, G. R., Diniz, M. A., Laroca, R., Menotti, D., and Schwartz, W. R. (2018). Real-time automatic license plate recognition through deep multi-task networks. In Conference on Graphics, Patterns and Images (SIBGRAPI), pages 110--117. doi:10.1109/SIBGRAPI.2018.00021 https://doi.org/10.1109/SIBGRAPI.2018.00021

  22. [22]

    Guo, C., Pleiss, G., Sun, Y., and Weinberger, K. Q. (2017). On calibration of modern neural networks. In Precup, D. and Teh, Y. W., editors, Proceedings of the 34th International Conference on Machine Learning , volume 70 of Proceedings of Machine Learning Research , pages 1321--1330. PMLR. doi:10.5555/3305381.3305518 https://doi.org/10.5555/3305381.3305518

  23. [23]

    Han, K., Xiao, A., Wu, E., Guo, J., XU, C., and Wang, Y. (2021). Transformer in transformer. In Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P., and Vaughan, J. W., editors, Advances in Neural Information Processing Systems , volume 34, pages 15908--15919. Curran Associates, Inc. doi:10.5555/3540261.3541478 https://doi.org/10.5555/3540261.3541478

  24. [24]

    Hassan, A., Ali, M., Durrani, N. M., and Tahir, M. A. (2021). An empirical analysis of deep learning architectures for vehicle make and model recognition. IEEE Access , 9:91487--91499. doi:10.1109/ACCESS.2021.3090766 https://doi.org/10.1109/ACCESS.2021.3090766

  25. [25]

    He, C., Wang, D., Cai, Z., Zeng, J., and Fu, F. (2024a). A vehicle matching algorithm by maximizing travel time probability based on automatic license plate recognition data. IEEE Transactions on Intelligent Transportation Systems , 25(8):9103--9114. doi:10.1109/TITS.2024.3358625 https://doi.org/10.1109/TITS.2024.3358625

  26. [26]

    He, K., Zhang, X., Ren, S., and Sun, J. (2016). Deep residual learning for image recognition. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) , pages 770--778. doi:10.1109/CVPR.2016.90 https://doi.org/10.1109/CVPR.2016.90

  27. [27]

    He, L., Zhou, Y., Liu, L., and Ma, J. (2024b). Research and application of YOLOv11-based object segmentation in intelligent recognition at construction sites. Buildings, 14(12). doi:10.3390/buildings14123777 https://doi.org/10.3390/buildings14123777

  28. [28]

    Howard, A., Sandler, M., Chen, B., Wang, W., Chen, L.-C., Tan, M., Chu, G., Vasudevan, V., Zhu, Y., Pang, R., Adam, H., and Le, Q. (2019). Searching for MobileNetV3. In IEEE/CVF International Conference on Computer Vision (ICCV), pages 1314--1324. doi:10.1109/ICCV.2019.00140 https://doi.org/10.1109/ICCV.2019.00140

  29. [29]

    Hsu, G.-S., Chen, J.-C., and Chung, Y.-Z. (2013). Application-oriented license plate recognition. IEEE Transactions on Vehicular Technology , 62(2):552--561. doi:10.1109/TVT.2012.2226218 https://doi.org/10.1109/TVT.2012.2226218

  30. [30]

    Hu, B., Lai, J.-H., and Guo, C.-C. (2017). Location-aware fine-grained vehicle type recognition using multi-task deep networks. Neurocomputing , 243:60--68. doi:10.1016/j.neucom.2017.02.085 https://doi.org/10.1016/j.neucom.2017.02.085

  31. [31]

    Hu, C., Bai, X., Qi, L., Chen, P., Xue, G., and Mei, L. (2015). Vehicle color recognition with spatial pyramid deep learning. IEEE Transactions on Intelligent Transportation Systems , 16(5):2925--2934. doi:10.1109/TITS.2015.2430892 https://doi.org/10.1109/TITS.2015.2430892

  32. [32]

    Hu, M., Bai, L., Fan, J., Zhao, S., and Chen, E. (2023). Vehicle color recognition based on smooth modulation neural network with multi-scale feature fusion. Frontiers of Computer Science , 17(3):173321. doi:10.1007/s11704-022-1389-x https://doi.org/10.1007/s11704-022-1389-x

  33. [33]

    Huang, C., Li, Y., Loy, C. C., and Tang, X. (2016). Learning deep representation for imbalanced classification. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) , pages 5375--5384. doi:10.1109/CVPR.2016.580 https://doi.org/10.1109/CVPR.2016.580

  34. [34]

    Huang, G., Liu, Z., van der Maaten, L., and Weinberger, K. Q. (2017). Densely connected convolutional networks. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) . doi:10.1109/CVPR.2017.243 https://doi.org/10.1109/CVPR.2017.243

  35. [35]

    Jolly, M.-P., Lakshmanan, S., and Jain, A. (1996). Vehicle segmentation and classification using deformable templates. IEEE Transactions on Pattern Analysis and Machine Intelligence , 18(3):293--308. doi:10.1109/34.485557 https://doi.org/10.1109/34.485557

  36. [36]

    Khanam, R. and Hussain, M. (2024). YOLOv11: An overview of the key architectural enhancements. arXiv preprint. doi:10.48550/arXiv.2410.17725 https://doi.org/10.48550/arXiv.2410.17725

  37. [37]

    Krause, J., Deng, J., Stark, M., and Fei-Fei, L. (2013a). Collecting a large-scale dataset of fine-grained cars. In Second Workshop on Fine-Grained Visual Categorisation (FGVC), in conjunction with CVPR. Available at https://ai.stanford.edu/~jkrause/papers/fgvc13.pdf

  38. [38]

    Krause, J., Stark, M., Deng, J., and Fei-Fei, L. (2013b). 3d object representations for fine-grained categorization. In 2013 IEEE International Conference on Computer Vision Workshops , pages 554--561. doi:10.1109/ICCVW.2013.77 https://doi.org/10.1109/ICCVW.2013.77

  39. [39]

    Kuhn, D. M. and Moreira, V. P. (2021). BRCars: a dataset for fine-grained classification of car images. In 2021 34th SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI), pages 231--238. doi:10.1109/SIBGRAPI54419.2021.00039 https://doi.org/10.1109/SIBGRAPI54419.2021.00039

  40. [40]

    Lai, A., Fung, G., and Yung, N. (2001). Vehicle type classification from visual-based dimension estimation. In IEEE Intelligent Transportation Systems Conference (ITSC) , pages 201--206. doi:10.1109/ITSC.2001.948656 https://doi.org/10.1109/ITSC.2001.948656

  41. [41]

    Laroca, R., Araujo, A. B., Zanlorensi, L. A., De Almeida, E. C., and Menotti, D. (2021). Towards image-based automatic meter reading in unconstrained scenarios: A robust and efficient approach. IEEE Access, 9:67569--67584. doi:10.1109/ACCESS.2021.3077415 https://doi.org/10.1109/ACCESS.2021.3077415

  42. [42]

    Laroca, R., Cardoso, E. V., Lucio, D. R., Estevam, V., and Menotti, D. (2022). On the cross-dataset generalization in license plate recognition. In International Conference on Computer Vision Theory and Applications (VISAPP), pages 166--178. doi:10.5220/0010846800003124 https://doi.org/10.5220/0010846800003124

  43. [43]

    Laroca, R., Estevam, V., Britto Jr., A. S., Minetto, R., and Menotti, D. (2023a). Do we train on test data? The impact of near-duplicates on license plate recognition. In International Joint Conference on Neural Networks (IJCNN), pages 1--8. doi:10.1109/IJCNN54540.2023.10191584 https://doi.org/10.1109/IJCNN54540.2023.10191584

  44. [44]

    Laroca, R., Estevam, V., Moreira, G. J. P., Minetto, R., and Menotti, D. (2025). Advancing multinational license plate recognition through synthetic and real data fusion: A comprehensive evaluation. IET Intelligent Transport Systems, 19(1):e70086. doi:10.1049/itr2.70086 https://doi.org/10.1049/itr2.70086

  45. [45]

    Laroca, R., Severo, E., Zanlorensi, L. A., Oliveira, L. S., Gonçalves, G. R., Schwartz, W. R., and Menotti, D. (2018). A robust real-time automatic license plate recognition based on the YOLO detector. In International Joint Conference on Neural Networks (IJCNN), pages 1--10. doi:10.1109/IJCNN.2018.8489629 https://doi.org/10.1109/IJCNN.2018.8489629

  46. [46]

    Laroca, R., Zanlorensi, L. A., Estevam, V., Minetto, R., and Menotti, D. (2023b). Leveraging model fusion for improved license plate recognition. In Iberoamerican Congress on Pattern Recognition (CIARP), pages 60--75. doi:10.1007/978-3-031-49249-5_5 https://doi.org/10.1007/978-3-031-49249-5_5

  47. [47]

    Lima, G. E., Laroca, R., Santos, E., Nascimento Jr., E., and Menotti, D. (2024). Toward enhancing vehicle color recognition in adverse conditions: A dataset and benchmark. In Conference on Graphics, Patterns and Images (SIBGRAPI), pages 1--6. doi:10.1109/SIBGRAPI62404.2024.10716307 https://doi.org/10.1109/SIBGRAPI62404.2024.10716307

  48. [48]

    Liu, Q., Chen, S.-L., Chen, Y.-X., and Yin, X.-C. (2024). Improving license plate recognition via diverse stylistic plate generation. Pattern Recognition Letters , 183:117--124. doi:10.1016/j.patrec.2024.05.005 https://doi.org/10.1016/j.patrec.2024.05.005

  49. [49]

    Liu, Y.-Y., Liu, Q., Chen, S.-L., Chen, F., and Yin, X.-C. (2024). Irregular license plate recognition via global information integration. In International Conference on Multimedia Modeling, pages 325--339. doi:10.1007/978-3-031-53308-2_24 https://doi.org/10.1007/978-3-031-53308-2_24

  50. [50]

    Liu, Z., Hu, H., Lin, Y., Yao, Z., Xie, Z., Wei, Y., Ning, J., Cao, Y., Zhang, Z., Dong, L., Wei, F., and Guo, B. (2022). Swin transformer v2: Scaling up capacity and resolution. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 11999--12009. doi:10.1109/CVPR52688.2022.01170 https://doi.org/10.1109/CVPR52688.2022.01170

  51. [51]

    Lu, L., Cai, Y., Huang, H., and Wang, P. (2023). An efficient fine-grained vehicle recognition method based on part-level feature optimization. Neurocomputing, 536:40--49. doi:10.1016/j.neucom.2023.03.035 https://doi.org/10.1016/j.neucom.2023.03.035

  52. [52]

    Lucio, D. R., Laroca, R., Zanlorensi, L. A., Moreira, G., and Menotti, D. (2019). Simultaneous iris and periocular region detection using coarse annotations. In Conference on Graphics, Patterns and Images (SIBGRAPI), pages 178--185. doi:10.1109/SIBGRAPI.2019.00032 https://doi.org/10.1109/SIBGRAPI.2019.00032

  53. [53]

    Luo, R., Song, Y., Ye, L., and Su, R. (2024). Dense-TNT: Efficient vehicle type classification neural network using satellite imagery. Sensors, 24(23). doi:10.3390/s24237662 https://doi.org/10.3390/s24237662

  54. [54]

    Ma, X. and Grimson, W. (2005). Edge-based rich representation for vehicle classification. In IEEE International Conference on Computer Vision (ICCV), pages 1185--1192. doi:10.1109/ICCV.2005.80 https://doi.org/10.1109/ICCV.2005.80

  55. [55]

    Minderer, M., Djolonga, J., Romijnders, R., Hubis, F., Zhai, X., Houlsby, N., Tran, D., and Lucic, M. (2021). Revisiting the calibration of modern neural networks. In Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P., and Vaughan, J. W., editors, Advances in Neural Information Processing Systems, volume 34, pages 15682--15694. Curran Associates, Inc. ...

  56. [56]

    Ministério dos Transportes (2024). Frota nacional (junho de 2024). https://www.gov.br/transportes/pt-br/assuntos/transito/conteudo-Senatran/frota-de-veiculos-2024. Accessed: 2025-02-19

  57. [57]

    Nascimento, V., Laroca, R., Ribeiro, R. O., Schwartz, W. R., and Menotti, D. (2024). Enhancing license plate super-resolution: A layout-aware and character-driven approach. In Conference on Graphics, Patterns and Images (SIBGRAPI), pages 1--6. doi:10.1109/SIBGRAPI62404.2024.10716303 https://doi.org/10.1109/SIBGRAPI62404.2024.10716303

  58. [58]

    Nascimento, V., Lima, G. E., Ribeiro, R. O., Schwartz, W. R., Laroca, R., and Menotti, D. (2025). Toward advancing license plate super-resolution in real-world scenarios: A dataset and benchmark. Journal of the Brazilian Computer Society, 1(31):435--449. doi:10.5753/jbcs.2025.5159 https://doi.org/10.5753/jbcs.2025.5159

  59. [59]

    Nyi Myo, N., Boonkong, A., Khampitak, K., and Hormdee, D. (2025). A two-point association tracking system incorporated with YOLOv11 for real-time visual tracking of laparoscopic surgical instruments. IEEE Access, 13:12225--12238. doi:10.1109/ACCESS.2025.3529710 https://doi.org/10.1109/ACCESS.2025.3529710

  60. [60]

    Ochal, M., Patacchiola, M., Vazquez, J., Storkey, A., and Wang, S. (2023). Few-shot learning with class imbalance. IEEE Transactions on Artificial Intelligence, 4(5):1348--1358. doi:10.1109/TAI.2023.3298303 https://doi.org/10.1109/TAI.2023.3298303

  61. [61]

    Oliveira, I. O., Laroca, R., Menotti, D., Fonseca, K. V. O., and Minetto, R. (2021). Vehicle-Rear: A new dataset to explore feature fusion for vehicle identification using convolutional neural networks. IEEE Access, 9:101065--101077. doi:10.1109/ACCESS.2021.3097964 https://doi.org/10.1109/ACCESS.2021.3097964

  62. [62]

    Rao, Z., Yang, D., Chen, N., and Liu, J. (2024). License plate recognition system in unconstrained scenes via a new image correction scheme and improved CRNN. Expert Systems with Applications, 243:122878. doi:10.1016/j.eswa.2023.122878 https://doi.org/10.1016/j.eswa.2023.122878

  63. [63]

    Selvaraju, R. R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., and Batra, D. (2017). Grad-CAM: Visual explanations from deep networks via gradient-based localization. In 2017 IEEE International Conference on Computer Vision (ICCV), pages 618--626. doi:10.1109/ICCV.2017.74 https://doi.org/10.1109/ICCV.2017.74

  64. [64]

    Shvai, N., Hasnat, A., Meicler, A., and Nakib, A. (2020). Accurate classification for automatic vehicle-type recognition based on ensemble classifiers. IEEE Transactions on Intelligent Transportation Systems, 21(3):1288--1297. doi:10.1109/TITS.2019.2906821 https://doi.org/10.1109/TITS.2019.2906821

  65. [65]

    Sochor, J., Herout, A., and Havel, J. (2016). BoxCars: 3D boxes as CNN input for improved fine-grained vehicle recognition. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 3006--3015. doi:10.1109/CVPR.2016.328 https://doi.org/10.1109/CVPR.2016.328

  66. [66]

    Son, J.-W., Park, S.-B., and Kim, K.-J. (2007). A convolution kernel method for color recognition. In International Conference on Advanced Language Processing and Web Information Technology, pages 242--247. doi:10.1109/ALPIT.2007.28 https://doi.org/10.1109/ALPIT.2007.28

  67. [67]

    Tan, M. and Le, Q. (2021). EfficientNetV2: Smaller models and faster training. In International Conference on Machine Learning, pages 10096--10106

  68. [68]

    Ultralytics (2025). YOLOv11. https://docs.ultralytics.com/models/yolo11/. Accessed: 2025-03-04

  69. [69]

    Wang, H., Peng, J., Zhao, Y., and Fu, X. (2020). Multi-path deep CNNs for fine-grained car recognition. IEEE Transactions on Vehicular Technology, 69(10):10484--10493. doi:10.1109/TVT.2020.3009162 https://doi.org/10.1109/TVT.2020.3009162

  70. [70]

    Wang, Y., Wang, C., Zheng, Y., Fu, H., and Ma, H. (2021). Transformer based neural network for fine-grained classification of vehicle color. In International Conference on Multimedia Information Processing and Retrieval (MIPR), pages 118--124. doi:10.1109/MIPR51284.2021.00025 https://doi.org/10.1109/MIPR51284.2021.00025

  71. [71]

    Wojcik, L., Lima, G. E., Nascimento, V., Nascimento Jr., E., Laroca, R., and Menotti, D. (2025). LPLC: A dataset for license plate legibility classification. In Conference on Graphics, Patterns and Images (SIBGRAPI), pages 1--6. doi:10.1109/SIBGRAPI67909.2025.11223367 https://doi.org/10.1109/SIBGRAPI67909.2025.11223367

  72. [72]

    Wolf, S., Loran, D., and Beyerer, J. (2024). Knowledge-distillation-based label smoothing for fine-grained open-set vehicle recognition. In IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW), pages 330--340. doi:10.1109/WACVW60836.2024.00041 https://doi.org/10.1109/WACVW60836.2024.00041

  73. [73]

    Wu, W., QiSen, Z., and Mingjun, W. (2001). A method of vehicle classification using models and neural networks. In IEEE Vehicular Technology Conference, pages 3022--3026. doi:10.1109/VETECS.2001.944158 https://doi.org/10.1109/VETECS.2001.944158

  74. [74]

    Xu, Z., Yang, W., Meng, A., Lu, N., Huang, H., Ying, C., and Huang, L. (2018). Towards end-to-end license plate detection and recognition: A large dataset and baseline. In European Conference on Computer Vision (ECCV). doi:10.1007/978-3-030-01261-8_16 https://doi.org/10.1007/978-3-030-01261-8_16

  75. [75]

    Yang, L., Luo, P., Loy, C. C., and Tang, X. (2015). A large-scale car dataset for fine-grained categorization and verification. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 3973--3981. doi:10.1109/CVPR.2015.7299023 https://doi.org/10.1109/CVPR.2015.7299023

  76. [76]

    Yu, Y., Liu, H., Fu, Y., Jia, W., Yu, J., and Yan, Z. (2022). Embedding pose information for multiview vehicle model recognition. IEEE Transactions on Circuits and Systems for Video Technology, 32(8):5467--5480. doi:10.1109/TCSVT.2022.3151116 https://doi.org/10.1109/TCSVT.2022.3151116

  77. [77]

    Yuan, Y., Zou, W., Zhao, Y., Wang, X., Hu, X., and Komodakis, N. (2017). A robust and efficient approach to license plate detection. IEEE Transactions on Image Processing, 26(3):1102--1114. doi:10.1109/TIP.2016.2631901 https://doi.org/10.1109/TIP.2016.2631901

  78. [78]

    Zhang, L., Wang, P., Li, H., Li, Z., Shen, C., and Zhang, Y. (2021). A robust attentional framework for license plate recognition in the wild. IEEE Transactions on Intelligent Transportation Systems, 22(11):6967--6976. doi:10.1109/TITS.2020.3000072 https://doi.org/10.1109/TITS.2020.3000072

  79. [79]

    Zhang, Q., Zhuo, L., Li, J., Zhang, J., Zhang, H., and Li, X. (2018). Vehicle color recognition using multiple-layer feature representations of lightweight convolutional neural network. Signal Processing, 147:146--153. doi:10.1016/j.sigpro.2018.01.021 https://doi.org/10.1016/j.sigpro.2018.01.021