arxiv: 2604.05271 · v1 · submitted 2026-04-07 · 💻 cs.CV


Toward Unified Fine-Grained Vehicle Classification and Automatic License Plate Recognition

Gabriel E. Lima , Valfride Nascimento , Eduardo Santos , Eduil Nascimento Jr , Rayson Laroca , David Menotti


Pith reviewed 2026-05-10 18:50 UTC · model grok-4.3

classification 💻 cs.CV

keywords: fine-grained vehicle classification, automatic license plate recognition, surveillance dataset, deep learning benchmark, real-world conditions, vehicle attributes, ALPR, FGVC


The pith

A new real-world surveillance dataset provides fine-grained vehicle attributes validated with license plate data, and the paper benchmarks them alongside automatic license plate recognition.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces UFPR-VeSV, a collection of 24,945 surveillance images showing 16,297 distinct vehicles under varied conditions, including nighttime infrared imaging and partial occlusions. It supplies annotations for 13 colors, 26 makes, 136 models, and 14 types, all validated using license plate information; plate text and corner annotations are provided as well. Five deep learning models are benchmarked to expose practical difficulties such as multicolored vehicles, infrared images, and distinct models that share a platform and therefore look alike. The work also applies two optical character recognition models to the plates and explores the joint use of fine-grained classification and plate recognition. This matters for traffic monitoring and criminal investigations, where identification has to stay reliable when a single cue fails.

Core claim

We introduce UFPR-VeSV, a dataset comprising 24,945 images of 16,297 unique vehicles with annotations for 13 colors, 26 makes, 136 models, and 14 types. Collected from the Military Police of Paraná surveillance system, the dataset captures diverse real-world conditions, including partial occlusions, nighttime infrared imaging, and varying lighting. All FGVC annotations were validated using license plate information, with text and corner annotations also being provided. A benchmark using five deep learning models further validated this, revealing specific challenges such as handling multicolored vehicles, infrared images, and distinguishing between vehicle models that share a common platform.

What carries the argument

The UFPR-VeSV dataset, whose fine-grained vehicle labels are validated using license plate information, and which also provides plate text and corner annotations for the same surveillance images.

If this is right

Deep learning models must be trained to handle infrared frames and vehicles with multiple colors on the same body.
Models still struggle to separate vehicle variants built on identical platforms even when given large annotated sets.
Outputs from fine-grained classification can be combined with plate text to resolve cases where either cue alone is insufficient.
The dataset supports development of systems that operate under partial occlusion and changing illumination without controlled lighting.
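The third point above can be made concrete with a small sketch. It assumes an ALPR model that returns several candidate plate strings with confidences, a registry that maps a plate to its registered attributes, and an FGVC prediction for the same image. The registry contents, the function names, and the scoring rule are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Optional, Tuple

Attributes = Dict[str, str]  # e.g. {"make": "fiat", "model": "ducato"}


def attribute_agreement(predicted: Attributes, registered: Attributes) -> float:
    """Fraction of attributes present in both dicts on which they agree."""
    shared = [key for key in predicted if key in registered]
    if not shared:
        return 0.0
    hits = sum(predicted[key] == registered[key] for key in shared)
    return hits / len(shared)


def rerank_plate_candidates(
    candidates: List[Tuple[str, float]],
    fgvc_prediction: Attributes,
    registry: Dict[str, Attributes],
    weight: float = 0.5,
) -> Optional[str]:
    """Pick the candidate with the best mix of OCR confidence and registry match.

    score = (1 - weight) * ocr_confidence + weight * attribute_agreement
    Candidates that are not in the registry get an agreement of 0.
    Returns None when there are no candidates.
    """
    best_plate, best_score = None, float("-inf")
    for plate, ocr_confidence in candidates:
        registered = registry.get(plate)
        agreement = (
            attribute_agreement(fgvc_prediction, registered) if registered else 0.0
        )
        score = (1.0 - weight) * ocr_confidence + weight * agreement
        if score > best_score:
            best_plate, best_score = plate, score
    return best_plate


# Toy data, invented for illustration only.
registry = {
    "ABC1D23": {"make": "fiat", "model": "ducato", "type": "van", "color": "white"},
    "ABC1O23": {"make": "honda", "model": "civic", "type": "car", "color": "black"},
}
candidates = [("ABC1O23", 0.55), ("ABC1D23", 0.45)]  # OCR unsure between O and D
fgvc = {"make": "fiat", "model": "ducato", "type": "van", "color": "white"}
chosen = rerank_plate_candidates(candidates, fgvc, registry)
```

In this toy case the OCR model slightly prefers the wrong string, and the attribute match tips the decision toward the plate registered to a white Fiat Ducato. The linear weighting is only one possible fusion rule.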

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Cross-checking visual attributes against plate data could be applied to improve training sets for other vehicle recognition tasks where ground truth is otherwise expensive to obtain.
An end-to-end network that predicts both attribute vector and plate string in one forward pass might reduce error propagation between the two tasks.
The identified failure modes suggest that specialized data augmentation for nighttime and occluded views would be a direct next step for practitioners.
Law-enforcement pipelines could use the joint outputs to flag inconsistencies between reported vehicle details and observed plates.

Load-bearing premise

License plate readings always supply correct and unambiguous ground truth for a vehicle's color, make, model, and type.

What would settle it

A collection of images in which the visible vehicle body and the information readable from its license plate systematically disagree, such as through plate swaps or misreads, would show whether the validation step holds.
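Such a test could be scored as a per-attribute disagreement rate between plate-derived labels and labels assigned independently from the vehicle body. The following is a minimal sketch, assuming both label sets are available as dictionaries keyed by image ID; the field names and toy records are hypothetical.

```python
from typing import Dict, Iterable

Labels = Dict[str, Dict[str, str]]  # image_id -> {attribute: value}


def disagreement_rates(
    plate_derived: Labels,
    visual: Labels,
    attributes: Iterable[str] = ("color", "make", "model", "type"),
) -> Dict[str, float]:
    """Per-attribute share of images on which the two label sources differ.

    Only images present in both sources, and labelled for the attribute in
    both, are counted. Attributes with no comparable image are omitted.
    """
    rates: Dict[str, float] = {}
    common_ids = plate_derived.keys() & visual.keys()
    for attribute in attributes:
        compared = 0
        mismatched = 0
        for image_id in common_ids:
            from_plate = plate_derived[image_id].get(attribute)
            from_image = visual[image_id].get(attribute)
            if from_plate is None or from_image is None:
                continue
            compared += 1
            if from_plate != from_image:
                mismatched += 1
        if compared:
            rates[attribute] = mismatched / compared
    return rates


# Toy records, invented for illustration only.
plate_derived = {
    "img1": {"color": "white", "make": "fiat"},
    "img2": {"color": "red", "make": "peugeot"},
}
visual = {
    "img1": {"color": "white", "make": "fiat"},
    "img2": {"color": "red", "make": "fiat"},
}
rates = disagreement_rates(plate_derived, visual)
```

A consistently high rate for one attribute would point to a systematic problem with the plate-based validation for that attribute, rather than to isolated annotation slips.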

Figures

Figures reproduced from arXiv: 2604.05271 by David Menotti, Eduardo Santos, Eduil Nascimento Jr, Gabriel E. Lima, Rayson Laroca, Valfride Nascimento.

**Figure 2.** Examples of images featuring multiple vehicles due to camera perspective. The background vehicle is highlighted with a green border, while the main vehicle is shadowed to enhance contrast. (Adjacent text: The dataset spans a wide temporal range, including both daytime and nighttime conditions. While timestamps are not available, images are categorized by the camera’s capture mode.)

**Figure 1.** Distribution of vehicles across the attributes of color (a), make (b), model (c) and type (d) in the UFPR-VeSV dataset. For better visualization, only the 30 most common vehicle models are displayed in (c), representing 63.7% of the total images. (Adjacent text: Images are categorized as either front or rear view based on the visibility of the LP. As a result, the dataset contains 13,842 rear-view and 11,103 frontal-view images.)

**Figure 4.** Examples of LPs cropped from images captured under diverse conditions, showcasing variations in resolution, perspective, and image quality. The corresponding annotated LP text is shown below each image. (Adjacent text: Regarding ALPR, the UFPR-VeSV dataset includes annotations for both LP characters and corner coordinates.)

**Figure 6.** Example of border standardization. (a) original image with a green border; (b) result after removing a 5-pixel margin from all sides. (Adjacent text: The images were not resized to uniform dimensions. As a result, the dataset retains a variety of image sizes, with widths ranging from 89 to 2,110 pixels and heights from 135 to 1,408 pixels.)

**Figure 7.** Examples of vehicles assigned to the “unknown” class for color, make, and model annotations: (a) a red Yamaha motorcycle; (b) a white Volkswagen truck; (c) a red Mercedes-Benz truck. In all cases, the main body of the vehicle is obstructed, rendering the identification of color, make, and model impossible. (Adjacent text: Classes with fewer than 25 samples were adjusted to reduce extreme class imbalance.)
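The class adjustment mentioned in the text next to the Figure 7 caption can be illustrated as follows. The excerpt is cut off before it says how underrepresented classes were handled, so mapping them to a single "others" label is an assumption here (the text next to Figure 9 does mention an “others” class for makes). The threshold of 25 comes from the excerpt; the function name and toy labels are hypothetical.

```python
from collections import Counter
from typing import List


def merge_rare_classes(
    labels: List[str], min_samples: int = 25, other: str = "others"
) -> List[str]:
    """Replace every label that occurs fewer than `min_samples` times by `other`."""
    counts = Counter(labels)
    return [label if counts[label] >= min_samples else other for label in labels]


# Toy label list, invented for illustration only.
labels = ["gol"] * 40 + ["uno"] * 30 + ["parati"] * 3
merged = merge_rare_classes(labels)
merged_counts = Counter(merged)
```

The counts are taken before any relabelling, so the result does not depend on the order of the samples.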

**Figure 8.** Representative images from three public datasets and UFPR-VeSV. Our dataset features significantly more challenging scenarios, with vehicles captured from diverse viewpoints, environments, lighting conditions, image quality levels, and nighttime infrared imaging.

**Figure 10.** Examples of misclassified vehicle models from different manufacturers. Original make and model are displayed below each image. In (a), a rear-view Boxer was misclassified as a Ducato, while in (b) a rear-view Ducato was misclassified as a Boxer. Aside from minor brand-specific markings, the vehicles have a very similar structure, making accurate differentiation challenging.

**Figure 9.** Examples of misclassified multicolored vehicles from different viewpoints. The predicted color is shown below each image. Depending on the camera angle and illumination, a single color can appear dominant, which causes the classifier to classify the vehicle based on that one color. (Adjacent text: For make recognition, the “others” class had the lowest performance (≈ 30%).)

**Figure 11.** Grad-CAM [Selvaraju et al., 2017] attention maps for a Volkswagen Parati. (a) Original image. (b) The make recognition map, focusing on the manufacturer’s badge. (c) The model recognition map, relying on other features, such as the headlights. (Adjacent text: This analysis reveals a key limitation for practical FGVC: isolated models are insufficient, as they can produce logically inconsistent results.)
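One way to catch the logically inconsistent outputs described in the Figure 11 text is to check each predicted model against a table of which models belong to which make. The sketch below is a minimal illustration: the lookup table, the label strings, and the rule of picking the allowed pair with the highest score product (which treats the two classifiers as independent) are all assumptions, not the paper's method.

```python
from typing import Dict, Set, Tuple

# Hypothetical make -> models table; a real one would come from the label set.
MODELS_BY_MAKE: Dict[str, Set[str]] = {
    "volkswagen": {"parati", "gol"},
    "fiat": {"ducato", "uno"},
    "peugeot": {"boxer"},
}


def is_consistent(make: str, model: str) -> bool:
    """True when `model` is listed under `make` in the lookup table."""
    return model in MODELS_BY_MAKE.get(make, set())


def best_consistent_pair(
    make_scores: Dict[str, float], model_scores: Dict[str, float]
) -> Tuple[str, str]:
    """Return the allowed (make, model) pair with the highest score product."""
    best_pair = None
    best_score = -1.0
    for make, make_score in make_scores.items():
        for model in MODELS_BY_MAKE.get(make, set()):
            score = make_score * model_scores.get(model, 0.0)
            if score > best_score:
                best_pair, best_score = (make, model), score
    if best_pair is None:
        raise ValueError("no predicted make appears in the lookup table")
    return best_pair


# Toy scores, invented for illustration only. Taking each argmax separately
# gives ("fiat", "boxer"), which the table rules out.
make_scores = {"fiat": 0.6, "peugeot": 0.4}
model_scores = {"boxer": 0.7, "ducato": 0.3}
naive_pair = (
    max(make_scores, key=make_scores.get),
    max(model_scores, key=model_scores.get),
)
repaired_pair = best_consistent_pair(make_scores, model_scores)
```

A hard constraint like this only repairs the output; it does not address why the two classifiers attend to different image regions.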

**Figure 12.** Examples of misrecognized LPs. For each image, the ground-truth label is shown above the model’s prediction, with incorrectly recognized characters highlighted in red. The failure cases include severely degraded characters (a, h, i), illumination-induced obstructions (b), low-contrast characters (c, j), blurring (d, g), and physically deformed LPs (e, f).

**Figure 13.** Examples of ALPR failures where the LP is illegible due to (a) light glare and (b) occlusion. Despite these failures, our FGVC classifier acts as a fail-safe, correctly identifying the type, make, and model for both vehicles. Color recognition is correct for (b), while the infrared image (a) is classified as “unknown” due to the absence of color information.

Original abstract

Extracting vehicle information from surveillance images is essential for intelligent transportation systems, enabling applications such as traffic monitoring and criminal investigations. While Automatic License Plate Recognition (ALPR) is widely used, Fine-Grained Vehicle Classification (FGVC) offers a complementary approach by identifying vehicles based on attributes such as color, make, model, and type. Although there have been advances in this field, existing studies often assume well-controlled conditions, explore limited attributes, and overlook FGVC integration with ALPR. To address these gaps, we introduce UFPR-VeSV, a dataset comprising 24,945 images of 16,297 unique vehicles with annotations for 13 colors, 26 makes, 136 models, and 14 types. Collected from the Military Police of Paraná (Brazil) surveillance system, the dataset captures diverse real-world conditions, including partial occlusions, nighttime infrared imaging, and varying lighting. All FGVC annotations were validated using license plate information, with text and corner annotations also being provided. A qualitative and quantitative comparison with established datasets confirmed the challenging nature of our dataset. A benchmark using five deep learning models further validated this, revealing specific challenges such as handling multicolored vehicles, infrared images, and distinguishing between vehicle models that share a common platform. Additionally, we apply two optical character recognition models to license plate recognition and explore the joint use of FGVC and ALPR. The results highlight the potential of integrating these complementary tasks for real-world applications. The UFPR-VeSV dataset is publicly available at: https://github.com/Lima001/UFPR-VeSV-Dataset.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This is a straightforward dataset release with real surveillance images and standard benchmarks, but the plate-based label validation lacks any reported accuracy checks.


The core thing to know is that the authors release UFPR-VeSV, a collection of nearly 25,000 images from Brazilian police cameras covering 16k unique vehicles. It supplies annotations for 13 colors, 26 makes, 136 models, and 14 types, plus plate text and corners, all gathered under real conditions including infrared and occlusions. They also run five common deep models to highlight failure cases like multicolored vehicles and platform-sharing models, and they test joint FGVC plus ALPR with two OCR approaches. The data is public, which is the main value here.

Referee Report

1 major / 2 minor

Summary. The manuscript introduces the UFPR-VeSV dataset comprising 24,945 images of 16,297 unique vehicles captured from Brazilian police surveillance cameras under diverse real-world conditions including occlusions, nighttime infrared, and varying lighting. It provides FGVC annotations for 13 colors, 26 makes, 136 models, and 14 types, plus license plate text and corner annotations, with the claim that all FGVC labels were validated using license plate information from police records. The work benchmarks five deep learning models on FGVC, identifies concrete challenges (multicolored vehicles, infrared images, platform-sharing models), compares the dataset qualitatively/quantitatively to prior collections, and explores joint FGVC-ALPR using two OCR models, with the dataset released publicly.

Significance. If the annotations are verifiably accurate, the dataset offers a useful public resource for real-world vehicle recognition research by combining scale, attribute richness, and challenging conditions not fully covered in existing collections. The explicit identification of model failure modes and the preliminary joint FGVC-ALPR experiments provide actionable insights for intelligent transportation applications.

major comments (1)

Dataset description and annotation validation (abstract and corresponding methods section): the central claim that 'All FGVC annotations were validated using license plate information' is load-bearing for the reliability of the reported benchmarks and the identified challenges (e.g., distinguishing platform-sharing models). However, no quantitative validation error rate, description of the matching procedure (e.g., registry lookup under OCR failures or plate swaps), or independent visual re-annotation cross-check is supplied. This omission leaves the ground-truth quality unquantified and risks systematic label noise affecting the benchmark conclusions.

minor comments (2)

Abstract: limited detail on training protocols, exact performance numbers, and statistical significance of the joint FGVC-ALPR experiments reduces the ability to assess results at a glance; adding key metrics would strengthen the summary.
Comparison with established datasets: ensure the qualitative/quantitative comparison section explicitly cites prior FGVC and ALPR datasets and tabulates key differences (e.g., attribute coverage, condition diversity) for clarity.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive feedback, which helps strengthen the presentation of our dataset and its validation. We address the major comment below and will revise the manuscript to provide the requested details.


Referee: Dataset description and annotation validation (abstract and corresponding methods section): the central claim that 'All FGVC annotations were validated using license plate information' is load-bearing for the reliability of the reported benchmarks and the identified challenges (e.g., distinguishing platform-sharing models). However, no quantitative validation error rate, description of the matching procedure (e.g., registry lookup under OCR failures or plate swaps), or independent visual re-annotation cross-check is supplied. This omission leaves the ground-truth quality unquantified and risks systematic label noise affecting the benchmark conclusions.

Authors: We agree that the current manuscript provides insufficient detail on the annotation validation process. In the revised version, we will expand the methods section with a new subsection describing the validation procedure in full. This will include: (i) the exact matching workflow between image-derived license plates and the police registry records, (ii) explicit handling of OCR failures and potential plate swaps or mismatches, and (iii) the results of an independent visual re-annotation cross-check performed on a random subset of samples. We will also report the quantitative validation error rate (number of discrepancies found and how they were resolved). These additions will allow readers to assess ground-truth reliability directly and mitigate concerns about label noise.

Revision: yes
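The promised error rate could be reported with an interval rather than a bare count. The following is a minimal sketch, assuming a random subset of images is re-annotated and the discrepancies are counted. The subset size and discrepancy count are invented for illustration, and the Wilson score interval is one common choice, not something the authors specify.

```python
import math
import random
from typing import List, Tuple


def sample_for_reannotation(image_ids: List[str], k: int, seed: int = 0) -> List[str]:
    """Draw k distinct image IDs uniformly at random, reproducibly."""
    rng = random.Random(seed)
    return rng.sample(image_ids, k)


def wilson_interval(errors: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for an error proportion (z = 1.96 for about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = errors / n
    denominator = 1.0 + z * z / n
    centre = (p + z * z / (2.0 * n)) / denominator
    half_width = (
        z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denominator
    )
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Hypothetical numbers: 500 images re-annotated, 10 discrepancies found.
image_ids = [f"img_{i}" for i in range(24945)]
subset = sample_for_reannotation(image_ids, k=500)
low, high = wilson_interval(errors=10, n=500)
```

Reporting `low` and `high` next to the raw discrepancy count lets readers judge how much label noise is plausible for the full set of 24,945 images.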

Circularity Check

0 steps flagged

No circularity: dataset release and standard benchmarks are self-contained


The paper introduces the UFPR-VeSV dataset with FGVC annotations (validated via external license plate records from police sources) and runs benchmarks on five off-the-shelf deep learning models. No equations, parameter fitting, or predictions appear in the provided text. No self-citations are invoked as load-bearing uniqueness theorems or ansatzes. The annotation validation step is a procedural claim about data collection, not a derivation that reduces to its own inputs by construction. This matches the default expectation of no significant circularity for a dataset-plus-benchmark paper.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No free parameters, axioms, or invented entities are introduced; the contribution rests on standard image collection, annotation via existing license plate data, and off-the-shelf deep learning models.



Reference graph

Works this paper leans on

79 extracted references · 71 canonical work pages · 1 internal anchor

[2] Amirkhani, A. and Barshooi, A. H. (2023). Deepcar 5.0: Vehicle make and model recognition under challenging conditions. IEEE Transactions on Intelligent Transportation Systems, 24(1):541–553. https://doi.org/10.1109/TITS.2022.3212921

[3] Baek, N., Park, S.-M., Kim, K.-J., and Park, S.-B. (2007). Vehicle color classification based on the support vector machine method. In International Conference on Intelligent Computing, pages 1133–1139. https://doi.org/10.1007/978-3-540-74282-1_127

[4] Basak, S. and Suresh, S. (2024). Vehicle detection and type classification in low resolution congested traffic scenes using image super resolution. Multimedia Tools and Applications, 83(8):21825–21847. https://doi.org/10.1007/s11042-023-16337-2

[5] Bautista, D. and Atienza, R. (2022). Scene text recognition with permuted autoregressive sequence models. In European Conference on Computer Vision (ECCV), pages 178–196. https://doi.org/10.1007/978-3-031-19815-1_11

[6] Caruana, R. (1997). Multitask learning. Machine learning, 28:41–75. https://doi.org/10.1023/A:1007379606734

[7] Celestino, M. (2021). 10 marcas que mais venderam carros na década. https://www.webmotors.com.br/wm1/noticias/10-marcas-que-mais-venderam-carros-na-decada. Accessed: 2025-02-19.

[8] Chen, P., Bai, X., and Liu, W. (2014). Vehicle color recognition on urban road by feature context. IEEE Transactions on Intelligent Transportation Systems, 15(5):2340–2346. https://doi.org/10.1109/TITS.2014.2308897

[9] Cubuk, E. D., Zoph, B., Shlens, J., and Le, Q. V. (2020). Randaugment: Practical automated data augmentation with a reduced search space. In 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pages 3008–3017. https://doi.org/10.1109/CVPRW50498.2020.00359

[10] Deng, J., Guo, J., Ververas, E., Kotsia, I., and Zafeiriou, S. (2020). RetinaFace: Single-shot multi-level face localisation in the wild. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 5202–5211. https://doi.org/10.1109/CVPR42600.2020.00525
[11] Deng, J., Krause, J., and Fei-Fei, L. (2013). Fine-grained crowdsourcing for fine-grained recognition. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR). https://doi.org/10.1109/CVPR.2013.81

[12] Dong, Z., Wu, Y., Pei, M., and Jia, Y. (2015). Vehicle type classification using a semisupervised convolutional neural network. IEEE Transactions on Intelligent Transportation Systems, 16(4):2247–2256. https://doi.org/10.1109/TITS.2015.2402438

[13] Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., Uszkoreit, J., and Houlsby, N. (2021). An image is worth 16x16 words: Transformers for image recognition at scale. In International Conference on Learning Representations (ICLR), pages 1–22.

[14] Du, Y., Chen, Z., Su, Y., Jia, C., and Jiang, Y.-G. (2025). Instruction-guided scene text recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, pages 1–16. https://doi.org/10.1109/TPAMI.2025.3525526

[15] Dule, E., Gökmen, M., and Beratoğlu, M. S. (2010). A convenient feature vector construction for vehicle color recognition. In WSEAS International Conference on Neural Networks, Evolutionary Computing and Fuzzy systems, page 250–255. https://doi.org/10.5555/1863431.1863473

[16] Fan, X. and Zhao, W. (2022). Improving robustness of license plates automatic recognition in natural scenes. IEEE Transactions on Intelligent Transportation Systems, 23(10):18845–18854. https://doi.org/10.1109/TITS.2022.3151475

[17] Farias, V. and Croquer, G. (2023). Por que o carro colorido sumiu? 67% https://g1.globo.com/economia/noticia/2023/08/20/por-que-o-carro-colorido-sumiu-67percent-dos-veiculos-no-brasil-sao-brancos-pretos-ou-cinzas.ghtml. Accessed: 2025-02-19.

[18] Ferryman, J. M., Worrall, A. D., Sullivan, G. D., and Baker, K. D. (1995). A generic deformable model for vehicle recognition. In British Machine Vision Conference (BMVC), page 127–136. https://doi.org/10.5555/236190.236202

[19] Fu, H., Ma, H., Wang, G., Zhang, X., and Zhang, Y. (2020). MCFF-CNN: Multiscale comprehensive feature fusion convolutional neural network for vehicle color recognition based on residual learning. Neurocomputing, 395:178–187. https://doi.org/10.1016/j.neucom.2018.02.111

[20] Geifman, Y. and El-Yaniv, R. (2017). Selective classification for deep neural networks. In International Conference on Neural Information Processing Systems (NeurIPS), page 4885–4894. https://doi.org/10.5555/3295222.3295241
[21] Gonçalves, G. R., Diniz, M. A., Laroca, R., Menotti, D., and Schwartz, W. R. (2018). Real-time automatic license plate recognition through deep multi-task networks. In Conference on Graphics, Patterns and Images (SIBGRAPI), pages 110–117. https://doi.org/10.1109/SIBGRAPI.2018.00021

[22] Guo, C., Pleiss, G., Sun, Y., and Weinberger, K. Q. (2017). On calibration of modern neural networks. In Precup, D. and Teh, Y. W., editors, Proceedings of the 34th International Conference on Machine Learning, volume 70 of Proceedings of Machine Learning Research, pages 1321–1330. PMLR. https://doi.org/10.5555/3305381.3305518

[23] Han, K., Xiao, A., Wu, E., Guo, J., Xu, C., and Wang, Y. (2021). Transformer in transformer. In Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P., and Vaughan, J. W., editors, Advances in Neural Information Processing Systems, volume 34, pages 15908–15919. Curran Associates, Inc. https://doi.org/10.5555/3540261.3541478

[24] Hassan, A., Ali, M., Durrani, N. M., and Tahir, M. A. (2021). An empirical analysis of deep learning architectures for vehicle make and model recognition. IEEE Access, 9:91487–91499. https://doi.org/10.1109/ACCESS.2021.3090766

[25] He, C., Wang, D., Cai, Z., Zeng, J., and Fu, F. (2024a). A vehicle matching algorithm by maximizing travel time probability based on automatic license plate recognition data. IEEE Transactions on Intelligent Transportation Systems, 25(8):9103–9114. https://doi.org/10.1109/TITS.2024.3358625

[26] He, K., Zhang, X., Ren, S., and Sun, J. (2016). Deep residual learning for image recognition. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 770–778. https://doi.org/10.1109/CVPR.2016.90

[27] He, L., Zhou, Y., Liu, L., and Ma, J. (2024b). Research and application of YOLOv11-based object segmentation in intelligent recognition at construction sites. Buildings, 14(12). https://doi.org/10.3390/buildings14123777

[28] Howard, A., Sandler, M., Chen, B., Wang, W., Chen, L.-C., Tan, M., Chu, G., Vasudevan, V., Zhu, Y., Pang, R., Adam, H., and Le, Q. (2019). Searching for MobileNetV3. In IEEE/CVF International Conference on Computer Vision (ICCV), pages 1314–1324. https://doi.org/10.1109/ICCV.2019.00140

[29] Hsu, G.-S., Chen, J.-C., and Chung, Y.-Z. (2013). Application-oriented license plate recognition. IEEE Transactions on Vehicular Technology, 62(2):552–561. https://doi.org/10.1109/TVT.2012.2226218

[30] Hu, B., Lai, J.-H., and Guo, C.-C. (2017). Location-aware fine-grained vehicle type recognition using multi-task deep networks. Neurocomputing, 243:60–68. https://doi.org/10.1016/j.neucom.2017.02.085
[31] Hu, C., Bai, X., Qi, L., Chen, P., Xue, G., and Mei, L. (2015). Vehicle color recognition with spatial pyramid deep learning. IEEE Transactions on Intelligent Transportation Systems, 16(5):2925–2934. https://doi.org/10.1109/TITS.2015.2430892

[32] Hu, M., Bai, L., Fan, J., Zhao, S., and Chen, E. (2023). Vehicle color recognition based on smooth modulation neural network with multi-scale feature fusion. Frontiers of Computer Science, 17(3):173321. https://doi.org/10.1007/s11704-022-1389-x

[33] Huang, C., Li, Y., Loy, C. C., and Tang, X. (2016). Learning deep representation for imbalanced classification. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 5375–5384. https://doi.org/10.1109/CVPR.2016.580

[34] Huang, G., Liu, Z., van der Maaten, L., and Weinberger, K. Q. (2017). Densely connected convolutional networks. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR). https://doi.org/10.1109/CVPR.2017.243

[35] Jolly, M.-P., Lakshmanan, S., and Jain, A. (1996). Vehicle segmentation and classification using deformable templates. IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(3):293–308. https://doi.org/10.1109/34.485557

[36] Khanam, R. and Hussain, M. (2024). YOLOv11: An overview of the key architectural enhancements. arXiv preprint. https://doi.org/10.48550/arXiv.2410.17725

[37] Krause, J., Deng, J., Stark, M., and Fei-Fei, L. (2013a). Collecting a large-scale dataset of fine-grained cars. In Second Workshop on Fine-Grained Visual Categorisation (FGVC), in conjunction with CVPR. Available at https://ai.stanford.edu/~jkrause/papers/fgvc13.pdf

[38] Krause, J., Stark, M., Deng, J., and Fei-Fei, L. (2013b). 3d object representations for fine-grained categorization. In 2013 IEEE International Conference on Computer Vision Workshops, pages 554–561. https://doi.org/10.1109/ICCVW.2013.77

[39] Kuhn, D. M. and Moreira, V. P. (2021). BRCars: a dataset for fine-grained classification of car images. In 2021 34th SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI), pages 231–238. https://doi.org/10.1109/SIBGRAPI54419.2021.00039

[40] Lai, A., Fung, G., and Yung, N. (2001). Vehicle type classification from visual-based dimension estimation. In IEEE Intelligent Transportation Systems Conference (ITSC), pages 201–206. https://doi.org/10.1109/ITSC.2001.948656
[41]

B., Zanlorensi , L

Laroca , R., Araujo , A. B., Zanlorensi , L. A., De Almeida , E. C., and Menotti , D. (2021). Towards image-based automatic meter reading in unconstrained scenarios: A robust and efficient approach. IEEE Access , 9:67569--67584. doi:10.1109/ACCESS.2021.3077415 https://doi.org/10.1109/ACCESS.2021.3077415

work page doi:10.1109/access.2021.3077415 2021
[42]

V., Lucio , D

Laroca , R., Cardoso , E. V., Lucio , D. R., Estevam , V., and Menotti , D. (2022). On the cross-dataset generalization in license plate recognition. In International Conference on Computer Vision Theory and Applications (VISAPP) , pages 166--178. doi:10.5220/0010846800003124 https://doi.org/10.5220/0010846800003124

work page doi:10.5220/0010846800003124 2022
[43]

Laroca , R., Estevam , V., Britto Jr. , A. S., Minetto , R., and Menotti , D. (2023a). Do we train on test data? T he impact of near-duplicates on license plate recognition. In International Joint Conference on Neural Networks (IJCNN) , pages 1--8. doi:10.1109/IJCNN54540.2023.10191584 https://doi.org/10.1109/IJCNN54540.2023.10191584

work page doi:10.1109/ijcnn54540.2023.10191584 2023
[44]

Laroca , R., Estevam , V., Moreira , G. J. P., Minetto , R., and Menotti , D. (2025). Advancing multinational license plate recognition through synthetic and real data fusion: A comprehensive evaluation. IET Intelligent Transport Systems , 19(1):e70086. doi:10.1049/itr2.70086 https://doi.org/10.1049/itr2.70086

work page doi:10.1049/itr2.70086 2025
[45]

Laroca, R., Severo, E., Zanlorensi, L. A., Oliveira, L. S., Gonçalves, G. R., Schwartz, W. R., and Menotti, D. (2018). A robust real-time automatic license plate recognition based on the YOLO detector. In International Joint Conference on Neural Networks (IJCNN), pages 1--10. doi:10.1109/IJCNN.2018.8489629 https://doi.org/10.1109/IJCNN.2018.8489629

work page doi:10.1109/ijcnn.2018.8489629 2018
[46]

Laroca, R., Zanlorensi, L. A., Estevam, V., Minetto, R., and Menotti, D. (2023b). Leveraging model fusion for improved license plate recognition. In Iberoamerican Congress on Pattern Recognition (CIARP), pages 60--75. doi:10.1007/978-3-031-49249-5_5 https://doi.org/10.1007/978-3-031-49249-5_5

work page doi:10.1007/978-3-031-49249-5
[47]

Lima, G. E., Laroca, R., Santos, E., Nascimento Jr., E., and Menotti, D. (2024). Toward enhancing vehicle color recognition in adverse conditions: A dataset and benchmark. In Conference on Graphics, Patterns and Images (SIBGRAPI), pages 1--6. doi:10.1109/SIBGRAPI62404.2024.10716307 https://doi.org/10.1109/SIBGRAPI62404.2024.10716307

work page doi:10.1109/sibgrapi62404.2024.10716307 2024
[48]

Liu, Q., Chen, S.-L., Chen, Y.-X., and Yin, X.-C. (2024). Improving license plate recognition via diverse stylistic plate generation. Pattern Recognition Letters, 183:117--124. doi:10.1016/j.patrec.2024.05.005 https://doi.org/10.1016/j.patrec.2024.05.005

work page doi:10.1016/j.patrec.2024.05.005 2024
[49]

Liu, Y.-Y., Liu, Q., Chen, S.-L., Chen, F., and Yin, X.-C. (2024). Irregular license plate recognition via global information integration. In International Conference on Multimedia Modeling, pages 325--339. doi:10.1007/978-3-031-53308-2_24 https://doi.org/10.1007/978-3-031-53308-2_24

work page doi:10.1007/978-3-031-53308-2 2024
[50]

Liu, Z., Hu, H., Lin, Y., Yao, Z., Xie, Z., Wei, Y., Ning, J., Cao, Y., Zhang, Z., Dong, L., Wei, F., and Guo, B. (2022). Swin transformer v2: Scaling up capacity and resolution. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 11999--12009. doi:10.1109/CVPR52688.2022.01170 https://doi.org/10.1109/CVPR52688.2022.01170

work page doi:10.1109/cvpr52688.2022.01170 2022
[51]

Lu, L., Cai, Y., Huang, H., and Wang, P. (2023). An efficient fine-grained vehicle recognition method based on part-level feature optimization. Neurocomputing, 536:40--49. doi:10.1016/j.neucom.2023.03.035 https://doi.org/10.1016/j.neucom.2023.03.035

work page doi:10.1016/j.neucom.2023.03.035 2023
[52]

Lucio, D. R., Laroca, R., Zanlorensi, L. A., Moreira, G., and Menotti, D. (2019). Simultaneous iris and periocular region detection using coarse annotations. In Conference on Graphics, Patterns and Images (SIBGRAPI), pages 178--185. doi:10.1109/SIBGRAPI.2019.00032 https://doi.org/10.1109/SIBGRAPI.2019.00032

work page doi:10.1109/sibgrapi.2019.00032 2019
[53]

Luo, R., Song, Y., Ye, L., and Su, R. (2024). Dense-tnt: Efficient vehicle type classification neural network using satellite imagery. Sensors, 24(23). doi:10.3390/s24237662 https://doi.org/10.3390/s24237662

work page doi:10.3390/s24237662 2024
[54]

Ma, X. and Grimson, W. (2005). Edge-based rich representation for vehicle classification. In IEEE International Conference on Computer Vision (ICCV), pages 1185--1192. doi:10.1109/ICCV.2005.80 https://doi.org/10.1109/ICCV.2005.80

work page doi:10.1109/iccv.2005.80 2005
[55]

Minderer, M., Djolonga, J., Romijnders, R., Hubis, F., Zhai, X., Houlsby, N., Tran, D., and Lucic, M. (2021). Revisiting the calibration of modern neural networks. In Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P., and Vaughan, J. W., editors, Advances in Neural Information Processing Systems, volume 34, pages 15682--15694. Curran Associates, Inc. ...

work page doi:10.5555/3540261.3541461 2021
[56]

Ministério dos Transportes (2024). Frota nacional (junho de 2024). https://www.gov.br/transportes/pt-br/assuntos/transito/conteudo-Senatran/frota-de-veiculos-2024. Accessed: 2025-02-19

2024
[57]

Nascimento, V., Laroca, R., Ribeiro, R. O., Schwartz, W. R., and Menotti, D. (2024). Enhancing license plate super-resolution: A layout-aware and character-driven approach. Conference on Graphics, Patterns and Images (SIBGRAPI), pages 1--6. doi:10.1109/SIBGRAPI62404.2024.10716303 https://doi.org/10.1109/SIBGRAPI62404.2024.10716303

work page doi:10.1109/sibgrapi62404.2024.10716303 2024
[58]

Nascimento, V., Lima, G. E., Ribeiro, R. O., Schwartz, W. R., Laroca, R., and Menotti, D. (2025). Toward advancing license plate super-resolution in real-world scenarios: A dataset and benchmark. Journal of the Brazilian Computer Society, 1(31):435--449. doi:10.5753/jbcs.2025.5159 https://doi.org/10.5753/jbcs.2025.5159

work page doi:10.5753/jbcs.2025.5159 2025
[59]

Nyi Myo, N., Boonkong, A., Khampitak, K., and Hormdee, D. (2025). A two-point association tracking system incorporated with YOLOv11 for real-time visual tracking of laparoscopic surgical instruments. IEEE Access, 13:12225--12238. doi:10.1109/ACCESS.2025.3529710 https://doi.org/10.1109/ACCESS.2025.3529710

work page doi:10.1109/access.2025.3529710 2025
[60]

Ochal, M., Patacchiola, M., Vazquez, J., Storkey, A., and Wang, S. (2023). Few-shot learning with class imbalance. IEEE Transactions on Artificial Intelligence, 4(5):1348--1358. doi:10.1109/TAI.2023.3298303 https://doi.org/10.1109/TAI.2023.3298303

work page doi:10.1109/tai.2023.3298303 2023
[61]

Oliveira, I. O., Laroca, R., Menotti, D., Fonseca, K. V. O., and Minetto, R. (2021). Vehicle-Rear: A new dataset to explore feature fusion for vehicle identification using convolutional neural networks. IEEE Access, 9:101065--101077. doi:10.1109/ACCESS.2021.3097964 https://doi.org/10.1109/ACCESS.2021.3097964

work page doi:10.1109/access.2021.3097964 2021
[62]

Rao, Z., Yang, D., Chen, N., and Liu, J. (2024). License plate recognition system in unconstrained scenes via a new image correction scheme and improved CRNN. Expert Systems with Applications, 243:122878. doi:10.1016/j.eswa.2023.122878 https://doi.org/10.1016/j.eswa.2023.122878

work page doi:10.1016/j.eswa.2023.122878 2024
[63]

Selvaraju, R. R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., and Batra, D. (2017). Grad-cam: Visual explanations from deep networks via gradient-based localization. In 2017 IEEE International Conference on Computer Vision (ICCV), pages 618--626. doi:10.1109/ICCV.2017.74 https://doi.org/10.1109/ICCV.2017.74

work page doi:10.1109/iccv.2017.74 2017
[64]

Shvai, N., Hasnat, A., Meicler, A., and Nakib, A. (2020). Accurate classification for automatic vehicle-type recognition based on ensemble classifiers. IEEE Transactions on Intelligent Transportation Systems, 21(3):1288--1297. doi:10.1109/TITS.2019.2906821 https://doi.org/10.1109/TITS.2019.2906821

work page doi:10.1109/tits.2019.2906821 2020
[65]

Sochor, J., Herout, A., and Havel, J. (2016). BoxCars: 3D boxes as CNN input for improved fine-grained vehicle recognition. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 3006--3015. doi:10.1109/CVPR.2016.328 https://doi.org/10.1109/CVPR.2016.328

work page doi:10.1109/cvpr.2016.328 2016
[66]

Son, J.-W., Park, S.-B., and Kim, K.-J. (2007). A convolution kernel method for color recognition. In International Conference on Advanced Language Processing and Web Information Technology, pages 242--247. doi:10.1109/ALPIT.2007.28 https://doi.org/10.1109/ALPIT.2007.28

work page doi:10.1109/alpit.2007.28 2007
[67]

Tan, M. and Le, Q. (2021). EfficientNetV2: Smaller models and faster training. In International Conf. on Machine Learning, pages 10096--10106

2021
[68]

Ultralytics (2025). YOLOv11. https://docs.ultralytics.com/models/yolo11/. Accessed: 2025-03-04

2025
[69]

Wang, H., Peng, J., Zhao, Y., and Fu, X. (2020). Multi-path deep CNNs for fine-grained car recognition. IEEE Transactions on Vehicular Technology, 69(10):10484--10493. doi:10.1109/TVT.2020.3009162 https://doi.org/10.1109/TVT.2020.3009162

work page doi:10.1109/tvt.2020.3009162 2020
[70]

Wang, Y., Wang, C., Zheng, Y., Fu, H., and Ma, H. (2021). Transformer based neural network for fine-grained classification of vehicle color. In International Conference on Multimedia Information Processing and Retrieval (MIPR), pages 118--124. doi:10.1109/MIPR51284.2021.00025 https://doi.org/10.1109/MIPR51284.2021.00025

work page doi:10.1109/mipr51284.2021.00025 2021
[71]

Wojcik, L., Lima, G. E., Nascimento, V., Nascimento Jr., E., Laroca, R., and Menotti, D. (2025). LPLC: A dataset for license plate legibility classification. Conference on Graphics, Patterns and Images (SIBGRAPI), pages 1--6. doi:10.1109/SIBGRAPI67909.2025.11223367 https://doi.org/10.1109/SIBGRAPI67909.2025.11223367

work page doi:10.1109/sibgrapi67909.2025.11223367 2025
[72]

Wolf, S., Loran, D., and Beyerer, J. (2024). Knowledge-distillation-based label smoothing for fine-grained open-set vehicle recognition. In IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW), pages 330--340. doi:10.1109/WACVW60836.2024.00041 https://doi.org/10.1109/WACVW60836.2024.00041

work page doi:10.1109/wacvw60836.2024.00041 2024
[73]

Wu, W., QiSen, Z., and Mingjun, W. (2001). A method of vehicle classification using models and neural networks. In IEEE Vehicular Technology Conference, pages 3022--3026. doi:10.1109/VETECS.2001.944158 https://doi.org/10.1109/VETECS.2001.944158

work page doi:10.1109/vetecs.2001.944158 2001
[74]

Xu, Z., Yang, W., Meng, A., Lu, N., Huang, H., Ying, C., and Huang, L. (2018). Towards end-to-end license plate detection and recognition: A large dataset and baseline. In European Conference on Computer Vision (ECCV). doi:10.1007/978-3-030-01261-8_16 https://doi.org/10.1007/978-3-030-01261-8_16

work page doi:10.1007/978-3-030-01261-8 2018
[75]

Yang, L., Luo, P., Loy, C. C., and Tang, X. (2015). A large-scale car dataset for fine-grained categorization and verification. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 3973--3981. doi:10.1109/CVPR.2015.7299023 https://doi.org/10.1109/CVPR.2015.7299023

work page doi:10.1109/cvpr.2015.7299023 2015
[76]

Yu, Y., Liu, H., Fu, Y., Jia, W., Yu, J., and Yan, Z. (2022). Embedding pose information for multiview vehicle model recognition. IEEE Transactions on Circuits and Systems for Video Technology, 32(8):5467--5480. doi:10.1109/TCSVT.2022.3151116 https://doi.org/10.1109/TCSVT.2022.3151116

work page doi:10.1109/tcsvt.2022.3151116 2022
[77]

Yuan, Y., Zou, W., Zhao, Y., Wang, X., Hu, X., and Komodakis, N. (2017). A robust and efficient approach to license plate detection. IEEE Transactions on Image Processing, 26(3):1102--1114. doi:10.1109/TIP.2016.2631901 https://doi.org/10.1109/TIP.2016.2631901

work page doi:10.1109/tip.2016.2631901 2017
[78]

Zhang, L., Wang, P., Li, H., Li, Z., Shen, C., and Zhang, Y. (2021). A robust attentional framework for license plate recognition in the wild. IEEE Transactions on Intelligent Transportation Systems, 22(11):6967--6976. doi:10.1109/TITS.2020.3000072 https://doi.org/10.1109/TITS.2020.3000072

work page doi:10.1109/tits.2020.3000072 2021
[79]

Zhang, Q., Zhuo, L., Li, J., Zhang, J., Zhang, H., and Li, X. (2018). Vehicle color recognition using multiple-layer feature representations of lightweight convolutional neural network. Signal Processing, 147:146--153. doi:10.1016/j.sigpro.2018.01.021 https://doi.org/10.1016/j.sigpro.2018.01.021

work page doi:10.1016/j.sigpro.2018.01.021 2018