Gradient-based learning applied to document recognition
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CV (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Vision Transformers and Convolutional Neural Networks for Land Use Scene Classification
Empirical comparison finds CNNs more robust on small datasets with local textures while ViTs perform better on complex global scenes but require more data and compute.