The devil is in the tails: Fine-grained classification in the wild

Grant Van Horn, Pietro Perona · 2017 · cs.CV · arXiv 1709.01450

2 Pith papers cite this work. Polarity classification is still indexing.

abstract

The world is long-tailed. What does this mean for computer vision and visual recognition? The main two implications are (1) the number of categories we need to consider in applications can be very large, and (2) the number of training examples for most categories can be very small. Current visual recognition algorithms have achieved excellent classification accuracy. However, they require many training examples to reach peak performance, which suggests that long-tailed distributions will not be dealt with well. We analyze this question in the context of eBird, a large fine-grained classification dataset, and a state-of-the-art deep network classification algorithm. We find that (a) peak classification performance on well-represented categories is excellent, (b) given enough data, classification performance suffers only minimally from an increase in the number of classes, (c) classification performance decays precipitously as the number of training examples decreases, (d) surprisingly, transfer learning is virtually absent in current methods. Our findings suggest that our community should come to grips with the question of long tails.

representative citing papers

Cram Less to Fit More: Training Data Pruning Improves Memorization of Facts

cs.CL · 2026-04-09 · conditional · novelty 6.0

Loss-based pruning of training data to limit facts and flatten their frequency distribution enables a 110M-parameter GPT-2 model to memorize 1.3 times more entity facts than standard training, matching a 1.3B-parameter model on the full dataset.

Multi-Level Analyzation of Imbalance to Resolve Non-IID-Ness in Federated Learning

cs.LG · 2026-06-08 · unverdicted · novelty 4.0

FedBB addresses inter-case, inter-class, and inter-client imbalances in federated learning via Positive Negative Balanced loss and Client Balanced Reweighting, outperforming baselines on X-ray and natural image datasets while using limited statistics for privacy.

citing papers explorer

Showing 2 of 2 citing papers.

Cram Less to Fit More: Training Data Pruning Improves Memorization of Facts
cs.CL · 2026-04-09 · conditional · none · ref 85
Multi-Level Analyzation of Imbalance to Resolve Non-IID-Ness in Federated Learning
cs.LG · 2026-06-08 · unverdicted · none · ref 48 · internal anchor
