Accelerated Training for Massive Classification via Dynamic Class Selection
Massive classification, a classification task defined over a vast number of classes (hundreds of thousands or even millions), has become an essential part of many real-world systems, such as face recognition. Existing methods, including the deep networks that achieved remarkable success in recent years, were mostly devised for problems with a moderate number of classes. They would meet with substantial difficulties, e.g. excessive memory demand and computational cost, when applied to massive problems. We present a new method to tackle this problem. This method can efficiently and accurately identify a small number of "active classes" for each mini-batch, based on a set of dynamic class hierarchies constructed on the fly. We also develop an adaptive allocation scheme thereon, which leads to a better tradeoff between performance and cost. On several large-scale benchmarks, our method significantly reduces the training cost and memory demand, while maintaining competitive performance.
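The core idea in the abstract, restricting each mini-batch to a small set of "active classes", can be illustrated with a minimal sketch. This is not the authors' method: the dynamic class hierarchies and the adaptive allocation scheme are not reproduced here. A brute-force top-k search over all class weight vectors stands in for the hierarchy lookup, and the function names, the parameter `k`, and the toy weights are all hypothetical.

```python
import math


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def select_active_classes(features, weights, labels, k):
    """Return the sorted class indices treated as active for one mini-batch.

    Assumption: brute-force top-k by dot-product score per sample. The paper
    uses class hierarchies built on the fly to avoid scoring every class.
    Ground-truth labels are always included so the loss is well defined.
    """
    active = set(labels)
    for x in features:
        scores = [dot(w, x) for w in weights]
        top = sorted(range(len(weights)), key=lambda c: scores[c], reverse=True)[:k]
        active.update(top)
    return sorted(active)


def restricted_softmax_loss(x, weights, label, active):
    """Cross-entropy where the softmax is normalised over `active` only."""
    logits = {c: dot(weights[c], x) for c in active}
    m = max(logits.values())  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(v - m) for v in logits.values()))
    return log_z - logits[label]


# Toy problem: 6 classes with 2-D weight vectors, a mini-batch of 2 samples.
W = [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0], [0.1, 0.9]]
X = [[1.0, 0.0], [0.0, 1.0]]
y = [0, 3]

active = select_active_classes(X, W, y, k=2)
loss_active = restricted_softmax_loss(X[0], W, y[0], active)
loss_full = restricted_softmax_loss(X[0], W, y[0], range(len(W)))
```

Because the restricted partition function sums over fewer classes, `loss_active` never exceeds `loss_full`. The approximation is close when the omitted classes have low scores, which is why the selection step needs to be accurate as well as cheap.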
Forward citations
Cited by 1 Pith paper
- EarthSight: A Distributed Framework for Low-Latency Satellite Intelligence
EarthSight reduces average compute time per image by 1.9x and 90th-percentile end-to-end latency from 51 to 21 minutes by distributing inference decisions between orbit and ground with shared backbones and early rejec...