Slimmable neural networks

5 Pith papers cite this work. Polarity classification is still indexing.

abstract

We present a simple and general method to train a single neural network executable at different widths (number of channels in a layer), permitting instant and adaptive accuracy-efficiency trade-offs at runtime. Instead of training individual networks with different width configurations, we train a shared network with switchable batch normalization. At runtime, the network can adjust its width on the fly according to on-device benchmarks and resource constraints, rather than downloading and offloading different models. Our trained networks, named slimmable neural networks, achieve ImageNet classification accuracy similar to (and in many cases better than) that of individually trained models of MobileNet v1, MobileNet v2, ShuffleNet and ResNet-50 at the respective widths. We also demonstrate better performance of slimmable models compared with individual ones across a wide range of applications including COCO bounding-box object detection, instance segmentation and person keypoint detection without tuning hyper-parameters. Lastly, we visualize and discuss the learned features of slimmable networks. Code and models are available at: https://github.com/JiahuiYu/slimmable_networks
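The idea of a shared network with switchable batch normalization can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation (see the repository linked above for that): the class `SlimmableLinear`, the list `WIDTH_MULTS`, and the toy weight initialization are hypothetical names and choices made for illustration. The sketch also assumes a fully connected layer with a fixed input size, whereas the paper slims channels throughout a convolutional network.

```python
# Illustrative sketch only: hypothetical names, not the authors' API.
WIDTH_MULTS = [0.25, 0.5, 1.0]  # assumed set of switchable widths


class SlimmableLinear:
    """Linear layer whose active output width can be switched at runtime."""

    def __init__(self, in_features, max_out_features):
        self.max_out = max_out_features
        # One shared weight matrix; narrower widths reuse its first rows.
        # Deterministic toy values stand in for trained weights.
        self.weight = [[0.01 * (i + j + 1) for j in range(in_features)]
                       for i in range(max_out_features)]
        # Switchable batch norm: separate statistics for every width,
        # so each width normalizes with its own mean and variance.
        self.bn_stats = {
            w: {"mean": [0.0] * self.active_out(w),
                "var": [1.0] * self.active_out(w)}
            for w in WIDTH_MULTS
        }
        self.width_mult = 1.0  # currently selected width

    def active_out(self, width_mult):
        """Number of output units used at the given width multiplier."""
        return max(1, int(self.max_out * width_mult))

    def forward(self, x):
        n_out = self.active_out(self.width_mult)
        stats = self.bn_stats[self.width_mult]
        out = []
        for i in range(n_out):
            z = sum(w * v for w, v in zip(self.weight[i], x))
            # Inference-style normalization with the width-specific stats.
            out.append((z - stats["mean"][i]) / (stats["var"][i] + 1e-5) ** 0.5)
        return out


layer = SlimmableLinear(in_features=4, max_out_features=8)
layer.width_mult = 0.25
narrow = layer.forward([1.0, 2.0, 3.0, 4.0])  # uses the first 2 of 8 units
layer.width_mult = 1.0
full = layer.forward([1.0, 2.0, 3.0, 4.0])    # uses all 8 units
```

Switching `width_mult` changes only which slice of the shared weights is used and which set of normalization statistics is applied, so no second model has to be loaded to change the accuracy-efficiency trade-off.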

citation-role summary

background: 1 · dataset: 1

years

2026: 4 · 2024: 1

verdicts

unverdicted: 5

representative citing papers

NaviSlim: Adaptive Context-Aware Navigation and Sensing via Dynamic Slimmable Networks

cs.RO · 2024-05-16 · unverdicted · novelty 6.0

NaviSlim uses a gated slimmable architecture to dynamically scale neural model complexity and onboard sensor power for context-aware navigation in micro-drones, reporting 57-92% average model reduction and 61-80% sensor utilization in AirSim simulations versus static full-complexity baselines.

Elastic Attention Cores for Scalable Vision Transformers

cs.CV · 2026-05-12 · unverdicted · novelty 6.0

VECA learns effective visual representations using core-periphery attention where patches interact exclusively via a resolution-invariant set of learned core embeddings, achieving linear O(N) complexity while maintaining competitive performance.
