HASTE: A Framework for Training-Free, Dynamic, and Steerable Compression of Pre-Trained Convolutional Neural Networks

Alexandru Paul Condurache; Jens Mehnert; Lukas Meiner

arxiv: 2606.30516 · v1 · pith:FRSP3A5Znew · submitted 2026-06-29 · 💻 cs.CV

Lukas Meiner, Jens Mehnert, Alexandru Paul Condurache

Pith reviewed 2026-06-30 06:24 UTC · model grok-4.3

classification 💻 cs.CV

keywords CNN compression · training-free inference · dynamic compression · locality-sensitive hashing · channel merging · ResNet · CIFAR-10 · ImageNet

The pith

HASTE uses locality-sensitive hashing to merge redundant channels patch-wise in pre-trained CNNs at inference, enabling dynamic compression without any retraining or data access.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces HASTE as a plug-and-play module that replaces existing convolutional layers of pre-trained networks. At runtime it applies locality-sensitive hashing to feature-map patches to detect and merge similar channels, which shrinks both the input depth and the matching filter weights and so makes the underlying matrix multiplications cheaper. Experiments report a 46.2 percent FLOPs reduction on ResNet-34 with CIFAR-10 at a 1.25 percent accuracy drop, with further results across several architectures on ImageNet. This matters because the method removes the usual requirement for fine-tuning, or even for access to the original training set, so large pre-trained models can run on constrained hardware without extra training cost.
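To make the mechanism concrete, here is a minimal sketch of patch-wise channel grouping with a sign-random-projection hash. The hash family, the mean-centring step, the averaging rule, and every name (`hash_channel`, `group_channels`, `merge_channels`, `num_bits`) are assumptions made for illustration; this summary does not specify the hash or merge rule the paper actually uses.

```python
import random
from collections import defaultdict


def hash_channel(patch, hyperplanes):
    """Return one sign bit per random hyperplane for a flattened patch."""
    # Assumption: centre the patch so the code reflects its pattern, not its offset.
    mean = sum(patch) / len(patch)
    centred = [v - mean for v in patch]
    return tuple(
        1 if sum(h * v for h, v in zip(plane, centred)) >= 0 else 0
        for plane in hyperplanes
    )


def group_channels(patches, num_bits=4, seed=0):
    """Group channel indices whose patches receive the same hash code."""
    rng = random.Random(seed)
    dim = len(patches[0])
    hyperplanes = [
        [rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)
    ]
    buckets = defaultdict(list)
    for idx, patch in enumerate(patches):
        buckets[hash_channel(patch, hyperplanes)].append(idx)
    return list(buckets.values())


def merge_channels(patches, groups):
    """Average the patches of each group into one representative channel."""
    merged = []
    for group in groups:
        dim = len(patches[group[0]])
        merged.append(
            [sum(patches[c][i] for c in group) / len(group) for i in range(dim)]
        )
    return merged


# Three channels of one flattened 2x2 patch; channels 0 and 1 are identical.
patches = [
    [1.0, 2.0, 3.0, 4.0],
    [1.0, 2.0, 3.0, 4.0],
    [4.0, -1.0, 0.5, -2.0],
]
groups = group_channels(patches)
merged = merge_channels(patches, groups)
```

Identical channels always receive the same code, so they always land in the same group. Whether a dissimilar channel collides with them depends on the random hyperplanes and on `num_bits`.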

Core claim

HASTE is a plug-and-play convolution module that at inference time uses locality-sensitive hashing to identify and merge redundant channels of latent feature maps on a patch-wise basis; this simultaneously compresses the depth of both input features and their corresponding filters, resulting in computationally cheaper convolutions that require no retraining.
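Why compressing features and filters together can leave the output nearly unchanged follows from the linearity of convolution. The sketch below assumes, beyond what this summary states, that the channels in a hash bucket are replaced by their mean and that the matching filter slices are summed. The symbols $X_c$, $W_{o,c}$, $G$, and $\bar{X}_G$ are introduced here for illustration only.

```latex
% Output channel o of a standard convolution over input channels c = 1..C:
Y_o = \sum_{c=1}^{C} W_{o,c} * X_c

% Suppose hashing places a set G of channels in one bucket, with
% X_c \approx \bar{X}_G for every c \in G on the current patch, where
\bar{X}_G = \frac{1}{|G|} \sum_{c \in G} X_c

% By linearity, the |G| per-channel convolutions collapse into one:
\sum_{c \in G} W_{o,c} * X_c \;\approx\; \Big( \sum_{c \in G} W_{o,c} \Big) * \bar{X}_G
```

The approximation is exact when the channels in $G$ are identical on the patch, and its error grows with their spread around $\bar{X}_G$. That spread is the quantity the hashing step is meant to keep small.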

What carries the argument

HASTE module performing locality-sensitive hashing for patch-wise redundant channel merging that shrinks both feature-map depth and filter depth inside standard convolutions.

If this is right

Pre-trained CNNs can be deployed on resource-limited devices by swapping in HASTE modules at inference without any additional training step.
Compression level becomes adjustable at runtime by changing the hashing threshold or patch size.
The same channel-merging idea applies across multiple CNN families including ResNet and others tested on both CIFAR-10 and ImageNet.
The approach avoids any need for the original training data during the compression step.
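The link between merged channels and saved compute can be checked with simple arithmetic. The sketch below counts the multiply-accumulates of a dense convolution and ignores the overhead of hashing and of combining filters, so it gives an upper bound on the saving rather than a measured one. The layer sizes and function names are hypothetical.

```python
def conv_macs(c_in, c_out, kernel, out_h, out_w):
    """Multiply-accumulate count of a dense 2D convolution (bias ignored)."""
    return c_in * c_out * kernel * kernel * out_h * out_w


def relative_saving(c_in, c_merged, c_out=64, kernel=3, out_h=8, out_w=8):
    """Fraction of MACs saved when c_in input channels are merged to c_merged."""
    dense = conv_macs(c_in, c_out, kernel, out_h, out_w)
    merged = conv_macs(c_merged, c_out, kernel, out_h, out_w)
    return 1.0 - merged / dense


# Hypothetical layer: 64 input channels merged down to 40.
saving = relative_saving(c_in=64, c_merged=40)
```

Because the cost is linear in the input depth, the saving depends only on the ratio `c_merged / c_in`. Any runtime knob that changes how many channels collide therefore changes the compute budget directly.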

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The patch-wise channel merging may generalize to other layer types such as depth-wise convolutions or attention blocks if similar redundancy patterns exist.
Runtime steering could be driven by external signals like battery level or latency targets without retraining the underlying model.
Because the paper links the scheme to token merging in transformers, a unified merging framework across CNNs and ViTs becomes a natural next direction to test.

Load-bearing premise

Locality-sensitive hashing can reliably detect redundant channels on patches so that merging them keeps the network accurate enough without any fine-tuning.

What would settle it

A controlled run on ResNet-34 with CIFAR-10 in which HASTE produces either more than a 3 percent accuracy drop or less than 20 percent FLOPs reduction while still using the same pre-trained weights.
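The criterion above can be written as a small predicate. The thresholds come from the sentence above; the function and parameter names are hypothetical.

```python
def claim_is_refuted(accuracy_drop_pct, flops_reduction_pct,
                     max_drop_pct=3.0, min_reduction_pct=20.0):
    """True if a run loses too much accuracy or saves too few FLOPs."""
    return (accuracy_drop_pct > max_drop_pct
            or flops_reduction_pct < min_reduction_pct)


# The numbers reported in the abstract do not trip either condition.
reported = claim_is_refuted(accuracy_drop_pct=1.25, flops_reduction_pct=46.2)
```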

Original abstract

Deploying large convolutional neural networks (CNNs) on resource-constrained devices is challenging due to their high computational cost. While dynamic execution methods are promising, existing approaches for CNNs typically require specialized training or fine-tuning, limiting their effectiveness when applied to pre-trained models and requiring data access. To address this gap, we propose HASTE (Hashing for Tractable Efficiency), a plug-and-play convolution module that enables training-free, dynamic compression of large pre-trained CNNs. At inference time, HASTE uses locality-sensitive hashing to identify and merge redundant channels of latent feature maps on a patch-wise basis. This process simultaneously compresses the depth of both input features and their corresponding filters, resulting in computationally cheaper convolutions. We conduct extensive experiments on CIFAR-10 and ImageNet across a range of architectures, demonstrating a 46.2% FLOPs reduction in a ResNet34 on CIFAR-10 with only a 1.25% drop in accuracy, without any retraining. We support our claims by comprehensive ablation studies to validate our core design choices, an analysis of the method's properties and limitations, and a discussion that connects our channel merging scheme to the conceptually related task of token merging in Vision Transformers. Our results demonstrate that HASTE provides an effective solution for steerable compression of pre-trained CNNs at runtime, opening new possibilities for the deployment of efficient deep learning methods.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

HASTE shows a workable training-free route to dynamic CNN compression via patch-wise LSH channel merging, with reported numbers that look usable but need full verification.

The core claim is that a plug-and-play convolution module can merge redundant channels on the fly using locality-sensitive hashing, cutting FLOPs on pre-trained models without retraining or data access. This is the part that stands out: the specific patch-wise application inside CNN layers, plus the explicit tie to token merging from ViTs.

The experiments report a 46.2% FLOPs reduction on ResNet34 with CIFAR-10 at a 1.25% accuracy loss, plus runs on ImageNet and other architectures. They include ablations and a limitations section, which is more than many compression papers offer. That empirical package is the main strength.

The soft spot is the reliance on LSH parameters and the assumption that hashing reliably finds mergeable channels without hidden accuracy costs across models. The paper treats this as an empirical question and discusses it, but the results still rest on those choices rather than a parameter-free guarantee. Reproducibility will depend on whether the implementation details and hash settings are fully specified.

This is aimed at people working on edge deployment where retraining is off the table. A reader focused on practical inference optimization would find the method and the comparisons worth looking at.

I would send it for peer review. The gap it targets is real and the reported gains are large enough to justify checking the details.

Referee Report

0 major / 3 minor

Summary. The paper proposes HASTE, a plug-and-play convolution module that enables training-free, dynamic, and steerable compression of pre-trained CNNs. At inference, locality-sensitive hashing identifies and merges redundant channels in latent feature maps on a patch-wise basis, simultaneously reducing the depth of input features and corresponding filters to yield cheaper convolutions. Experiments across CIFAR-10 and ImageNet on multiple architectures report e.g. a 46.2% FLOPs reduction for ResNet34 on CIFAR-10 with a 1.25% accuracy drop and no retraining; the work includes ablation studies validating design choices, analysis of method properties and limitations, and a discussion relating the channel-merging scheme to token merging in Vision Transformers.

Significance. If the empirical results hold under scrutiny, the contribution is significant because it closes a practical gap: existing dynamic CNN execution methods typically require specialized training, fine-tuning, or data access, whereas HASTE operates on frozen pre-trained models with no data or gradient updates. The steerable and dynamic nature at runtime, combined with the explicit connection to token merging, adds conceptual value. Credit is given for the extensive benchmark coverage, ablation studies, and the training-free claim being directly supported by the reported measurements rather than fitted parameters.

minor comments (3)

[Abstract] The phrase 'comprehensive ablation studies' would benefit from a brief enumeration of the specific design choices ablated (e.g., hash-function count, patch size, merging threshold) to allow readers to assess coverage immediately.
The description of the LSH-based merging procedure would be clearer if the precise definition of a 'patch' (spatial support and stride) and the exact criterion for declaring two channels redundant were stated in a single equation or pseudocode block early in the methods.
Figure captions and axis labels should explicitly state whether reported accuracy/FLOPs numbers are means over multiple random seeds or single runs; this is a minor but recurring clarity issue for reproducibility.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive review, accurate summary of our contributions, and recommendation of minor revision. The significance assessment aligns with our claims regarding the training-free operation on frozen models.

Circularity Check

0 steps flagged

No significant circularity detected

The paper introduces HASTE as an empirical plug-and-play module relying on locality-sensitive hashing for patch-wise channel merging in pre-trained CNNs. All central claims (FLOPs reduction, accuracy preservation) are supported by experimental results on CIFAR-10 and ImageNet, ablations, and architecture-specific measurements rather than any derivation, equation, or parameter fit that reduces to its own inputs by construction. No self-citation chains, uniqueness theorems, or ansatzes are invoked as load-bearing premises. The method is self-contained against external benchmarks via reported empirical outcomes.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 1 invented entity

The central claim depends on the empirical effectiveness of LSH for identifying mergeable channels; no explicit free parameters or invented physical entities are stated in the abstract.

free parameters (1)

LSH parameters (number of hash functions, bucket thresholds)
Control similarity detection and merging decisions; values not reported in abstract.
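How the number of hash functions controls merging can be illustrated with the textbook collision probability of sign-random-projection hashing: two vectors at angle θ agree on one bit with probability 1 − θ/π, and on all of k independent bits with that value raised to the power k. This hash family is an assumption here, since the summary does not name the one used by the paper. The function name is hypothetical.

```python
import math


def collision_probability(cosine_similarity, num_bits):
    """Chance that two vectors share all sign bits under random hyperplanes."""
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos = max(-1.0, min(1.0, cosine_similarity))
    angle = math.acos(cos)
    return (1.0 - angle / math.pi) ** num_bits


# More bits make accidental merges of dissimilar channels rarer.
loose = collision_probability(0.5, num_bits=2)
strict = collision_probability(0.5, num_bits=8)
```

Under this model, identical channels always collide, while adding bits suppresses merges of weakly similar channels. This is one plausible reading of how such parameters steer the trade-off between compression and accuracy.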

axioms (1)

domain assumption: Locality-sensitive hashing preserves enough similarity structure in CNN feature channels to allow safe merging
Invoked by the core merging step described in the abstract.

invented entities (1)

HASTE convolution module · no independent evidence
purpose: Plug-and-play component that performs dynamic channel merging
New module introduced by the paper; no independent evidence outside the reported experiments.

Reference graph

Works this paper leans on

67 extracted references · 38 canonical work pages · 2 internal anchors

[1]

In: Proceedings of the 20th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, pp

Meiner, L., Mehnert, J., Condurache, A.: Data-Free Dynamic Compression of CNNs for Tractable Efficiency. In: Proceedings of the 20th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, pp. 196–208. SCITEPRESS - Science and Technology Publications, S´ etubal, Portugal (2025). https://doi.org/10.5220/001...

work page doi:10.5220/0013301000003912 2025
[2]

In: IEEE/CVF Conference on Com- puter Vision and Pattern Recognition (CVPR), pp

Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: MobileNetV2: Inverted Residuals and Linear Bottlenecks. In: IEEE/CVF Conference on Com- puter Vision and Pattern Recognition (CVPR), pp. 4510–4520. IEEE Computer Society, Los Alamitos, CA, USA (2018). https://doi.org/10.1109/CVPR.2018. 00474

work page doi:10.1109/cvpr.2018 2018
[3]

In: Pro- ceedings of the 38th International Conference on Machine Learning

Tan, M., Le, Q.: EfficientNetV2: Smaller Models and Faster Training. In: Pro- ceedings of the 38th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 139, pp. 10096–10106. PMLR, Online (2021)

2021
[4]

In: Proceedings of the European Confer- ence on Computer Vision (ECCV), pp

Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design. In: Proceedings of the European Confer- ence on Computer Vision (ECCV), pp. 122–138 (2018). https://doi.org/10.1007/ 978-3-030-01264-9 8 24

2018
[5]

In: International Conference on Learning Representations (ICLR) (2015)

Simonyan, K., Zisserman, A.: Very Deep Convolutional Networks for Large-Scale Image Recognition. In: International Conference on Learning Representations (ICLR) (2015)

2015
[6]

In: Proc

He, K., Zhang, X., Ren, S., Sun, J.: Deep Residual Learning for Image Recogni- tion. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778 (2016). https://doi.org/10.1109/CVPR.2016.90

work page doi:10.1109/cvpr.2016.90 2016
[7]

In: International Conference on Learning Representations (ICLR) (2021)

Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., Uszkoreit, J., Houlsby, N.: An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. In: International Conference on Learning Representations (ICLR) (2021)

2021
[8]

In: International Joint Conference on Neural Networks (IJCNN), pp

Wimmer, P., Mehnert, J., Condurache, A.: COPS: Controlled Pruning Before Training Starts. In: International Joint Conference on Neural Networks (IJCNN), pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9533582

work page doi:10.1109/ijcnn52387.2021.9533582 2021
[9]

In: International Conference on Learning Representations (ICLR) (2016)

Han, S., Mao, H., Dally, W.J.: Deep Compression: Compressing Deep Neural Net- work with Pruning, Trained Quantization and Huffman Coding. In: International Conference on Learning Representations (ICLR) (2016)

2016
[10]

Rope3D: The Roadside Perception Dataset for Autonomous Driving and Monocular 3D Object Detection Task,

Wimmer, P., Mehnert, J., Condurache, A.: Interspace Pruning: Using Adaptive Filter Representations To Improve Training of Sparse CNNs. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 12527– 12537 (2022). https://doi.org/10.1109/CVPR52688.2022.01220

work page doi:10.1109/cvpr52688.2022.01220 2022
[11]

Anwar, S., Hwang, K., Sung, W.: Structured Pruning of Deep Convolutional Neural Networks. J. Emerg. Technol. Comput. Syst.13(3) (2017) https://doi. org/10.1145/3005348

work page doi:10.1145/3005348 2017
[12]

In: International Conference on Learning Representations (ICLR) (2017)

Li, H., Kadav, A., Durdanovic, I., Samet, H., Graf, H.P.: Pruning Filters for Efficient ConvNets. In: International Conference on Learning Representations (ICLR) (2017)

2017
[13]

Pattern Recognition115, 107899 (2021) https://doi.org/10.1016/ J.PATCOG.2021.107899

Yeom, S.-K., Seegerer, P., Lapuschkin, S., Binder, A., Wiedemann, S., M¨ uller, K.-R., Samek, W.: Pruning by explaining: A novel criterion for deep neural net- work pruning. Pattern Recognition115, 107899 (2021) https://doi.org/10.1016/ J.PATCOG.2021.107899

work page arXiv 2021
[14]

In: IEEE International Con- ference on Computer Vision (ICCV), pp

Liu, Z., Li, J., Shen, Z., Huang, G., Yan, S., Zhang, C.: Learning Efficient Convolutional Networks through Network Slimming. In: IEEE International Con- ference on Computer Vision (ICCV), pp. 2755–2763. IEEE Computer Society, Los Alamitos, CA, USA (2017). https://doi.org/10.1109/ICCV.2017.298

work page doi:10.1109/iccv.2017.298 2017
[15]

In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp

He, Y., Liu, P., Wang, Z., Hu, Z., Yang, Y.: Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4335–4344 (2019). 25 https://doi.org/10.1109/CVPR.2019.00447

work page doi:10.1109/cvpr.2019.00447 2019
[16]

In: Pro- ceedings of the 32nd International Conference on Neural Information Processing Systems, pp

Zhuang, Z., Tan, M., Zhuang, B., Liu, J., Guo, Y., Wu, Q., Huang, J., Zhu, J.: Discrimination-Aware Channel Pruning for Deep Neural Networks. In: Pro- ceedings of the 32nd International Conference on Neural Information Processing Systems, pp. 883–894 (2018)

2018
[17]

In: International Joint Conference on Neural Networks (IJCNN), pp

Xu, Z., Sun, J., Liu, Y., Sun, G.: An Efficient Channel-level Pruning for CNNs without Fine-tuning. In: International Joint Conference on Neural Networks (IJCNN), pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9533397

work page doi:10.1109/ijcnn52387.2021.9533397 2021
[18]

In: Proceedings of the Asian Conference on Computer Vision (ACCV), pp

Cakaj, R., Mehnert, J., Yang, B.: CNN Mixture-of-Depths. In: Proceedings of the Asian Conference on Computer Vision (ACCV), pp. 3480–3498 (2024). https: //doi.org/10.1007/978-981-96-0963-5 9

work page doi:10.1007/978-981-96-0963-5 2024
[19]

In: International Conference on Learning Representations (ICLR) (2020)

Bejnordi, B.E., Blankevoort, T., Welling, M.: Batch-Shaping for Learning Con- ditional Channel Gated Networks. In: International Conference on Learning Representations (ICLR) (2020)

2020
[20]

In: Advances in Neural Information Processing Systems, vol

Hua, W., Zhou, Y., De Sa, C., Zhang, Z., Suh, G.E.: Channel Gating Neural Networks. In: Advances in Neural Information Processing Systems, vol. 32 (2019)

2019
[21]

URL https : / / openaccess

Verelst, T., Tuytelaars, T.: Dynamic Convolutions: Exploiting Spatial Sparsity for Faster Inference. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2317–2326. IEEE Computer Society, Los Alamitos, CA, USA (2020). https://doi.org/10.1109/CVPR42600.2020.00239

work page doi:10.1109/cvpr42600.2020.00239 2020
[22]

Caron, H

Li, F., Li, G., He, X., Cheng, J.: Dynamic Dual Gating Neural Networks. In: IEEE/CVF International Conference on Computer Vision (ICCV), pp. 5310–5319 (2021). https://doi.org/10.1109/ICCV48922.2021.00528

work page doi:10.1109/iccv48922.2021.00528 2021
[23]

In: International Conference on Learning Representations (ICLR) (2019)

Liu, L., Deng, L., Hu, X., Zhu, M., Li, G., Ding, Y., Xie, Y.: Dynamic Sparse Graph for Efficient Deep Learning. In: International Conference on Learning Representations (ICLR) (2019)

2019
[24]

Rope3D: The Roadside Perception Dataset for Autonomous Driving and Monocular 3D Object Detection Task,

Elkerdawy, S., Elhoushi, M., Zhang, H., Ray, N.: Fire Together Wire Together: A Dynamic Pruning Approach with Self-Supervised Mask Prediction. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 12444–12453 (2022). https://doi.org/10.1109/CVPR52688.2022.01213

work page doi:10.1109/cvpr52688.2022.01213 2022
[25]

arXiv preprint arXiv:2505.03254 (2025) https://doi.org/10.48550/arXiv.2505.03254

Meiner, L., Mehnert, J., Condurache, A.P.: PROM: Prioritize Reduction of Multiplications Over Lower Bit-Widths for Efficient CNNs. arXiv preprint arXiv:2505.03254 (2025) https://doi.org/10.48550/arXiv.2505.03254

work page doi:10.48550/arxiv.2505.03254 2025
[26]

In: ECCV (2022)

Kim, H.-B., Park, E., Yoo, S.: BASQ: Branch-wise Activation-clipping Search Quantization for Sub-4-bit Neural Networks. In: ECCV (2022). https://doi.org/ 10.1007/978-3-031-19775-8 2 26

work page doi:10.1007/978-3-031-19775-8 2022
[27]

In: IEEE/CVF Con- ference on Computer Vision and Pattern Recognition (CVPR), pp

Zhu, F., Gong, R., Yu, F., Liu, X., Wang, Y., Li, Z., Yang, X., Yan, J.: Towards Unified INT8 Training for Convolutional Neural Network. In: IEEE/CVF Con- ference on Computer Vision and Pattern Recognition (CVPR), pp. 1966–1976 (2020)

1966
[28]

In: Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR), pp

Berisha, U., Mehnert, J., Condurache, A.P.: Efficient Data Driven Mixture-of- Expert Extraction from Trained Networks. In: Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR), pp. 20082–20091 (2025)

2025
[29]

In: International Conference on Learning Representations (ICLR) (2017)

Shazeer, N., Mirhoseini, A., Maziarz, K., Davis, A., Le, Q.V., Hinton, G.E., Dean, J.: Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer. In: International Conference on Learning Representations (ICLR) (2017)

2017
[30]

Fedus, W., Zoph, B., Shazeer, N.: Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity. J. Mach. Learn. Res.23 (2022)

2022
[31]

arXiv preprint arXiv:2308.14711 (2023) https://doi.org/10.48550/arXiv.2308.14711

Belcak, P., Wattenhofer, R.: Fast Feedforward Networks. arXiv preprint arXiv:2308.14711 (2023) https://doi.org/10.48550/arXiv.2308.14711

work page doi:10.48550/arxiv.2308.14711 2023
[33]

Inter- national Journal of Computer Vision129(6), 1789–1819 (2021) https://doi.org/ 10.1007/S11263-021-01453-Z

Gou, J., Yu, B., Maybank, S.J., Tao, D.: Knowledge distillation: A survey. Inter- national Journal of Computer Vision129(6), 1789–1819 (2021) https://doi.org/ 10.1007/S11263-021-01453-Z

work page doi:10.1007/s11263-021-01453-z 2021
[34]

In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp

Dong, X., Huang, J., Yang, Y., Yan, S.: More is Less: A More Complicated Net- work with Less Inference Complexity. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1895–1903. IEEE Computer Society, Los Alamitos, CA, USA (2017). https://doi.org/10.1109/CVPR.2017.205

work page doi:10.1109/cvpr.2017.205 1903
[35]

In: International Conference on Learning Representations (ICLR) (2019)

Gao, X., Zhao, Y., Dudziak, L., Mullins, R., Xu, C.-Z.: Dynamic Channel Prun- ing: Feature Boosting and Suppression. In: International Conference on Learning Representations (ICLR) (2019)

2019
[36]

In: Proceedings of the 30th International Conference on Neural Information Processing Systems, pp

Wen, W., Wu, C., Wang, Y., Chen, Y., Li, H.: Learning Structured Sparsity in Deep Neural Networks. In: Proceedings of the 30th International Conference on Neural Information Processing Systems, pp. 2082–2090 (2016)

2082
[37]

In: Computer Vision – ECCV 2018, pp

He, Y., Lin, J., Liu, Z., Wang, H., Li, L.-J., Han, S.: AMC: AutoML for Model Compression and Acceleration on Mobile Devices. In: Computer Vision – ECCV 2018, pp. 815–832 (2018). https://doi.org/10.1007/978-3-030-01234-2 48

work page doi:10.1007/978-3-030-01234-2 2018
[38]

In: 27 International Conference on Learning Representations (ICLR) (2014)

Goodfellow, I.J., Mirza, M., Da, X., Courville, A.C., Bengio, Y.: An Empirical Investigation of Catastrophic Forgeting in Gradient-Based Neural Networks. In: 27 International Conference on Learning Representations (ICLR) (2014)

2014
[39]

In: Proceedings of the Thirtieth Annual ACM Symposium on Theory of Computing

Indyk, P., Motwani, R.: Approximate Nearest Neighbors: Towards Removing the Curse of Dimensionality. In: Proceedings of the Thirtieth Annual ACM Symposium on Theory of Computing. STOC ’98, pp. 604–613. Association for Computing Machinery, New York, NY, USA (1998). https://doi.org/10.1145/ 276698.276876

work page arXiv 1998
[40]

URL https : / / openaccess

Yin, H., Molchanov, P., Alvarez, J.M., Li, Z., Mallya, A., Hoiem, D., Jha, N.K., Kautz, J.: Dreaming to Distill: Data-free Knowledge Transfer via Deepin- version. In: IEEE/CVF Conference on Computer Vision and Pattern Recogni- tion (CVPR), pp. 8715–8724 (2020). https://doi.org/10.1109/CVPR42600.2020. 00874

work page doi:10.1109/cvpr42600.2020 2020
[41]

IEEE Trans- actions on Pattern Analysis & Machine Intelligence45(03), 3664–3676 (2023) https://doi.org/10.1109/TPAMI.2022.3179616

Yvinec, E., Dapogny, A., Cord, M., Bailly, K.: RED++ : Data-Free Pruning of Deep Neural Networks via Input Splitting and Output Merging. IEEE Trans- actions on Pattern Analysis & Machine Intelligence45(03), 3664–3676 (2023) https://doi.org/10.1109/TPAMI.2022.3179616

work page doi:10.1109/tpami.2022.3179616 2023
[42]

InProceedings of the SIGGRAPH Asia 2025 Conference Papers (SA Conference Papers ’25)

Bai, S., Chen, J., Shen, X., Qian, Y., Liu, Y.: Unified Data-Free Compres- sion: Pruning and Quantization without Fine-Tuning. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 5876–5885 (2023). https://doi.org/10.1109/ICCV51070.2023.00540

work page doi:10.1109/iccv51070.2023.00540 2023
[43]

Journal of Computer and System Sciences , Year =

Achlioptas, D.: Database-friendly random projections: Johnson-Lindenstrauss with binary coins. Journal of Computer and System Sciences66(4), 671–687 (2003) https://doi.org/10.1016/S0022-0000(03)00025-4

work page doi:10.1016/s0022-0000(03)00025-4 2003
[44]

In: International Conference on Learning Representations (ICLR) (2020)

Kitaev, N., Kaiser, L., Levskaya, A.: Reformer: The Efficient Transformer. In: International Conference on Learning Representations (ICLR) (2020)

2020
[45]

In: Advances in Neural Information Processing Systems, vol

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is All you Need. In: Advances in Neural Information Processing Systems, vol. 30 (2017)

2017
[46]

In: Proceedings of Machine Learning and Systems, vol

Chen, B., Medini, T., Farwell, J., Gobriel, S., Tai, C., Shrivastava, A.: SLIDE : In Defense of Smart Algorithms over Hardware Acceleration for Large-Scale Deep Learning Systems. In: Proceedings of Machine Learning and Systems, vol. 2, pp. 291–306 (2020)

2020
[47]

In: International Conference on Learning Representations (ICLR) (2021)

Chen, B., Liu, Z., Peng, B., Xu, Z., Li, J.L., Dao, T., Song, Z., Shrivastava, A., Re, C.: MONGOOSE: A Learnable LSH Framework for Efficient Neural Net- work Training. In: International Conference on Learning Representations (ICLR) (2021)

2021
[48]

ACM Trans

M¨ uller, T., Evans, A., Schied, C., Keller, A.: Instant Neural Graphics Primitives with a Multiresolution Hash Encoding. ACM Trans. Graph.41(4) (2022) https: 28 //doi.org/10.1145/3528223.3530127

work page doi:10.1145/3528223.3530127 2022
[49]

2410.07299

Liu, Z., Coleman, B., Shrivastava, A.: Efficient Inference via Universal LSH Kernel. arXiv preprint arXiv:2106.11426 (2021) https://doi.org/10.48550/arXiv. 2106.11426

work page internal anchor Pith review doi:10.48550/arxiv 2021
[50]

Induced and reduced unbounded operator algebras

Liu, Z., Wang, P., Li, Z.: More-Similar-Less-Important: Filter Pruning VIA Kmeans Clustering. In: IEEE International Conference on Multimedia and Expo (ICME), pp. 1–6 (2021). https://doi.org/10.1109/ICME51207.2021.9428286

work page internal anchor Pith review Pith/arXiv arXiv doi:10.1109/icme51207.2021.9428286 2021
[51]

Rope3D: The Roadside Perception Dataset for Autonomous Driving and Monocular 3D Object Detection Task,

Yin, H., Vahdat, A., Alvarez, J., Mallya, A., Kautz, J., Molchanov, P.: A- ViT: Adaptive Tokens for Efficient Vision Transformer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2022). https://doi.org/10.1109/CVPR52688.2022.01054

work page doi:10.1109/cvpr52688.2022.01054 2022
[52]

In: Advances in Neural Information Processing Systems (NeurIPS) (2021)

Rao, Y., Zhao, W., Liu, B., Lu, J., Zhou, J., Hsieh, C.-J.: DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification. In: Advances in Neural Information Processing Systems (NeurIPS) (2021)

2021
[53]

In: International Conference on Learning Representations (ICLR) (2022)

Liang, Y., Ge, C., Tong, Z., Song, Y., Wang, J., Xie, P.: Not All Patches are What You Need: Expediting Vision Transformers via Token Reorganizations. In: International Conference on Learning Representations (ICLR) (2022)

2022
[54]

InProceedings of the SIGGRAPH Asia 2025 Conference Papers (SA Conference Papers ’25)

Chen, M., Shao, W., Xu, P., Lin, M., Zhang, K., Chao, F., Ji, R., Qiao, Y., Luo, P.: DiffRate: Differentiable Compression Rate for Efficient Vision Transformers. In: IEEE/CVF International Conference on Computer Vision (ICCV), pp. 17164– 17174 (2023). https://doi.org/10.1109/ICCV51070.2023.01574

work page doi:10.1109/iccv51070.2023.01574 2023
[55]

arXiv preprint arXiv:2505.15160 (2025) https://doi.org/10

Lee, J., Choi, D.-W.: Lossless Token Merging Even Without Fine-Tuning in Vision Transformers. arXiv preprint arXiv:2505.15160 (2025) https://doi.org/10. 48550/arXiv.2505.15160

work page arXiv 2025
[56]

In: IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pp

Kim, M., Gao, S., Hsu, Y.-C., Shen, Y., Jin, H.: Token Fusion: Bridging the Gap between Token Pruning and Token Merging. In: IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pp. 1372–1381. IEEE Computer Society, Los Alamitos, CA, USA (2024). https://doi.org/10.1109/WACV57701. 2024.00141

work page doi:10.1109/wacv57701 2024
[57]

In: International Conference on Learning Representations (ICLR) (2023)

[1] Meiner, L., Mehnert, J., Condurache, A.: Data-Free Dynamic Compression of CNNs for Tractable Efficiency. In: Proceedings of the 20th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, pp. 196–208. SCITEPRESS - Science and Technology Publications, Setúbal, Portugal (2025). https://doi.org/10.5220/0013301000003912

[2] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: MobileNetV2: Inverted Residuals and Linear Bottlenecks. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4510–4520. IEEE Computer Society, Los Alamitos, CA, USA (2018). https://doi.org/10.1109/CVPR.2018.00474

[3] Tan, M., Le, Q.: EfficientNetV2: Smaller Models and Faster Training. In: Proceedings of the 38th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 139, pp. 10096–10106. PMLR, Online (2021)

[4] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 122–138 (2018). https://doi.org/10.1007/978-3-030-01264-9_8

[5] Simonyan, K., Zisserman, A.: Very Deep Convolutional Networks for Large-Scale Image Recognition. In: International Conference on Learning Representations (ICLR) (2015)

[6] He, K., Zhang, X., Ren, S., Sun, J.: Deep Residual Learning for Image Recognition. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778 (2016). https://doi.org/10.1109/CVPR.2016.90

[7] Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., Uszkoreit, J., Houlsby, N.: An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. In: International Conference on Learning Representations (ICLR) (2021)

[8] Wimmer, P., Mehnert, J., Condurache, A.: COPS: Controlled Pruning Before Training Starts. In: International Joint Conference on Neural Networks (IJCNN), pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9533582

[9] Han, S., Mao, H., Dally, W.J.: Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding. In: International Conference on Learning Representations (ICLR) (2016)

[10] Wimmer, P., Mehnert, J., Condurache, A.: Interspace Pruning: Using Adaptive Filter Representations To Improve Training of Sparse CNNs. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 12527–12537 (2022). https://doi.org/10.1109/CVPR52688.2022.01220

[11] Anwar, S., Hwang, K., Sung, W.: Structured Pruning of Deep Convolutional Neural Networks. J. Emerg. Technol. Comput. Syst. 13(3) (2017) https://doi.org/10.1145/3005348

[12] Li, H., Kadav, A., Durdanovic, I., Samet, H., Graf, H.P.: Pruning Filters for Efficient ConvNets. In: International Conference on Learning Representations (ICLR) (2017)

[13] Yeom, S.-K., Seegerer, P., Lapuschkin, S., Binder, A., Wiedemann, S., Müller, K.-R., Samek, W.: Pruning by explaining: A novel criterion for deep neural network pruning. Pattern Recognition 115, 107899 (2021) https://doi.org/10.1016/J.PATCOG.2021.107899

[14] Liu, Z., Li, J., Shen, Z., Huang, G., Yan, S., Zhang, C.: Learning Efficient Convolutional Networks through Network Slimming. In: IEEE International Conference on Computer Vision (ICCV), pp. 2755–2763. IEEE Computer Society, Los Alamitos, CA, USA (2017). https://doi.org/10.1109/ICCV.2017.298

[15] He, Y., Liu, P., Wang, Z., Hu, Z., Yang, Y.: Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4335–4344 (2019). https://doi.org/10.1109/CVPR.2019.00447

[16] Zhuang, Z., Tan, M., Zhuang, B., Liu, J., Guo, Y., Wu, Q., Huang, J., Zhu, J.: Discrimination-Aware Channel Pruning for Deep Neural Networks. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems, pp. 883–894 (2018)

[17] Xu, Z., Sun, J., Liu, Y., Sun, G.: An Efficient Channel-level Pruning for CNNs without Fine-tuning. In: International Joint Conference on Neural Networks (IJCNN), pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9533397

[18] Cakaj, R., Mehnert, J., Yang, B.: CNN Mixture-of-Depths. In: Proceedings of the Asian Conference on Computer Vision (ACCV), pp. 3480–3498 (2024). https://doi.org/10.1007/978-981-96-0963-5_9

[19] Bejnordi, B.E., Blankevoort, T., Welling, M.: Batch-Shaping for Learning Conditional Channel Gated Networks. In: International Conference on Learning Representations (ICLR) (2020)

[20] Hua, W., Zhou, Y., De Sa, C., Zhang, Z., Suh, G.E.: Channel Gating Neural Networks. In: Advances in Neural Information Processing Systems, vol. 32 (2019)

[21] Verelst, T., Tuytelaars, T.: Dynamic Convolutions: Exploiting Spatial Sparsity for Faster Inference. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2317–2326. IEEE Computer Society, Los Alamitos, CA, USA (2020). https://doi.org/10.1109/CVPR42600.2020.00239

[22] Li, F., Li, G., He, X., Cheng, J.: Dynamic Dual Gating Neural Networks. In: IEEE/CVF International Conference on Computer Vision (ICCV), pp. 5310–5319 (2021). https://doi.org/10.1109/ICCV48922.2021.00528

[23] Liu, L., Deng, L., Hu, X., Zhu, M., Li, G., Ding, Y., Xie, Y.: Dynamic Sparse Graph for Efficient Deep Learning. In: International Conference on Learning Representations (ICLR) (2019)

[24] Elkerdawy, S., Elhoushi, M., Zhang, H., Ray, N.: Fire Together Wire Together: A Dynamic Pruning Approach with Self-Supervised Mask Prediction. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 12444–12453 (2022). https://doi.org/10.1109/CVPR52688.2022.01213

[25] Meiner, L., Mehnert, J., Condurache, A.P.: PROM: Prioritize Reduction of Multiplications Over Lower Bit-Widths for Efficient CNNs. arXiv preprint arXiv:2505.03254 (2025) https://doi.org/10.48550/arXiv.2505.03254

[26] Kim, H.-B., Park, E., Yoo, S.: BASQ: Branch-wise Activation-clipping Search Quantization for Sub-4-bit Neural Networks. In: ECCV (2022). https://doi.org/10.1007/978-3-031-19775-8_2

[27] Zhu, F., Gong, R., Yu, F., Liu, X., Wang, Y., Li, Z., Yang, X., Yan, J.: Towards Unified INT8 Training for Convolutional Neural Network. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1966–1976 (2020)

[28] Berisha, U., Mehnert, J., Condurache, A.P.: Efficient Data Driven Mixture-of-Expert Extraction from Trained Networks. In: Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR), pp. 20082–20091 (2025)

[29] Shazeer, N., Mirhoseini, A., Maziarz, K., Davis, A., Le, Q.V., Hinton, G.E., Dean, J.: Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer. In: International Conference on Learning Representations (ICLR) (2017)

[30] Fedus, W., Zoph, B., Shazeer, N.: Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity. J. Mach. Learn. Res. 23 (2022)

[31] Belcak, P., Wattenhofer, R.: Fast Feedforward Networks. arXiv preprint arXiv:2308.14711 (2023) https://doi.org/10.48550/arXiv.2308.14711

[33] Gou, J., Yu, B., Maybank, S.J., Tao, D.: Knowledge distillation: A survey. International Journal of Computer Vision 129(6), 1789–1819 (2021) https://doi.org/10.1007/S11263-021-01453-Z

[34] Dong, X., Huang, J., Yang, Y., Yan, S.: More is Less: A More Complicated Network with Less Inference Complexity. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1895–1903. IEEE Computer Society, Los Alamitos, CA, USA (2017). https://doi.org/10.1109/CVPR.2017.205

[35] Gao, X., Zhao, Y., Dudziak, L., Mullins, R., Xu, C.-Z.: Dynamic Channel Pruning: Feature Boosting and Suppression. In: International Conference on Learning Representations (ICLR) (2019)

[36] Wen, W., Wu, C., Wang, Y., Chen, Y., Li, H.: Learning Structured Sparsity in Deep Neural Networks. In: Proceedings of the 30th International Conference on Neural Information Processing Systems, pp. 2082–2090 (2016)

[37] He, Y., Lin, J., Liu, Z., Wang, H., Li, L.-J., Han, S.: AMC: AutoML for Model Compression and Acceleration on Mobile Devices. In: Computer Vision – ECCV 2018, pp. 815–832 (2018). https://doi.org/10.1007/978-3-030-01234-2_48

[38] Goodfellow, I.J., Mirza, M., Da, X., Courville, A.C., Bengio, Y.: An Empirical Investigation of Catastrophic Forgetting in Gradient-Based Neural Networks. In: International Conference on Learning Representations (ICLR) (2014)

[39] Indyk, P., Motwani, R.: Approximate Nearest Neighbors: Towards Removing the Curse of Dimensionality. In: Proceedings of the Thirtieth Annual ACM Symposium on Theory of Computing. STOC ’98, pp. 604–613. Association for Computing Machinery, New York, NY, USA (1998). https://doi.org/10.1145/276698.276876

[40] Yin, H., Molchanov, P., Alvarez, J.M., Li, Z., Mallya, A., Hoiem, D., Jha, N.K., Kautz, J.: Dreaming to Distill: Data-free Knowledge Transfer via Deepinversion. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8715–8724 (2020). https://doi.org/10.1109/CVPR42600.2020.00874

[41] Yvinec, E., Dapogny, A., Cord, M., Bailly, K.: RED++: Data-Free Pruning of Deep Neural Networks via Input Splitting and Output Merging. IEEE Transactions on Pattern Analysis & Machine Intelligence 45(03), 3664–3676 (2023) https://doi.org/10.1109/TPAMI.2022.3179616

[42] Bai, S., Chen, J., Shen, X., Qian, Y., Liu, Y.: Unified Data-Free Compression: Pruning and Quantization without Fine-Tuning. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 5876–5885 (2023). https://doi.org/10.1109/ICCV51070.2023.00540

[43] Achlioptas, D.: Database-friendly random projections: Johnson-Lindenstrauss with binary coins. Journal of Computer and System Sciences 66(4), 671–687 (2003) https://doi.org/10.1016/S0022-0000(03)00025-4

[44] Kitaev, N., Kaiser, L., Levskaya, A.: Reformer: The Efficient Transformer. In: International Conference on Learning Representations (ICLR) (2020)

[45] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is All you Need. In: Advances in Neural Information Processing Systems, vol. 30 (2017)

[46] Chen, B., Medini, T., Farwell, J., Gobriel, S., Tai, C., Shrivastava, A.: SLIDE: In Defense of Smart Algorithms over Hardware Acceleration for Large-Scale Deep Learning Systems. In: Proceedings of Machine Learning and Systems, vol. 2, pp. 291–306 (2020)

[47] Chen, B., Liu, Z., Peng, B., Xu, Z., Li, J.L., Dao, T., Song, Z., Shrivastava, A., Re, C.: MONGOOSE: A Learnable LSH Framework for Efficient Neural Network Training. In: International Conference on Learning Representations (ICLR) (2021)

[48] Müller, T., Evans, A., Schied, C., Keller, A.: Instant Neural Graphics Primitives with a Multiresolution Hash Encoding. ACM Trans. Graph. 41(4) (2022) https://doi.org/10.1145/3528223.3530127

[49] Liu, Z., Coleman, B., Shrivastava, A.: Efficient Inference via Universal LSH Kernel. arXiv preprint arXiv:2106.11426 (2021) https://doi.org/10.48550/arXiv.2106.11426

[50] Liu, Z., Wang, P., Li, Z.: More-Similar-Less-Important: Filter Pruning VIA Kmeans Clustering. In: IEEE International Conference on Multimedia and Expo (ICME), pp. 1–6 (2021). https://doi.org/10.1109/ICME51207.2021.9428286

[51] Yin, H., Vahdat, A., Alvarez, J., Mallya, A., Kautz, J., Molchanov, P.: A-ViT: Adaptive Tokens for Efficient Vision Transformer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2022). https://doi.org/10.1109/CVPR52688.2022.01054

[52] Rao, Y., Zhao, W., Liu, B., Lu, J., Zhou, J., Hsieh, C.-J.: DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification. In: Advances in Neural Information Processing Systems (NeurIPS) (2021)

[53] Liang, Y., Ge, C., Tong, Z., Song, Y., Wang, J., Xie, P.: Not All Patches are What You Need: Expediting Vision Transformers via Token Reorganizations. In: International Conference on Learning Representations (ICLR) (2022)

[54] Chen, M., Shao, W., Xu, P., Lin, M., Zhang, K., Chao, F., Ji, R., Qiao, Y., Luo, P.: DiffRate: Differentiable Compression Rate for Efficient Vision Transformers. In: IEEE/CVF International Conference on Computer Vision (ICCV), pp. 17164–17174 (2023). https://doi.org/10.1109/ICCV51070.2023.01574

[55] Lee, J., Choi, D.-W.: Lossless Token Merging Even Without Fine-Tuning in Vision Transformers. arXiv preprint arXiv:2505.15160 (2025) https://doi.org/10.48550/arXiv.2505.15160

[56] Kim, M., Gao, S., Hsu, Y.-C., Shen, Y., Jin, H.: Token Fusion: Bridging the Gap between Token Pruning and Token Merging. In: IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pp. 1372–1381. IEEE Computer Society, Los Alamitos, CA, USA (2024). https://doi.org/10.1109/WACV57701.2024.00141

[57] Bolya, D., Fu, C.-Y., Dai, X., Zhang, P., Feichtenhofer, C., Hoffman, J.: Token Merging: Your ViT but Faster. In: International Conference on Learning Representations (ICLR) (2023)

[58] Gionis, A., Indyk, P., Motwani, R.: Similarity Search in High Dimensions via Hashing. In: Proceedings of the 25th International Conference on Very Large Data Bases. VLDB ’99, pp. 518–529. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA (1999)

[59] Li, P., Hastie, T., Church, K.: Very Sparse Random Projections. In: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. KDD ’06, vol. 2006, pp. 287–296 (2006). https://doi.org/10.1145/1150402.1150436

[60] Krizhevsky, A.: Learning Multiple Layers of Features from Tiny Images. University of Toronto (2009)

[61] Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., Berg, A., Fei-Fei, L.: ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision 115(3), 211–252 (2015) https://doi.org/10.1007/s11263-015-0816-y

[62] Phan, H.: PyTorch Models Trained on CIFAR-10 Dataset. https://doi.org/10.5281/zenodo.4431043. https://github.com/huyvnphan/PyTorch_CIFAR10

[63] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., Desmaison, A., Köpf, A., Yang, E.Z., DeVito, Z., Raison, M., Tejani, A., Chilamkurthy, S., Steiner, B., Fang, L., Bai, J., Chintala, S.: PyTorch: An Imperative Style, High-Performance Deep Learning Library. In: Advances in Neural Information Processing Systems (2019)

[64] Zagoruyko, S., Komodakis, N.: Wide Residual Networks. In: Proceedings of the British Machine Vision Conference (BMVC) (2016)

[65] Belcak, P., Wattenhofer, R.: Exponentially Faster Language Modelling. arXiv preprint arXiv:2311.10770 (2023) https://doi.org/10.48550/arXiv.2311.10770

[66] Vogel, S., Schorn, C., Guntoro, A., Ascheid, G.: Guaranteed Compression Rate for Activations in CNNs using a Frequency Pruning Approach. In: Design, Automation & Test in Europe Conference & Exhibition (DATE), pp. 296–299 (2019). https://doi.org/10.23919/DATE.2019.8715210

[67] Zeiler, M.D., Fergus, R.: Visualizing and Understanding Convolutional Networks. In: European Conference on Computer Vision (ECCV), pp. 818–833 (2014). https://doi.org/10.1007/978-3-319-10590-1_53

[68] Meiner, L., Mehnert, J., Condurache, A.P.: Instant Complexity Reduction in CNNs using Locality-Sensitive Hashing. arXiv preprint arXiv:2309.17211 (2023) https://doi.org/10.48550/arXiv.2309.17211