
arxiv: 2606.30516 · v1 · pith:FRSP3A5Znew · submitted 2026-06-29 · 💻 cs.CV

HASTE: A Framework for Training-Free, Dynamic, and Steerable Compression of Pre-Trained Convolutional Neural Networks

Pith reviewed 2026-06-30 06:24 UTC · model grok-4.3

classification 💻 cs.CV
keywords CNN compression · training-free inference · dynamic compression · locality-sensitive hashing · channel merging · ResNet · CIFAR-10 · ImageNet

The pith

HASTE uses locality-sensitive hashing to merge redundant channels patch-wise in pre-trained CNNs at inference, enabling dynamic compression without any retraining or data access.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces HASTE as a plug-and-play module that inserts into existing convolutional layers of pre-trained networks. At runtime it applies locality-sensitive hashing to feature-map patches to detect and merge similar channels, which shrinks both the input depth and the matching filter weights, so each convolution becomes a smaller matrix multiplication. The abstract reports a 46.2 percent FLOPs reduction on ResNet-34 with CIFAR-10 at a 1.25 percent accuracy drop, and states that experiments cover a range of architectures on CIFAR-10 and ImageNet; it gives no ImageNet figures. A reader would care because the method removes the usual requirements for fine-tuning and for access to the original training set, which lowers the cost of running large models on constrained hardware.
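
The hashing step described above can be illustrated with a minimal sketch. It assumes sign-random-projection hashing, one common LSH family for angular similarity; this page does not state which hash family, patch size, or normalisation the paper uses, so `hash_channels`, `group_by_code`, `num_bits`, and the centring step are hypothetical choices for illustration only.

```python
import random

def hash_channels(channels, num_bits=4, seed=0):
    """Map each flattened channel of one patch to an integer LSH code.

    Assumption: sign-random-projection hashing; the paper's hash may differ.
    `channels` is a list of equal-length lists, one per channel.
    """
    dim = len(channels[0])
    rng = random.Random(seed)
    # One random hyperplane per hash bit, shared by all channels.
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]
    codes = []
    for ch in channels:
        mean = sum(ch) / dim
        centred = [v - mean for v in ch]  # hypothetical normalisation step
        code = 0
        for bit, plane in enumerate(planes):
            if sum(p * v for p, v in zip(plane, centred)) >= 0:
                code |= 1 << bit
        codes.append(code)
    return codes

def group_by_code(codes):
    """Collect channel indices that share a hash code (merge candidates)."""
    buckets = {}
    for idx, code in enumerate(codes):
        buckets.setdefault(code, []).append(idx)
    return list(buckets.values())

base = [0.5, -1.0, 2.0, 0.0, 1.5, -0.5, 1.0, -2.0, 0.25]  # one flattened 3x3 channel
scaled = [2.0 * v for v in base]                          # same pattern, larger gain
codes = hash_channels([base, scaled])
groups = group_by_code(codes)
```

Two channels that differ only by a positive gain fall on the same side of every hyperplane, so they share a code and become merge candidates. Unrelated channels may still collide by chance, which is the approximation the load-bearing premise depends on.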

Core claim

HASTE is a plug-and-play convolution module that at inference time uses locality-sensitive hashing to identify and merge redundant channels of latent feature maps on a patch-wise basis; this simultaneously compresses the depth of both input features and their corresponding filters, resulting in computationally cheaper convolutions that require no retraining.

What carries the argument

HASTE module performing locality-sensitive hashing for patch-wise redundant channel merging that shrinks both feature-map depth and filter depth inside standard convolutions.
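
Why merging input channels also shrinks the filter follows from linearity of convolution: if two channels in a patch are (nearly) identical, their filter slices can be summed and applied once. The sketch below assumes merged channels are averaged and the matching filter slices are summed. Both choices, and the names `dot`, `conv_at_patch`, and `merge_patch`, are illustrative assumptions, not the paper's confirmed procedure.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def conv_at_patch(channels, filt):
    """Output of one filter at one position: sum over channels of <x_c, w_c>."""
    return sum(dot(x, w) for x, w in zip(channels, filt))

def merge_patch(channels, filt, groups):
    """Average the channels in each group and sum the matching filter slices."""
    merged_x, merged_w = [], []
    for group in groups:
        n = len(group)
        dim = len(channels[group[0]])
        merged_x.append([sum(channels[c][i] for c in group) / n for i in range(dim)])
        merged_w.append([sum(filt[c][i] for c in group) for i in range(dim)])
    return merged_x, merged_w

# Three input channels over a flattened 2x2 patch; channels 0 and 1 are identical.
x = [[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0], [0.0, -1.0, 0.5, 2.0]]
w = [[0.5, 0.0, -1.0, 1.0], [1.0, 1.0, 0.0, -0.5], [2.0, 0.5, 0.5, 0.0]]

full = conv_at_patch(x, w)                         # uses 3 channels
mx, mw = merge_patch(x, w, groups=[[0, 1], [2]])
reduced = conv_at_patch(mx, mw)                    # uses 2 channels
```

When the grouped channels are exactly equal, `full` and `reduced` agree; when they are only similar, the difference is the approximation error that merging introduces.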

If this is right

  • Pre-trained CNNs can be deployed on resource-limited devices by swapping in HASTE modules at inference without any additional training step.
  • Compression level becomes adjustable at runtime by changing the hashing threshold or patch size.
  • The same channel-merging idea applies across multiple CNN families including ResNet and others tested on both CIFAR-10 and ImageNet.
  • The approach avoids any need for the original training data during the compression step.
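
The runtime-adjustable compression level can be sketched as follows. The bullet above names the hashing threshold and patch size as knobs; this example instead varies the number of hash bits, which is an assumption made for simplicity, and `bucket_count` is a hypothetical helper. Using a prefix of one fixed set of hyperplanes means that adding a bit can only split buckets, never join them.

```python
import random

def bucket_count(channels, num_bits, planes):
    """Number of distinct sign codes using only the first `num_bits` hyperplanes."""
    codes = set()
    for ch in channels:
        code = tuple(
            sum(p * v for p, v in zip(plane, ch)) >= 0 for plane in planes[:num_bits]
        )
        codes.add(code)
    return len(codes)

rng = random.Random(0)
dim, num_channels, max_bits = 9, 64, 8
# Synthetic stand-in for the channels of one patch (not real feature maps).
channels = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_channels)]
planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(max_bits)]

# Fewer bits -> coarser buckets -> more merging; more bits -> finer buckets.
counts = [bucket_count(channels, b, planes) for b in range(1, max_bits + 1)]
kept_fraction = [c / num_channels for c in counts]
```

`kept_fraction` is the share of input depth that would remain after merging each bucket into a single channel, so a controller could pick the bit count at inference time to trade accuracy for cost.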

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The patch-wise channel merging may generalize to other layer types such as depth-wise convolutions or attention blocks if similar redundancy patterns exist.
  • Runtime steering could be driven by external signals like battery level or latency targets without retraining the underlying model.
  • Because the paper links the scheme to token merging in transformers, a unified merging framework across CNNs and ViTs becomes a natural next direction to test.

Load-bearing premise

Locality-sensitive hashing can reliably detect redundant channels on patches so that merging them keeps the network accurate enough without any fine-tuning.

What would settle it

A controlled run on ResNet-34 with CIFAR-10 in which HASTE produces either more than a 3 percent accuracy drop or less than 20 percent FLOPs reduction while still using the same pre-trained weights.
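
The test above reduces to a two-sided check on measured numbers. A minimal sketch, assuming FLOPs scale with multiply-accumulate counts of standard convolutions and ignoring the cost of hashing itself; `conv_macs` and `refutes_claim` are hypothetical helpers, and the 64-to-40 channel example is made up for illustration.

```python
def conv_macs(c_in, c_out, k, h_out, w_out):
    """Multiply-accumulate count of a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k * h_out * w_out

def refutes_claim(acc_drop_pct, flops_reduction_pct, max_drop=3.0, min_reduction=20.0):
    """True if a run falls outside the bounds stated in the text above."""
    return acc_drop_pct > max_drop or flops_reduction_pct < min_reduction

# Hypothetical layer: merging shrinks the input depth from 64 to 40 channels.
baseline = conv_macs(64, 64, 3, 32, 32)
merged = conv_macs(40, 64, 3, 32, 32)
layer_reduction_pct = 100.0 * (1.0 - merged / baseline)

# Figures quoted from the abstract: 1.25% accuracy drop, 46.2% FLOPs reduction.
reported = refutes_claim(acc_drop_pct=1.25, flops_reduction_pct=46.2)
```

Because the cost of a convolution is linear in input depth, the per-layer saving equals the fraction of channels merged away; a whole-network figure would also have to account for hashing overhead and for layers left uncompressed.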

Original abstract

Deploying large convolutional neural networks (CNNs) on resource-constrained devices is challenging due to their high computational cost. While dynamic execution methods are promising, existing approaches for CNNs typically require specialized training or fine-tuning, limiting their effectiveness when applied to pre-trained models and requiring data access. To address this gap, we propose HASTE (Hashing for Tractable Efficiency), a plug-and-play convolution module that enables training-free, dynamic compression of large pre-trained CNNs. At inference time, HASTE uses locality-sensitive hashing to identify and merge redundant channels of latent feature maps on a patch-wise basis. This process simultaneously compresses the depth of both input features and their corresponding filters, resulting in computationally cheaper convolutions. We conduct extensive experiments on CIFAR-10 and ImageNet across a range of architectures, demonstrating a 46.2% FLOPs reduction in a ResNet34 on CIFAR-10 with only a 1.25% drop in accuracy, without any retraining. We support our claims by comprehensive ablation studies to validate our core design choices, an analysis of the method's properties and limitations, and a discussion that connects our channel merging scheme to the conceptually related task of token merging in Vision Transformers. Our results demonstrate that HASTE provides an effective solution for steerable compression of pre-trained CNNs at runtime, opening new possibilities for the deployment of efficient deep learning methods.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The paper proposes HASTE, a plug-and-play convolution module that enables training-free, dynamic, and steerable compression of pre-trained CNNs. At inference, locality-sensitive hashing identifies and merges redundant channels in latent feature maps on a patch-wise basis, simultaneously reducing the depth of input features and corresponding filters to yield cheaper convolutions. Experiments across CIFAR-10 and ImageNet on multiple architectures report e.g. a 46.2% FLOPs reduction for ResNet34 on CIFAR-10 with a 1.25% accuracy drop and no retraining; the work includes ablation studies validating design choices, analysis of method properties and limitations, and a discussion relating the channel-merging scheme to token merging in Vision Transformers.

Significance. If the empirical results hold under scrutiny, the contribution is significant because it closes a practical gap: existing dynamic CNN execution methods typically require specialized training, fine-tuning, or data access, whereas HASTE operates on frozen pre-trained models with no data or gradient updates. The steerable and dynamic nature at runtime, combined with the explicit connection to token merging, adds conceptual value. Credit is given for the extensive benchmark coverage, ablation studies, and the training-free claim being directly supported by the reported measurements rather than fitted parameters.

minor comments (3)
  1. [Abstract] The phrase 'comprehensive ablation studies' would benefit from a brief enumeration of the specific design choices ablated (e.g., hash-function count, patch size, merging threshold) so that readers can assess coverage immediately.
  2. The description of the LSH-based merging procedure would be clearer if the precise definition of a 'patch' (spatial support and stride) and the exact criterion for declaring two channels redundant were stated in a single equation or pseudocode block early in the methods.
  3. Figure captions and axis labels should explicitly state whether reported accuracy/FLOPs numbers are means over multiple random seeds or single runs; this is a minor but recurring clarity issue for reproducibility.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive review, accurate summary of our contributions, and recommendation of minor revision. The significance assessment aligns with our claims regarding the training-free operation on frozen models.

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper introduces HASTE as an empirical plug-and-play module relying on locality-sensitive hashing for patch-wise channel merging in pre-trained CNNs. All central claims (FLOPs reduction, accuracy preservation) are supported by experimental results on CIFAR-10 and ImageNet, ablations, and architecture-specific measurements rather than any derivation, equation, or parameter fit that reduces to its own inputs by construction. No self-citation chains, uniqueness theorems, or ansatzes are invoked as load-bearing premises. The method is self-contained against external benchmarks via reported empirical outcomes.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 1 invented entity

The central claim depends on the empirical effectiveness of LSH for identifying mergeable channels; no explicit free parameters or invented physical entities are stated in the abstract.

free parameters (1)
  • LSH parameters (number of hash functions, bucket thresholds)
    Control similarity detection and merging decisions; values not reported in abstract.
axioms (1)
  • [domain assumption] Locality-sensitive hashing preserves enough similarity structure in CNN feature channels to allow safe merging
    Invoked by the core merging step described in the abstract.
invented entities (1)
  • HASTE convolution module (no independent evidence)
    purpose: Plug-and-play component that performs dynamic channel merging
    New module introduced by the paper; no independent evidence outside the reported experiments.


