HASTE: A Framework for Training-Free, Dynamic, and Steerable Compression of Pre-Trained Convolutional Neural Networks
Pith reviewed 2026-06-30 06:24 UTC · model grok-4.3
The pith
HASTE uses locality-sensitive hashing to merge redundant channels patch-wise in pre-trained CNNs at inference, enabling dynamic compression without any retraining or data access.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
HASTE is a plug-and-play convolution module that at inference time uses locality-sensitive hashing to identify and merge redundant channels of latent feature maps on a patch-wise basis; this simultaneously compresses the depth of both input features and their corresponding filters, resulting in computationally cheaper convolutions that require no retraining.
What carries the argument
The HASTE module, which applies locality-sensitive hashing to merge redundant channels patch-wise and thereby shrinks both feature-map depth and filter depth inside standard convolutions.
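The merging step can be illustrated for a single patch. The sketch below is not the paper's implementation: it assumes a sign-random-projection hash, assumes that channels sharing a bucket are averaged while their filter slices are summed, and uses hypothetical names (`lsh_codes`, `merge_patch`, `conv_at_patch`). It relies only on the fact that a convolution is a sum over input channels, so near-identical channels can share one merged filter slice.

```python
import numpy as np

def lsh_codes(channels, hyperplanes):
    """Hash each row of `channels` (C, D) to an integer using K hyperplanes (K, D)."""
    bits = (channels @ hyperplanes.T) > 0                     # (C, K) sign bits
    weights = 2 ** np.arange(hyperplanes.shape[0], dtype=np.int64)
    return bits.astype(np.int64) @ weights                    # (C,) bucket codes

def merge_patch(patch, filters, hyperplanes):
    """Merge channels of one patch that land in the same hash bucket.

    patch:   (C, k, k) input patch; filters: (F, C, k, k).
    Returns a reduced patch (C', k, k) and reduced filters (F, C', k, k), C' <= C.
    """
    C = patch.shape[0]
    flat = patch.reshape(C, -1)
    _, bucket = np.unique(lsh_codes(flat, hyperplanes), return_inverse=True)
    bucket = bucket.ravel()
    n = int(bucket.max()) + 1
    onehot = np.zeros((n, C))
    onehot[bucket, np.arange(C)] = 1.0                        # bucket membership
    # Assumption: features in a bucket are averaged, their filter slices are summed.
    merged_patch = (onehot / onehot.sum(axis=1, keepdims=True)) @ flat
    merged_filters = np.einsum("bc,fcij->fbij", onehot, filters)
    return merged_patch.reshape(n, *patch.shape[1:]), merged_filters

def conv_at_patch(patch, filters):
    """Output of every filter at a single position (a plain dot product)."""
    return np.einsum("fcij,cij->f", filters, patch)

# Synthetic demo: 8 channels, of which 4 are near-duplicates of the other 4.
rng = np.random.default_rng(0)
base = rng.standard_normal((4, 3, 3))
patch = np.concatenate([base, base + 0.01 * rng.standard_normal((4, 3, 3))])
filters = rng.standard_normal((16, 8, 3, 3))
hyperplanes = rng.standard_normal((6, 9))                     # K=6 bits over 3*3 values

small_patch, small_filters = merge_patch(patch, filters, hyperplanes)
error = np.abs(conv_at_patch(small_patch, small_filters)
               - conv_at_patch(patch, filters)).max()
print(f"channels: {patch.shape[0]} -> {small_patch.shape[0]}, max abs error: {error:.4f}")
```

When the channels in a bucket are exactly equal, the reduced convolution reproduces the original output; otherwise the error grows with how dissimilar the merged channels are.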
If this is right
- Pre-trained CNNs can be deployed on resource-limited devices by swapping in HASTE modules at inference without any additional training step.
- Compression level becomes adjustable at runtime by changing the hashing threshold or patch size.
- The same channel-merging idea applies across multiple CNN families including ResNet and others tested on both CIFAR-10 and ImageNet.
- The approach avoids any need for the original training data during the compression step.
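To make the runtime-adjustability point concrete, the following sketch uses the number of hash bits K as a stand-in for the compression knob. That choice is an assumption for illustration; the source names a hashing threshold and patch size, and `bucket_counts` and `mac_saving` are hypothetical helpers. Hashing with the first K of a fixed set of hyperplanes can only refine the buckets as K grows, so fewer bits means coarser buckets and more merging.

```python
import numpy as np

def bucket_counts(channels, hyperplanes):
    """Number of distinct buckets when hashing with the first K hyperplanes, K = 1..K_max."""
    bits = ((channels @ hyperplanes.T) > 0).astype(np.int64)   # (C, K_max)
    counts = []
    for k in range(1, hyperplanes.shape[0] + 1):
        codes = bits[:, :k] @ (2 ** np.arange(k, dtype=np.int64))
        counts.append(int(np.unique(codes).size))
    return counts

def mac_saving(c_in, c_merged):
    """Fraction of multiply-accumulates saved at one position, ignoring hashing overhead."""
    return 1.0 - c_merged / c_in

# Synthetic channels only; real feature maps would show different redundancy.
rng = np.random.default_rng(0)
channels = rng.standard_normal((64, 9))        # 64 flattened 3x3 channels
hyperplanes = rng.standard_normal((8, 9))
for k, n in enumerate(bucket_counts(channels, hyperplanes), start=1):
    print(f"K={k}: {n} buckets, MAC saving {mac_saving(64, n):.2f}")
```

The saving shown is an upper bound on the per-position benefit, since it does not account for the cost of hashing and merging itself.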
Where Pith is reading between the lines
- The patch-wise channel merging may generalize to other layer types such as depth-wise convolutions or attention blocks if similar redundancy patterns exist.
- Runtime steering could be driven by external signals like battery level or latency targets without retraining the underlying model.
- Because the paper links the scheme to token merging in transformers, a unified merging framework across CNNs and ViTs becomes a natural next direction to test.
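A steering policy of the kind suggested above could be as small as the following sketch. Everything here is hypothetical: the function name `choose_num_bits`, the bit range, and the linear mapping are illustrative assumptions, and the sketch presumes a hash where fewer bits produce coarser buckets and therefore more merging.

```python
def choose_num_bits(battery_fraction, min_bits=4, max_bits=16):
    """Map a battery level in [0, 1] to a number of hash bits.

    Low battery -> fewer bits -> coarser buckets -> more aggressive merging.
    Values outside [0, 1] are clamped.
    """
    level = min(max(battery_fraction, 0.0), 1.0)
    return round(min_bits + level * (max_bits - min_bits))

for battery in (1.0, 0.5, 0.1):
    print(f"battery {battery:.0%}: use {choose_num_bits(battery)} hash bits")
```

The same shape of function would work for a latency target instead of a battery level; only the input signal changes, not the pre-trained weights.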
Load-bearing premise
Locality-sensitive hashing can reliably detect redundant channels on patches so that merging them keeps the network accurate enough without any fine-tuning.
What would settle it
A controlled run on ResNet-34 with CIFAR-10 in which HASTE produces either more than a 3 percent accuracy drop or less than 20 percent FLOPs reduction while still using the same pre-trained weights.
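The criterion above can be written as a simple check. This is a sketch with a hypothetical function name; the thresholds are the ones stated in the criterion, and the example inputs are the figures quoted in the abstract for ResNet34 on CIFAR-10.

```python
def settles_against(acc_drop_pct, flops_reduction_pct,
                    max_drop_pct=3.0, min_reduction_pct=20.0):
    """Return True if a run would count against the claim under the stated criterion."""
    return acc_drop_pct > max_drop_pct or flops_reduction_pct < min_reduction_pct

# Figures quoted in the abstract: 1.25% accuracy drop, 46.2% FLOPs reduction.
print(settles_against(acc_drop_pct=1.25, flops_reduction_pct=46.2))  # False
```

A replication would need to use the same pre-trained weights for the check to be meaningful.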
Original abstract
Deploying large convolutional neural networks (CNNs) on resource-constrained devices is challenging due to their high computational cost. While dynamic execution methods are promising, existing approaches for CNNs typically require specialized training or fine-tuning, limiting their effectiveness when applied to pre-trained models and requiring data access. To address this gap, we propose HASTE (Hashing for Tractable Efficiency), a plug-and-play convolution module that enables training-free, dynamic compression of large pre-trained CNNs. At inference time, HASTE uses locality-sensitive hashing to identify and merge redundant channels of latent feature maps on a patch-wise basis. This process simultaneously compresses the depth of both input features and their corresponding filters, resulting in computationally cheaper convolutions. We conduct extensive experiments on CIFAR-10 and ImageNet across a range of architectures, demonstrating a 46.2% FLOPs reduction in a ResNet34 on CIFAR-10 with only a 1.25% drop in accuracy, without any retraining. We support our claims by comprehensive ablation studies to validate our core design choices, an analysis of the method's properties and limitations, and a discussion that connects our channel merging scheme to the conceptually related task of token merging in Vision Transformers. Our results demonstrate that HASTE provides an effective solution for steerable compression of pre-trained CNNs at runtime, opening new possibilities for the deployment of efficient deep learning methods.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes HASTE, a plug-and-play convolution module that enables training-free, dynamic, and steerable compression of pre-trained CNNs. At inference, locality-sensitive hashing identifies and merges redundant channels in latent feature maps on a patch-wise basis, simultaneously reducing the depth of input features and corresponding filters to yield cheaper convolutions. Experiments on CIFAR-10 and ImageNet across multiple architectures report, for example, a 46.2% FLOPs reduction for ResNet34 on CIFAR-10 with a 1.25% accuracy drop and no retraining. The work also includes ablation studies validating design choices, an analysis of the method's properties and limitations, and a discussion relating the channel-merging scheme to token merging in Vision Transformers.
Significance. If the empirical results hold under scrutiny, the contribution is significant because it closes a practical gap: existing dynamic CNN execution methods typically require specialized training, fine-tuning, or data access, whereas HASTE operates on frozen pre-trained models with no data or gradient updates. The steerable and dynamic nature at runtime, combined with the explicit connection to token merging, adds conceptual value. Credit is given for the extensive benchmark coverage, ablation studies, and the training-free claim being directly supported by the reported measurements rather than fitted parameters.
minor comments (3)
- [Abstract] The phrase 'comprehensive ablation studies' would benefit from a brief enumeration of the specific design choices ablated (e.g., hash-function count, patch size, merging threshold) to allow readers to assess coverage immediately.
- The description of the LSH-based merging procedure would be clearer if the precise definition of a 'patch' (spatial support and stride) and the exact criterion for declaring two channels redundant were stated in a single equation or pseudocode block early in the methods.
- Figure captions and axis labels should explicitly state whether reported accuracy/FLOPs numbers are means over multiple random seeds or single runs; this is a minor but recurring clarity issue for reproducibility.
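As an illustration of the second comment, a single-equation redundancy criterion might take a form like the one below. The notation is hypothetical and not taken from the paper; it assumes a sign-random-projection hash with K hyperplanes and leaves the patch's spatial support and stride to be fixed by the authors.

```latex
% Hypothetical notation, not taken from the paper.
% x_c^{(p)} \in \mathbb{R}^{d}: flattened channel c of the input, restricted to patch p
% r_1, \dots, r_K \in \mathbb{R}^{d}: random hyperplane normals
\[
  h\bigl(x_c^{(p)}\bigr)
  = \Bigl(\mathbb{1}\bigl[\langle r_1, x_c^{(p)}\rangle > 0\bigr], \dots,
          \mathbb{1}\bigl[\langle r_K, x_c^{(p)}\rangle > 0\bigr]\Bigr)
  \in \{0,1\}^{K}
\]
\[
  \text{channels } i \text{ and } j \text{ are redundant on patch } p
  \iff h\bigl(x_i^{(p)}\bigr) = h\bigl(x_j^{(p)}\bigr)
\]
```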
Simulated Author's Rebuttal
We thank the referee for the positive review, accurate summary of our contributions, and recommendation of minor revision. The significance assessment aligns with our claims regarding the training-free operation on frozen models.
Circularity Check
No significant circularity detected
The paper introduces HASTE as an empirical plug-and-play module relying on locality-sensitive hashing for patch-wise channel merging in pre-trained CNNs. All central claims (FLOPs reduction, accuracy preservation) are supported by experimental results on CIFAR-10 and ImageNet, ablations, and architecture-specific measurements rather than any derivation, equation, or parameter fit that reduces to its own inputs by construction. No self-citation chains, uniqueness theorems, or ansatzes are invoked as load-bearing premises. The method is self-contained against external benchmarks via reported empirical outcomes.
Axiom & Free-Parameter Ledger
free parameters (1)
- LSH parameters (number of hash functions, bucket thresholds)
axioms (1)
- Domain assumption: locality-sensitive hashing preserves enough similarity structure in CNN feature channels to allow safe merging.
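For one common hash family, this assumption has a known quantitative form. With sign random projections (an assumption here; the paper's hash may differ), two vectors at angle θ agree on a single hash bit with probability 1 − θ/π, so they share all K independent bits with probability (1 − θ/π)^K. The helper name below is hypothetical.

```python
import math

def same_bucket_probability(angle_rad, num_bits):
    """Probability that two vectors at the given angle share all sign-hash bits.

    Assumes independent Gaussian random hyperplanes (sign random projection).
    """
    per_bit = 1.0 - angle_rad / math.pi
    return per_bit ** num_bits

for degrees in (5, 30, 90):
    p = same_bucket_probability(math.radians(degrees), num_bits=8)
    print(f"angle {degrees:>2} deg, K=8: P(same bucket) = {p:.3f}")
```

Nearly parallel channels are merged with high probability while dissimilar ones rarely collide, and increasing K sharpens that separation. Note that this hash depends only on direction, so channels that differ mainly in scale can still land in the same bucket.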
invented entities (1)
- HASTE convolution module (no independent evidence)