Zeroquant: Efficient and affordable post-training quantization for large-scale transformers. Advances in Neural Information Processing Systems, 35:27168–27183, 2022.
2 Pith papers cite this work.
citation-role summary: background 1
years: 2026 (2)
verdicts: UNVERDICTED 2
roles: background 1
polarities: background 1
representative citing papers
TStore reduces AI model storage via tensor-level fingerprinting, clustering, and compression without annotations while claiming to preserve usability.
citing papers explorer
- Fast Tensorization of Neural Networks via Slice-wise Feature Distillation
  A slice-wise feature distillation framework for independent tensorization of neural network slices to achieve scalable compression with reduced fine-tuning costs.
- TStore: Rethinking AI Model Hub with Tensor-Centric Compression
  TStore reduces AI model storage via tensor-level fingerprinting, clustering, and compression without annotations while claiming to preserve usability.