ZipNN: Lossless Compression for AI Models

Moshik Hershcovitch, Andrew Wood, Leshem Choshen, Guy Girmonsky, Roy Leibovitz, Ilias Ennmouri, Michal Malka, Peter Chin, Swaminathan Sundararaman, Danny Harnik · 2024 · arXiv 2411.05239

4 Pith papers cite this work. Polarity classification is still indexing.


citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

SplitZip: Ultra Fast Lossless KV Compression for Disaggregated LLM Serving

cs.DC · 2026-05-03 · unverdicted · novelty 7.0 · 2 refs

SplitZip is a new GPU-friendly lossless compressor for KV cache tensors that exploits exponent redundancy to achieve over 600 GB/s compression throughput and up to 1.32x faster transfers in disaggregated LLM serving.

ZipCCL: Efficient Lossless Data Compression of Communication Collectives for Accelerating LLM Training

cs.DC · 2026-04-30 · unverdicted · novelty 5.0

ZipCCL delivers up to 1.35x faster communication and 1.18x end-to-end speedup in LLM training through lossless compression of near-Gaussian collectives on 64-GPU clusters.

Distributed Generative Inference of LLM at Internet Scales with Multi-Dimensional Communication Optimization

cs.DC · 2026-04-22 · unverdicted · novelty 5.0

BloomBee is a distributed LLM inference system that achieves up to 1.76x higher throughput and 43.2% lower latency than prior decentralized systems by optimizing communication across multiple dimensions in low-bandwidth internet settings.

TStore: Rethinking AI Model Hub with Tensor-Centric Compression

cs.DC · 2026-04-18 · unverdicted · novelty 5.0 · 2 refs

TStore reduces AI model storage via tensor-level fingerprinting, clustering, and compression without annotations while claiming to preserve usability.

