Reasoning models naturally compress context via thinking traces, with reward-constrained optimization yielding 17-23% gains over baselines on long-context QA at high compression ratios.
No Mean Feat: Simple, Strong Baselines for Context Compression
3 Pith papers cite this work. Polarity classification is still indexing.
abstract
Context compression reduces Transformer inference costs by replacing lengthy inputs with shorter pre-computed representations. It carries significant benefits for retrieval-augmented generation (RAG) and has attracted growing research attention. However, progress remains difficult to measure due to inconsistent evaluations and baselines. We design a standard, easy-to-reproduce evaluation suite for context compression, BenchPress, along with simple, high-performance baselines for English reading comprehension. BenchPress supports benchmarking across model scales, datasets, compression ratios, and short ($<$1K tokens) to mid-range ($<$8K tokens) contexts. While the suite is applicable to any compression paradigm, our baselines target soft context compression. We establish two simple baselines that strongly outperform the widely used causal compression-token approach: mean pooling and a bidirectional compression-token variant. Our results show the benefit of bidirectional attention when computing compressed representations, and that simple pooling is an expressive compression operator.
representative citing papers
Modern text encoders resist second-order collapse under mean pooling because token embeddings concentrate tightly within texts, and this resistance correlates with stronger downstream performance.
Vision-based optical context compression performs no better than direct autoencoding baselines like mean pooling or hierarchical encoders across compression ratios.
citing papers explorer
-
Optical Context Compression Is Just (Bad) Autoencoding
Vision-based optical context compression performs no better than direct autoencoding baselines like mean pooling or hierarchical encoders across compression ratios.