LongGenBench: Benchmarking Long-Form Generation in Long Context

Wu, Yuhao; Hee, Ming Shan; Hu, Zhiqing; Lee, Roy Ka-Wei · 2024 · arXiv 2409.02076

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

VeriCache: Turning Lossy KV Cache into Lossless LLM Inference

cs.AR · 2026-05-17 · unverdicted · novelty 6.0

VeriCache turns lossy KV cache compression into lossless LLM inference by drafting with compressed cache and verifying drafts with full cache, achieving up to 4x throughput with identical outputs.

IceCache: Memory-efficient KV-cache Management for Long-Sequence LLMs

cs.LG · 2026-04-12 · unverdicted · novelty 6.0

IceCache combines semantic token clustering with PagedAttention to keep only 25% of the KV cache tokens while retaining 99% accuracy on LongBench and matching or beating prior offloading methods in latency.

Language models fail at extended rule following

cs.CL · 2026-05-03 · unverdicted · novelty 5.0 · 2 refs

LLMs fail at extended counting of repeated characters due to finite internal states, with abrupt errors persisting across model scales and inference methods.

citing papers explorer

Showing 3 of 3 citing papers.

VeriCache: Turning Lossy KV Cache into Lossless LLM Inference · cs.AR · 2026-05-17 · unverdicted · none · ref 67
IceCache: Memory-efficient KV-cache Management for Long-Sequence LLMs · cs.LG · 2026-04-12 · unverdicted · none · ref 10
Language models fail at extended rule following · cs.CL · 2026-05-03 · unverdicted · none · ref 9 · 2 links
