LongBench: A bilingual, multitask benchmark for long context understanding

Bai, Y · 2024

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

MoE-nD: Per-Layer Mixture-of-Experts Routing for Multi-Axis KV Cache Compression

cs.LG · 2026-04-20 · unverdicted · novelty 7.0

Per-layer mixture-of-experts routing selects heterogeneous eviction-quantization tuples for KV cache compression, matching uncompressed accuracy at 14x reduction on LongBench subsets where uniform baselines degrade.

Minimal-Intervention KV Retention via Set-Conditioned Diversity

cs.LG · 2026-05-14 · conditional · novelty 5.0

A minimal scoring modification to TriAttention using greedy facility-location selection with V-space redundancy penalty improves KV retention at budgets 64 and 128 on distilled reasoning models under matched-memory held-out evaluation.

