Consequently, it triggers the out-of-memory exception at an early stage (128K tokens)

YaRN (Peng et al. · 2023)

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Stacked from One: Multi-Scale Self-Injection for Context Window Extension

cs.CL · 2026-03-05 · unverdicted · novelty 6.0

SharedLLM stacks two copies of a short-context LLM so the lower one compresses context into query-aware multi-grained tokens that are injected only at the lowest layers of the upper one, enabling generalization from 8K training to 128K+ inputs.

citing papers explorer

Showing 1 of 1 citing paper.

Stacked from One: Multi-Scale Self-Injection for Context Window Extension · cs.CL · 2026-03-05 · unverdicted · none · ref 37
