LM-Infinite: Zero-Shot Extreme Length Generalization for Large Language Models

Chi Han, Qifan Wang, Hao Peng, Wenhan Xiong, Yu Chen, Heng Ji · 2024 · DOI 10.18653/v1/2024.naacl-long.222

4 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

baseline 1

citation-polarity summary

baseline 1

representative citing papers

Training-Inference Consistent Segmented Execution for Long-Context LLMs

cs.CL · 2026-05-12 · conditional · novelty 6.0

A training-inference consistent segmented execution framework for long-context LLMs matches full-context performance with substantially lower peak memory at very long lengths.

Stacked from One: Multi-Scale Self-Injection for Context Window Extension

cs.CL · 2026-03-05 · unverdicted · novelty 6.0

SharedLLM stacks two copies of a short-context LLM so the lower one compresses context into query-aware multi-grained tokens that are injected only at the lowest layers of the upper one, enabling generalization from 8K training to 128K+ inputs.

Titans: Learning to Memorize at Test Time

cs.LG · 2024-12-31 · unverdicted · novelty 6.0

Titans combine attention for current context with a learnable neural memory for long-term history, achieving better performance and scaling to over 2M-token contexts on language, reasoning, genomics, and time-series tasks.

MATCH: Modulating Attention via In-Context Retrieval for Long-Context Transformers

cs.CL · 2026-06-29 · unverdicted · novelty 4.0

MATCH augments sparsified attention with an efficient in-context retrieval system to boost performance on long-range recall tasks in transformers.

citing papers explorer

Training-Inference Consistent Segmented Execution for Long-Context LLMs cs.CL · 2026-05-12 · conditional · none · ref 46
Stacked from One: Multi-Scale Self-Injection for Context Window Extension cs.CL · 2026-03-05 · unverdicted · none · ref 15
Titans: Learning to Memorize at Test Time cs.LG · 2024-12-31 · unverdicted · none · ref 43
MATCH: Modulating Attention via In-Context Retrieval for Long-Context Transformers cs.CL · 2026-06-29 · unverdicted · none · ref 8
