Journal of Machine Learning Research , volume = 21, number = 140, pages =

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

browse 2 citing papers

representative citing papers

Combining On-Policy Optimization and Distillation for Long-Context Reasoning in Large Language Models

cs.CL · 2026-05-12 · unverdicted · novelty 7.0

dGRPO merges outcome-based policy optimization with dense teacher guidance from on-policy distillation, yielding more stable long-context reasoning on the new LongBlocks synthetic dataset.

Improving language models by retrieving from trillions of tokens

cs.CL · 2021-12-08 · unverdicted · novelty 7.0

RETRO matches GPT-3 and Jurassic-1 performance on the Pile benchmark using 25 times fewer parameters by conditioning on retrieved chunks from a 2-trillion-token database.

citing papers explorer

Showing 2 of 2 citing papers.

Combining On-Policy Optimization and Distillation for Long-Context Reasoning in Large Language Models cs.CL · 2026-05-12 · unverdicted · none · ref 54
dGRPO merges outcome-based policy optimization with dense teacher guidance from on-policy distillation, yielding more stable long-context reasoning on the new LongBlocks synthetic dataset.
Improving language models by retrieving from trillions of tokens cs.CL · 2021-12-08 · unverdicted · none · ref 67
RETRO matches GPT-3 and Jurassic-1 performance on the Pile benchmark using 25 times fewer parameters by conditioning on retrieved chunks from a 2-trillion-token database.

Journal of Machine Learning Research , volume = 21, number = 140, pages =

fields

years

verdicts

representative citing papers

citing papers explorer