Entropylong: Effective long-context training via predictive uncertainty. arXiv preprint arXiv:2510.02330
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
XekRung achieves state-of-the-art performance on cybersecurity benchmarks among same-scale models via tailored data synthesis and multi-stage training while retaining strong general capabilities.
citing papers explorer
- PolicyLong: Towards On-Policy Context Extension
  PolicyLong shifts long-context data synthesis to an on-policy loop that re-screens contexts using the evolving model's entropy landscape, producing a self-curriculum that outperforms static offline baselines with larger gains at longer lengths.
- XekRung Technical Report
  XekRung achieves state-of-the-art performance on cybersecurity benchmarks among same-scale models via tailored data synthesis and multi-stage training while retaining strong general capabilities.