OPD-Evolver uses on-policy self-distillation in fast interaction and slow attribution loops to build agents with holistic memory competence, outperforming prior systems by up to 11.5% and allowing a 9B model to compete with much larger ones.
arXiv preprint arXiv:2111.10657 , year=
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
years
2026 2verdicts
UNVERDICTED 2representative citing papers
Adding Bertz index regression and MMFF94 strain penalty as auxiliary losses to a GINE GNN yields small but statistically significant OOD AUC improvements (up to +0.0066) on natural products when trained on drug-like molecules.
citing papers explorer
-
OPD-Evolver: Cultivating Holistic Agent Evolver via On-Policy Distillation
OPD-Evolver uses on-policy self-distillation in fast interaction and slow attribution loops to build agents with holistic memory competence, outperforming prior systems by up to 11.5% and allowing a 9B model to compete with much larger ones.