OAR distills specialized generation orders from any-order AR models via self-distillation, improving FID from 2.39 to 2.17 on ImageNet 256x256 while preserving multi-task flexibility.
Title resolution pending
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
verdicts
UNVERDICTED 2representative citing papers
CDS-trained BabyLMs show earlier and more appropriate production in a new frame-completion task while FineWeb-edu models lead on comprehension benchmarks, indicating current tests underestimate CDS benefits.
citing papers explorer
-
Distilling Specialized Orders for Visual Generation
OAR distills specialized generation orders from any-order AR models via self-distillation, improving FID from 2.39 to 2.17 on ImageNet 256x256 while preserving multi-task flexibility.