This survey introduces the Generate-Filter-Control-Replay (GFCR) taxonomy to structure rollout pipelines for RL-based post-training of reasoning LLMs.
Rossi, Prithviraj Ammanabrolu, and Julian McAuley
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
PRISMR replaces in-context list processing with a hypernetwork-generated instance-specific LoRA adapter to reduce parse collapse and improve multimodal listwise ranking performance.
citing papers explorer
- PRISMR: Overcoming Parse Collapse in Multimodal Listwise Ranking via Parameterized Representation Internalization