Either an imperative sentence or a question is permitted

The instruction should be sequential or compositional, containing multiple steps, where each step is related to the previous steps.

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Understanding the Effects of RLHF on LLM Generalisation and Diversity

cs.LG · 2023-10-10 · unverdicted · novelty 6.0

RLHF improves OOD generalization over SFT especially under larger distribution shifts but reduces output diversity, revealing a tradeoff in LLM fine-tuning methods.

citing papers explorer

Showing 1 of 1 citing paper.

Understanding the Effects of RLHF on LLM Generalisation and Diversity
cs.LG · 2023-10-10 · unverdicted · none · ref 12
RLHF improves OOD generalization over SFT especially under larger distribution shifts but reduces output diversity, revealing a tradeoff in LLM fine-tuning methods.
