It typically contains a blend of sand, coconut coir, perlite, and peat moss or sphagnum moss

1 Pith paper cites this work. Polarity classification is still being indexed.

1 Pith paper citing it

fields: cs.LG (1)

years: 2024 (1)

verdicts: CONDITIONAL (1)

representative citing papers

RLHF Workflow: From Reward Modeling to Online RLHF

cs.LG · 2024-05-13 · conditional · novelty 5.0

The paper supplies a complete open-source recipe for online iterative RLHF that uses proxy preference models and reaches competitive performance on AlpacaEval-2, Arena-Hard, and MT-Bench.

citing papers explorer

Showing 1 of 1 citing paper.

  • RLHF Workflow: From Reward Modeling to Online RLHF cs.LG · 2024-05-13 · conditional · none · ref 15
