Pith Number

pith:AD4OU3S5

pith:2023:AD4OU3S57S5Y2GK4FCGKHPM5GD

not attested · not anchored · not stored · refs resolved

RLAIF vs. RLHF: Scaling Reinforcement Learning from Human Feedback with AI Feedback

Abhinav Rastogi, Colton Bishop, Ethan Hall, Harrison Lee, Hassan Mansoor, Johan Ferret, Kellie Lu, Samrat Phatale, Sushant Prakash, Thomas Mesnard, Victor Carbune

Reinforcement learning from AI feedback matches human feedback performance for aligning large language models.

arxiv:2309.00267 v3 · 2023-09-01 · cs.CL · cs.AI · cs.LG

Add to your LaTeX paper

\usepackage{pith}
\pithnumber{AD4OU3S57S5Y2GK4FCGKHPM5GD}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp

2 Internet Archive

3 Author claim open

4 Citations open

5 Replications open

✓ Portable graph bundle live

The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
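The merge algorithm itself is not specified on this page. As a rough illustration of why a deterministic merge lets any mirror reach the same state, the sketch below folds a set of events in an order that depends only on the events' content, so arrival order and duplicates do not matter. The event fields (`at`, `field`, `value`), the last-writer-wins rule, the sample events, and the function names are all hypothetical, and signature checking is omitted.

```python
import hashlib
import json


def event_id(event: dict) -> str:
    """Content hash of an event, used for de-duplication and tie-breaking."""
    blob = json.dumps(event, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(blob).hexdigest()


def merge_state(canonical_record: dict, events: list) -> dict:
    """Fold events into a state that is independent of their arrival order.

    Hypothetical rule: drop duplicates, sort by (timestamp, content hash),
    then let later events overwrite earlier ones for the same field.
    """
    unique = {event_id(e): e for e in events}
    ordered = sorted(unique.items(), key=lambda kv: (kv[1]["at"], kv[0]))
    state = {"record": canonical_record, "fields": {}}
    for _, event in ordered:
        state["fields"][event["field"]] = event["value"]
    return state


# Made-up events for illustration only; they do not describe this record.
record = {"source": {"id": "2309.00267", "kind": "arxiv", "version": 3}}
events = [
    {"at": "2026-05-18T00:00:00Z", "field": "status", "value": "first"},
    {"at": "2026-05-19T00:00:00Z", "field": "status", "value": "second"},
    {"at": "2026-05-18T00:00:00Z", "field": "note", "value": "example"},
]

# A mirror that received the events reversed, with duplicates, agrees.
mirror_view = list(reversed(events)) + events
print(merge_state(record, events) == merge_state(record, mirror_view))
```

Sorting on a key derived purely from event content is what makes the result reproducible; any rule with that property would serve the same purpose.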

Claims

C1 · strongest claim

Across the tasks of summarization, helpful dialogue generation, and harmless dialogue generation, we show that RLAIF achieves comparable performance to RLHF. ... we introduce direct-RLAIF (d-RLAIF) ... which achieves superior performance to canonical RLAIF.

C2 · weakest assumption

That the preferences generated by an off-the-shelf LLM are high-quality enough to serve as a substitute for human preferences in training the reward model.

C3 · one-line summary

RLAIF matches RLHF on summarization and dialogue tasks, with a direct-RLAIF variant achieving superior results by using LLM rewards directly during training.

References

98 extracted · 98 resolved · 15 Pith anchors

Formal links

2 machine-checked theorem links

Cited by

35 papers in Pith

Convex Optimization for Alignment and Preference Learning on a Single GPU

Policy Gradient with Kernel Quadrature

WildFeedback: Aligning LLMs With In-situ User Interactions And Feedback

Social Human Robot Embodied Conversation (SHREC) Dataset: Benchmarking Foundational Models' Social Reasoning

LLM Evaluators Recognize and Favor Their Own Generations

Receipt and verification

First computed	2026-05-17T23:38:50.098142Z
Builder	pith-number-builder-2026-05-17-v1
Signature	Pith Ed25519 (`pith-v1-2026-05`)
Schema	pith-number/v1.0

Canonical hash

00f8ea6e5dfcbb8d195c288ca3bd9d30ccec365083e1091ebe19ac2b0a61252f

Aliases

arxiv: 2309.00267 · arxiv_version: 2309.00267v3 · doi: 10.48550/arxiv.2309.00267 · pith_short_12: AD4OU3S57S5Y · pith_short_16: AD4OU3S57S5Y2GK4 · pith_short_8: AD4OU3S5
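The short aliases are plain prefixes of the full identifier, and the full identifier itself appears to be the first 26 characters of the RFC 4648 base32 encoding of the canonical hash. That relationship is inferred from the values on this page rather than stated by it, so treat the check below as an observation, not a specification. The function names are illustrative.

```python
import base64

# Values copied from this page.
CANONICAL_HASH = "00f8ea6e5dfcbb8d195c288ca3bd9d30ccec365083e1091ebe19ac2b0a61252f"
PITH_ID = "AD4OU3S57S5Y2GK4FCGKHPM5GD"


def pith_id_from_hash(hex_digest: str, length: int = 26) -> str:
    """Base32-encode the raw digest bytes and keep the leading characters.

    The 26-character length is an assumption read off the identifier above.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


def short_aliases(pith_id: str, lengths=(8, 12, 16)) -> dict:
    """Build the pith_short_N aliases as prefixes of the full identifier."""
    return {f"pith_short_{n}": pith_id[:n] for n in lengths}


print(pith_id_from_hash(CANONICAL_HASH))  # AD4OU3S57S5Y2GK4FCGKHPM5GD
print(short_aliases(PITH_ID))
```

Because each alias is a prefix, a shorter alias can be checked against a longer one with a simple `startswith` comparison.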

Agent API

Resolver JSON · Graph JSON · Events JSON · Schema · Signing key

Verify this Pith Number yourself

curl -sH 'Accept: application/ld+json' https://pith.science/pith/AD4OU3S57S5Y2GK4FCGKHPM5GD \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 00f8ea6e5dfcbb8d195c288ca3bd9d30ccec365083e1091ebe19ac2b0a61252f

Canonical record JSON

{
  "metadata": {
    "abstract_canon_sha256": "e06fbc5dabdd57615fbf708dac24235e084d3e237bd3abd328f2bc19edfd90ee",
    "cross_cats_sorted": [
      "cs.AI",
      "cs.LG"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2023-09-01T05:53:33Z",
    "title_canon_sha256": "30a71bef573f4df6e33a747f2c8824790bde2c0d53c2ef6779f1df973bc3eb36"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2309.00267",
    "kind": "arxiv",
    "version": 3
  }
}
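
For readers who prefer a script over the shell one-liner above, the same canonicalization can be sketched in Python: serialize the record with sorted keys, compact separators, and non-escaped UTF-8, then take the SHA-256 digest. The record literal is copied from this page and `canonical_sha256` is an illustrative name. Whether the digest equals the published hash depends on the JSON shown here being the exact canonical record, so the script reports the comparison rather than assuming it.

```python
import hashlib
import json

PUBLISHED_HASH = "00f8ea6e5dfcbb8d195c288ca3bd9d30ccec365083e1091ebe19ac2b0a61252f"

RECORD_JSON = """
{
  "metadata": {
    "abstract_canon_sha256": "e06fbc5dabdd57615fbf708dac24235e084d3e237bd3abd328f2bc19edfd90ee",
    "cross_cats_sorted": ["cs.AI", "cs.LG"],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2023-09-01T05:53:33Z",
    "title_canon_sha256": "30a71bef573f4df6e33a747f2c8824790bde2c0d53c2ef6779f1df973bc3eb36"
  },
  "schema_version": "1.0",
  "source": {"id": "2309.00267", "kind": "arxiv", "version": 3}
}
"""


def canonical_sha256(record: dict) -> str:
    """Hash the record using the same json.dumps settings as the one-liner."""
    blob = json.dumps(
        record,
        sort_keys=True,          # key order must not affect the hash
        separators=(",", ":"),   # no whitespace between tokens
        ensure_ascii=False,      # keep non-ASCII characters as UTF-8
    ).encode()
    return hashlib.sha256(blob).hexdigest()


record = json.loads(RECORD_JSON)
digest = canonical_sha256(record)
print(digest)
print("matches published hash:", digest == PUBLISHED_HASH)
```

Sorting keys and fixing the separators is what makes the hash independent of how the JSON happened to be pretty-printed or ordered in transit.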