Pith Number

pith:23IZB7K2

pith:2023:23IZB7K2H2STHS5OGU3SX45Y43
not attested · not anchored · not stored · refs pending

Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback

Anand Siththaranjan, Anca Dragan, Andi Peng, Charbel-Raphaël Segerie, Claudia Shi, David Krueger, David Lindner, Dmitrii Krasheninnikov, Dorsa Sadigh, Dylan Hadfield-Menell, Erdem Bıyık, Eric J. Michaud, Jacob Pfau, Javier Rando, Jérémy Scheurer, Lauro Langosco, Max Nadeau, Mehul Damani, Micah Carroll, Pedro Freire, Peter Hase, Phillip Christoffersen, Rachel Freedman, Samuel Marks, Stephen Casper, Stewart Slocum, Thomas Krendl Gilbert, Tomasz Korbak, Tony Wang, Usman Anwar, Xander Davies, Xin Chen

RLHF, the dominant method for aligning large language models with human goals, carries fundamental limitations that incremental fixes cannot fully resolve.

arxiv:2307.15217 v2 · 2023-07-27 · cs.AI · cs.CL · cs.LG

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{23IZB7K2H2STHS5OGU3SX45Y43}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 strongest claim

RLHF has emerged as the central method for fine-tuning state-of-the-art large language models, but it has fundamental limitations, which underscores the importance of a multi-faceted approach to developing safer AI systems.

C2 weakest assumption

That the identified open problems represent fundamental limitations of RLHF rather than challenges that can be resolved through incremental improvements or better implementation.

C3 one line summary

RLHF has significant open problems and fundamental limitations that require a multi-faceted approach for safer AI development.

Formal links

2 machine-checked theorem links

Cited by

45 papers in Pith

Receipt and verification
First computed: 2026-05-17T23:39:21.418541Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05) · public key
Schema: pith-number/v1.0

Canonical hash

d6d190fd5a3ea533cbae35372bf3b8e6ddf630437eb272f01b14c5a437f010e0

Aliases

arxiv: 2307.15217 · arxiv_version: 2307.15217v2 · doi: 10.48550/arxiv.2307.15217 · pith_short_12: 23IZB7K2H2ST · pith_short_16: 23IZB7K2H2STHS5O · pith_short_8: 23IZB7K2
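
The values on this page suggest how the identifier and its short aliases relate to the canonical hash. The following is a minimal sketch based only on what is shown here, not on a documented rule: the 26-character Pith Number appears to coincide with the first 26 characters of the RFC 4648 Base32 encoding of the SHA-256 canonical hash, and each `pith_short_N` alias appears to be the first N characters of that identifier. The helper name `base32_prefix` is hypothetical.

```python
import base64

# Values copied from this page.
CANONICAL_HASH = "d6d190fd5a3ea533cbae35372bf3b8e6ddf630437eb272f01b14c5a437f010e0"
PITH_NUMBER = "23IZB7K2H2STHS5OGU3SX45Y43"


def base32_prefix(hex_digest: str, length: int) -> str:
    """Base32-encode a hex digest (RFC 4648 alphabet) and keep the first `length` characters."""
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


# Assumption: the identifier is a Base32 prefix of the canonical hash.
derived = base32_prefix(CANONICAL_HASH, len(PITH_NUMBER))
print("derived identifier:", derived)
print("matches page:", derived == PITH_NUMBER)

# Assumption: short aliases are plain prefixes of the full identifier.
for n in (8, 12, 16):
    print(f"pith_short_{n}: {PITH_NUMBER[:n]}")
```

If the relationship holds for other records, a client could sanity-check an identifier against a canonical hash without a network call; treat that as an inference from this single record.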
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/23IZB7K2H2STHS5OGU3SX45Y43 \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: d6d190fd5a3ea533cbae35372bf3b8e6ddf630437eb272f01b14c5a437f010e0
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "b2d5a1ffd2f126b3687464bdd54878590cc00c30071a32d643bc8ba98db2121f",
    "cross_cats_sorted": [
      "cs.CL",
      "cs.LG"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.AI",
    "submitted_at": "2023-07-27T22:29:25Z",
    "title_canon_sha256": "f64b8e348d873883aaed9189c75a52a859f0648b3f54a9627b55a100e94a8e16"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2307.15217",
    "kind": "arxiv",
    "version": 2
  }
}
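
The same check can be sketched offline against the record printed above, without calling the API. This assumes the JSON shown on this page is exactly what the endpoint returns under `canonical_record`, and it reuses the serialization settings from the one-liner (sorted keys, compact separators, `ensure_ascii=False`). The function names `canonical_bytes` and `canonical_hash` are illustrative, not part of any Pith tooling.

```python
import hashlib
import json

# Canonical record copied from this page.
RECORD = {
    "metadata": {
        "abstract_canon_sha256": "b2d5a1ffd2f126b3687464bdd54878590cc00c30071a32d643bc8ba98db2121f",
        "cross_cats_sorted": ["cs.CL", "cs.LG"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.AI",
        "submitted_at": "2023-07-27T22:29:25Z",
        "title_canon_sha256": "f64b8e348d873883aaed9189c75a52a859f0648b3f54a9627b55a100e94a8e16",
    },
    "schema_version": "1.0",
    "source": {"id": "2307.15217", "kind": "arxiv", "version": 2},
}

# Hash the page asks you to expect.
EXPECTED = "d6d190fd5a3ea533cbae35372bf3b8e6ddf630437eb272f01b14c5a437f010e0"


def canonical_bytes(record: dict) -> bytes:
    """Serialize with sorted keys and no insignificant whitespace, then UTF-8 encode."""
    text = json.dumps(record, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def canonical_hash(record: dict) -> str:
    """Return the SHA-256 hex digest of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


digest = canonical_hash(RECORD)
print(digest)
print("matches page:", digest == EXPECTED)
```

Sorting keys and fixing the separators makes the byte string independent of key order and whitespace in the transmitted JSON, which is what lets a mirror recompute the same digest.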