Pith Number

pith:V4ZA3CSK

pith:2023:V4ZA3CSKC2AQCOCC5XHQZFZSSX
not attested · not anchored · not stored · refs resolved

Simple synthetic data reduces sycophancy in large language models

Da Huang, Denny Zhou, Jerry Wei, Quoc V. Le, Yifeng Lu

Lightweight finetuning with synthetic data from public NLP tasks reduces sycophancy in large language models

arxiv:2308.03958 v2 · 2023-08-07 · cs.CL

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{V4ZA3CSKC2AQCOCC5XHQZFZSSX}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle: live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
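The idea of a deterministic merge can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the event shape `(event_id, field, value)`, the rule of replaying events in sorted `event_id` order, and the sample values are hypothetical and are not taken from the Pith bundle format, which this page does not show. Signature checking is omitted.

```python
from typing import Iterable

# Hypothetical event shape: (event_id, field, value).
# The real bundle format is not documented on this page.
Event = tuple[str, str, str]

def merge(canonical: dict, events: Iterable[Event]) -> dict:
    """Fold events into a copy of the canonical record in a fixed order."""
    state = dict(canonical)
    # Deduplicate by event_id, then sort so every mirror replays the same sequence
    # regardless of the order in which it received the events.
    unique = {event[0]: event for event in events}
    for _, field, value in sorted(unique.values()):
        state[field] = value
    return state

canonical = {"source": "arxiv:2308.03958"}
events = [
    ("e2", "author_claim", "open"),
    ("e1", "citations", "open"),
    ("e3", "author_claim", "claimed"),
]

# Two mirrors see the same events in different orders, one with a duplicate.
mirror_a = merge(canonical, events)
mirror_b = merge(canonical, list(reversed(events)) + events[:1])
print(mirror_a == mirror_b)
```

Because the replay order depends only on the set of events and not on arrival order, both mirrors reach the same state. Any real implementation would also need to verify each event's signature before applying it.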

Claims

C1 · strongest claim

Adding these data in a lightweight finetuning step can significantly reduce sycophantic behavior on held-out prompts.

C2 · weakest assumption

The synthetic data intervention generalizes beyond the specific held-out prompts and tasks tested to diverse real-world user interactions, without introducing new unwanted behaviors.

C3 · one-line summary

Scaling and instruction tuning increase sycophancy in LLMs on opinion and fact tasks, but a synthetic data finetuning intervention reduces it on held-out prompts.

References

145 extracted · 145 resolved · 32 Pith anchors

[1] Concrete Problems in AI Safety 2016 · arXiv:1606.06565
[2] A General Language Assistant as a Laboratory for Alignment 2021 · arXiv:2112.00861
[3] Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback 2022 · arXiv:2204.05862
[4] Constitutional AI: Harmlessness from AI Feedback 2022 · arXiv:2212.08073
[5] Bowman, Gabor Angeli, Christopher Potts, and Christopher D 2015

Formal links

1 machine-checked theorem link

Cited by

25 papers in Pith

Receipt and verification
First computed: 2026-05-17T23:38:47.541673Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05)
Schema: pith-number/v1.0

Canonical hash

af320d8a4a1681013842edcf0c973295e717490f13d21ee372c098f73f2723d0

Aliases

arxiv: 2308.03958 · arxiv_version: 2308.03958v2 · doi: 10.48550/arxiv.2308.03958 · pith_short_12: V4ZA3CSKC2AQ · pith_short_16: V4ZA3CSKC2AQCOCC · pith_short_8: V4ZA3CSK
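The identifiers on this page appear to be related: the 26-character Pith Number matches the first 26 characters of the RFC 4648 Base32 encoding of the canonical hash, and each `pith_short_N` alias matches its first N characters. This is an observation from the values in this one record, not a documented rule of the Pith scheme. The function name `pith_number` is hypothetical. A minimal sketch:

```python
import base64

# Canonical hash exactly as shown on this page.
CANONICAL_HASH = "af320d8a4a1681013842edcf0c973295e717490f13d21ee372c098f73f2723d0"

def pith_number(hex_digest: str, length: int = 26) -> str:
    """Base32-encode the raw digest bytes and keep the first `length` characters.

    The 26-character length is an assumption inferred from this record.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]

full = pith_number(CANONICAL_HASH)
# Short aliases are taken here as plain prefixes of the full identifier.
aliases = {f"pith_short_{n}": full[:n] for n in (8, 12, 16)}

print(full)
print(aliases)
```

For this record the result agrees with the identifier and aliases listed above.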
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/V4ZA3CSKC2AQCOCC5XHQZFZSSX \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: af320d8a4a1681013842edcf0c973295e717490f13d21ee372c098f73f2723d0
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "a8209c615d148b9112f883a6d9707e6469cf500a33ec3664124fa266b1df7207",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2023-08-07T23:48:36Z",
    "title_canon_sha256": "d0e4dfc2b580fa38d2be5fcb7ef8aa6e484c45f6a8a56f49ce6f223a65055d53"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2308.03958",
    "kind": "arxiv",
    "version": 2
  }
}
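The same check can be run offline against the canonical record above. The sketch below mirrors the serialization in the shell one-liner (sorted keys, compact separators, `ensure_ascii=False`, UTF-8 bytes). The function name `canonical_sha256` is hypothetical, and whether the digest equals the published hash depends on the record being copied exactly, so the comparison is printed rather than assumed.

```python
import hashlib
import json

# Canonical record copied from this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "a8209c615d148b9112f883a6d9707e6469cf500a33ec3664124fa266b1df7207",
        "cross_cats_sorted": [],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.CL",
        "submitted_at": "2023-08-07T23:48:36Z",
        "title_canon_sha256": "d0e4dfc2b580fa38d2be5fcb7ef8aa6e484c45f6a8a56f49ce6f223a65055d53",
    },
    "schema_version": "1.0",
    "source": {"id": "2308.03958", "kind": "arxiv", "version": 2},
}

EXPECTED = "af320d8a4a1681013842edcf0c973295e717490f13d21ee372c098f73f2723d0"

def canonical_sha256(rec: dict) -> str:
    """Serialize with sorted keys and no extra whitespace, then hash the UTF-8 bytes."""
    payload = json.dumps(
        rec, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

digest = canonical_sha256(record)
print(digest)
print("matches published hash:", digest == EXPECTED)
```

Sorting keys and fixing the separators is what makes the digest independent of how the JSON happens to be formatted or ordered on the page.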