pith:MTHKQKRG
Improving alignment of dialogue agents via targeted human judgements
The Sparrow dialogue agent uses separate human judgments on natural-language rules and evidence citations to outperform baselines on preference and safety.
arxiv:2209.14375 v1 · 2022-09-28 · cs.LG · cs.CL
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{MTHKQKRGMSXMB7W4XJJO5C66RV}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
Sparrow is preferred more often than baselines while being more resilient to adversarial probing by humans, violating our rules only 8% of the time when probed.
Assumes that separate human judgments on the listed natural language rules reliably capture the intended notions of helpfulness, correctness, and harmlessness without introducing new biases or inconsistencies.
Sparrow uses targeted rule-based human feedback and evidence provision to outperform baselines in preference while violating rules only 8% of the time under adversarial probing.
Receipt and verification
| First computed | 2026-05-17T23:39:22.267587Z |
|---|---|
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) |
| Schema | pith-number/v1.0 |
Canonical hash
64cea82a2664aec0fedcba52ee8bde8d4838e779018cbd75f2e3cf54ad55cfff
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/MTHKQKRGMSXMB7W4XJJO5C66RV \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 64cea82a2664aec0fedcba52ee8bde8d4838e779018cbd75f2e3cf54ad55cfff
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "04e14b3529d962d858e47fc8446a8e656662bf3ecfafa19e8b3eedf48831a137",
    "cross_cats_sorted": [
      "cs.CL"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2022-09-28T19:04:43Z",
    "title_canon_sha256": "d2be7c9a82e3da3903cadd379c977e7e799ce3051361fe1ce7286e483432fd5b"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2209.14375",
    "kind": "arxiv",
    "version": 1
  }
}
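The verification one-liner above can also be run offline against the canonical record shown on this page. The following is a minimal sketch, assuming the canonicalization is exactly what the one-liner does: serialize with sorted keys, compact separators, and no ASCII escaping, then take SHA-256 of the UTF-8 bytes. The names `RECORD_JSON`, `PUBLISHED_HASH`, and `canonical_hash` are illustrative and not part of any Pith API.

```python
import hashlib
import json

# Canonical record copied from this page (illustrative constant name).
RECORD_JSON = """
{
  "metadata": {
    "abstract_canon_sha256": "04e14b3529d962d858e47fc8446a8e656662bf3ecfafa19e8b3eedf48831a137",
    "cross_cats_sorted": ["cs.CL"],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2022-09-28T19:04:43Z",
    "title_canon_sha256": "d2be7c9a82e3da3903cadd379c977e7e799ce3051361fe1ce7286e483432fd5b"
  },
  "schema_version": "1.0",
  "source": {"id": "2209.14375", "kind": "arxiv", "version": 1}
}
"""

# Hash published in the "Canonical hash" section of this page.
PUBLISHED_HASH = "64cea82a2664aec0fedcba52ee8bde8d4838e779018cbd75f2e3cf54ad55cfff"


def canonical_hash(record: dict) -> str:
    """Return the SHA-256 hex digest of the record's canonical serialization.

    Assumption: canonical form means sorted keys, compact separators,
    and ensure_ascii=False, mirroring the shell one-liner.
    """
    payload = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


record = json.loads(RECORD_JSON)
digest = canonical_hash(record)
print("computed: ", digest)
print("published:", PUBLISHED_HASH)
print("match:    ", digest == PUBLISHED_HASH)
```

Because keys are sorted before hashing, the digest does not depend on key order or whitespace in the input JSON. A mismatch would mean either that the record was altered or that the canonicalization assumed here differs from the one the builder actually used.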