pith:Z7QLAXPZ
Reinforcement Learning with Semantic Rewards Enables Low-Resource Language Expansion without Alignment Tax
Reinforcement learning with embedding-level semantic rewards lets LLMs add low-resource languages without the usual loss of general skills.
arxiv:2605.14366 v1 · 2026-05-14 · cs.CL · cs.LG
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{Z7QLAXPZNEOC6HXYDPQ6MFIUMY}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
Experiments show that our method acquires low-resource capabilities while markedly mitigating alignment tax, preserving general competence more effectively than SFT.
Underlying assumption: embedding-level semantic rewards reliably capture and preserve intended meaning across languages, without introducing new biases and without requiring the model to already have strong pretrained semantic understanding of the target language.
Reinforcement learning with semantic rewards lets LLMs gain low-resource language skills without the alignment tax that degrades general capabilities in supervised fine-tuning.
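This record does not say how the embedding-level semantic reward is computed. As a minimal sketch only, assuming the reward is a cosine similarity between an embedding of the model output and an embedding of a reference, rescaled to [0, 1], the idea might look like the following. The function names `cosine_similarity` and `semantic_reward`, the rescaling, and the toy vectors are all hypothetical and are not taken from the paper.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # A zero vector carries no direction; treat it as no similarity.
        return 0.0
    return dot / (norm_u * norm_v)


def semantic_reward(output_emb: Sequence[float],
                    reference_emb: Sequence[float]) -> float:
    """Hypothetical reward: cosine similarity mapped from [-1, 1] to [0, 1]."""
    return 0.5 * (1.0 + cosine_similarity(output_emb, reference_emb))


if __name__ == "__main__":
    # Toy vectors standing in for sentence embeddings (assumed, not real data).
    reference = [0.2, 0.7, 0.1]
    close_output = [0.25, 0.65, 0.05]
    distant_output = [-0.6, 0.1, 0.9]
    print(semantic_reward(close_output, reference))
    print(semantic_reward(distant_output, reference))
```

A reward of this shape scores meaning rather than exact token overlap, which is the general property the claims above rely on. Whether the paper uses cosine similarity, a particular encoder, or some other scoring function is not stated here.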
Receipt and verification
| Field | Value |
|---|---|
| First computed | 2026-05-17T23:39:07.886359Z |
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) |
| Schema | pith-number/v1.0 |
Canonical hash
cfe0b05df9691c2f1ef81be1e615146610fa8b61949ed71f6c188e9c19d90c27
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/Z7QLAXPZNEOC6HXYDPQ6MFIUMY \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: cfe0b05df9691c2f1ef81be1e615146610fa8b61949ed71f6c188e9c19d90c27
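The same check can be sketched as a small Python module, which may be easier to reuse than the shell one-liner. It mirrors the canonicalization used in the command above (sorted keys, compact separators, non-ASCII characters left unescaped, UTF-8 encoding) and then takes the SHA-256. The function names `canonical_sha256` and `matches_expected` and the demo record are hypothetical, chosen for illustration.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Serialize a record canonically and return its SHA-256 hex digest."""
    canonical = json.dumps(
        record,
        sort_keys=True,            # key order must not affect the hash
        separators=(",", ":"),     # no whitespace between tokens
        ensure_ascii=False,        # keep non-ASCII characters as-is
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def matches_expected(record: dict, expected_hex: str) -> bool:
    """Compare a record's canonical hash with a published hex digest."""
    return canonical_sha256(record) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Demo record (assumed shape): the two dicts differ only in key order.
    a = {"schema_version": "1.0", "source": {"id": "example", "version": 1}}
    b = {"source": {"version": 1, "id": "example"}, "schema_version": "1.0"}
    print(canonical_sha256(a) == canonical_sha256(b))
```

To verify a real record, load the `canonical_record` object returned by the API with `json.loads` and pass it to `matches_expected` together with the canonical hash shown on this page.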
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "49333ace483f564e023ef1d6138125147e77396eb457ea09bcd626547a90e4d7",
    "cross_cats_sorted": [
      "cs.LG"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2026-05-14T04:47:22Z",
    "title_canon_sha256": "f4863bbe0621727b76fdaf7b4479f8558cd16a2da763d0125b128359c6c033ea"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.14366",
    "kind": "arxiv",
    "version": 1
  }
}