pith:SRP7COU5
AIS: Adaptive Importance Sampling for Quantized RL
Adaptive Importance Sampling corrects non-stationary bias from low-precision rollouts while keeping their speed gains in LLM RL.
arxiv:2605.13907 v1 · 2026-05-13 · stat.ML · cs.AI · cs.LG
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{SRP7COU5GB3YODBFGUGIILN7V2}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
AIS matches the BF16 baseline on most tasks while retaining the 1.5x to 2.76x rollout speedup of FP8. It does so by combining three diagnostics (weight reliability, divergence severity, and variance amplification) into a per-batch mixing coefficient that interpolates between uncorrected and importance-weighted gradients.
The three real-time diagnostics can be combined into a single mixing coefficient that reliably preserves early-training exploration benefits while suppressing later destabilizing bias across different models, tasks, and training stages without introducing new instabilities.
AIS adaptively corrects non-stationary policy gradient bias in quantized LLM RL, matching BF16 performance while retaining 1.5-2.76x FP8 rollout speedup.
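The interpolation described in the claims can be illustrated with a minimal sketch. It assumes a standard policy-gradient surrogate in which each sample's gradient term is scaled by a weight, so mixing the uncorrected gradient (weight 1) with the importance-weighted gradient (weight equal to the probability ratio) reduces to mixing the per-sample weights. The function name `mixed_weights` and the argument `alpha` are hypothetical, and how AIS derives the coefficient from its three diagnostics is not given in this record, so `alpha` is simply taken as an input here.

```python
import math


def mixed_weights(logp_train, logp_rollout, alpha):
    """Per-sample weights interpolating between no correction and full
    importance weighting.

    logp_train   -- log-probabilities under the training (e.g. BF16) policy
    logp_rollout -- log-probabilities under the rollout (e.g. FP8) policy
    alpha        -- mixing coefficient in [0, 1]; ASSUMED to be supplied
                    by the caller (its computation is not specified here)
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    weights = []
    for lt, lr in zip(logp_train, logp_rollout):
        ratio = math.exp(lt - lr)  # importance ratio pi_train / pi_rollout
        # alpha = 0 gives weight 1 (uncorrected); alpha = 1 gives the ratio.
        weights.append((1.0 - alpha) + alpha * ratio)
    return weights


# Usage: one sample where both policies agree, one where they differ.
print(mixed_weights([-1.0, -0.5], [-1.0, -1.0], alpha=0.5))
```

With `alpha = 0` every weight is 1 and the rollout mismatch is ignored; with `alpha = 1` the weights are the plain importance ratios. Intermediate values trade bias correction against the variance the ratios introduce.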
Receipt and verification
| Field | Value |
|---|---|
| First computed | 2026-05-17T23:39:18.871843Z |
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) |
| Schema | pith-number/v1.0 |
Canonical hash
945ff13a9d3077870c25350c842dbfaeaae04965786a3c765793336b0568f0e5
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/SRP7COU5GB3YODBFGUGIILN7V2 \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 945ff13a9d3077870c25350c842dbfaeaae04965786a3c765793336b0568f0e5
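The same check can be written as a small Python function, which may be easier to reuse in a script. This is a sketch that mirrors the serialization settings of the one-liner above (sorted keys, compact separators, non-ASCII characters left unescaped, UTF-8 encoding); the function name `canonical_sha256` is hypothetical, and fetching the record over HTTP is left out.

```python
import hashlib
import json


def canonical_sha256(record):
    """Hash a parsed JSON record using the same canonical serialization
    as the shell one-liner: sorted keys, no whitespace, UTF-8 bytes."""
    payload = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Usage: key order in the input does not affect the digest.
a = canonical_sha256({"schema_version": "1.0", "source": {"kind": "arxiv"}})
b = canonical_sha256({"source": {"kind": "arxiv"}, "schema_version": "1.0"})
print(a == b)
```

To verify a record, parse its canonical JSON with `json.loads`, pass the result to the function, and compare the returned digest with the canonical hash listed on the page.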
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "64b96b9353f36c4ca59c24e29374fefa83ca3be7632760b715f5993f507986ce",
    "cross_cats_sorted": [
      "cs.AI",
      "cs.LG"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "stat.ML",
    "submitted_at": "2026-05-13T03:36:57Z",
    "title_canon_sha256": "364ed5714d6e429a86c3f08ba45b900a8118869451a10a9058dc9019f851f158"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.13907",
    "kind": "arxiv",
    "version": 1
  }
}