pith:UFEZKARI
Off-Policy Learning with Limited Supply
Greedy off-policy learning is suboptimal when supply is limited, and superior policies exist that allocate items based on relative expected rewards across users.
arxiv:2603.18702 v4 · 2026-03-19 · cs.LG
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{UFEZKARIFIPNHMH6773EP3JCTI}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
Conventional greedy OPL approaches may fail to maximize policy performance; the paper demonstrates that policies with superior performance must exist in limited supply settings.
Logged data from an unconstrained behavior policy can be used to learn a policy that correctly accounts for future users' relative valuations under limited supply, without additional assumptions on the arrival process or reward distributions.
OPLS is a new off-policy method for contextual bandits with limited supply that outperforms greedy approaches by prioritizing items with higher relative expected rewards for the current user.
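The first claim can be illustrated with a toy allocation problem. The sketch below is not the paper's OPLS method and involves no off-policy learning: it assumes the expected rewards are already known, uses made-up numbers, and finds the best assignment by brute force. All names (`EXPECTED_REWARD`, `greedy_allocation`, `best_allocation`) are hypothetical and chosen only for this illustration.

```python
from itertools import permutations

# Hypothetical expected rewards per (user, item); these numbers are
# invented for illustration and do not come from the paper.
EXPECTED_REWARD = {
    "user_1": {"item_A": 0.9, "item_B": 0.8},
    "user_2": {"item_A": 0.9, "item_B": 0.1},
}
ARRIVAL_ORDER = ["user_1", "user_2"]
ITEMS = ["item_A", "item_B"]  # assumed: exactly one unit of supply each


def total_reward(allocation, rewards):
    """Sum the expected reward of every (user, item) pair in the allocation."""
    return sum(rewards[user][item] for user, item in allocation.items())


def greedy_allocation(order, rewards, items):
    """Give each arriving user the highest-reward item still in stock."""
    remaining = set(items)
    allocation = {}
    for user in order:
        if not remaining:
            break
        best = max(remaining, key=lambda item: rewards[user][item])
        allocation[user] = best
        remaining.remove(best)
    return allocation


def best_allocation(order, rewards, items):
    """Brute-force the one-to-one assignment with the highest total reward."""
    candidates = (
        dict(zip(order, perm)) for perm in permutations(items, len(order))
    )
    return max(candidates, key=lambda alloc: total_reward(alloc, rewards))


greedy = greedy_allocation(ARRIVAL_ORDER, EXPECTED_REWARD, ITEMS)
best = best_allocation(ARRIVAL_ORDER, EXPECTED_REWARD, ITEMS)
```

Under these assumed numbers, greedy hands `item_A` to `user_1` and leaves `user_2` with `item_B`, for a total of 0.9 + 0.1 = 1.0. Giving `user_1` the slightly worse `item_B` instead frees `item_A` for `user_2`, who values it far more than the alternative, for a total of 0.8 + 0.9 = 1.7. The gap comes from comparing how much each user loses by taking their second choice, which is the kind of relative valuation the claims refer to.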
Receipt and verification
| Field | Value |
|---|---|
| First computed | 2026-05-20T00:04:28.780340Z |
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
a1499502282a1ed3b0fefff647ed229a1be6d1385f3f244f84809a86e5e43767
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/UFEZKARIFIPNHMH6773EP3JCTI \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: a1499502282a1ed3b0fefff647ed229a1be6d1385f3f244f84809a86e5e43767
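The Python step of the command above can also be written as a small standalone function, which may be easier to read and reuse. This is a minimal sketch that mirrors the same serialization settings (sorted keys, compact separators, `ensure_ascii=False`, UTF-8 encoding); the function name `canonical_sha256` is hypothetical, and the sketch assumes the record has already been fetched and parsed into a Python dict.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Hash a record the same way as the one-liner above.

    The record is serialized with sorted keys and no insignificant
    whitespace, encoded as UTF-8, and hashed with SHA-256.
    """
    canonical = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()
```

Because keys are sorted before hashing, two dicts with the same content but different key order produce the same digest. To check a record, compare the returned hex string against the canonical hash published on the page.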
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "040b1b4756ef9bc2f2dbfa4df5cf000a85c5a82ca2f54a3861f81356a0aa5c4e",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2026-03-19T10:01:39Z",
    "title_canon_sha256": "2454d0a0cbbcd1c6815d4aac9e68115cfe1bed1301ce8fcaec3cdfa0072f2760"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2603.18702",
    "kind": "arxiv",
    "version": 4
  }
}