Pith Number

pith:AKT4PA27

pith:2026:AKT4PA27VFE5BJD7YPCZ75ZTEF
Status: not attested · not anchored · not stored · refs pending

Stochastic Minimum-Cost Reach-Avoid Reinforcement Learning

Bai Xue, Jingduo Pan, Taoran Wu, Yiling Xue

Reach-avoid probability certificates turn stochastic safety constraints into a surrogate objective that reinforcement learners can optimize for minimum cost.

arxiv:2605.11975 v2 · 2026-05-12 · cs.LG

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{AKT4PA27VFE5BJD7YPCZ75ZTEF}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 strongest claim

We establish almost sure convergence of the proposed algorithms to locally optimal policies with respect to the resulting objective.

C2 weakest assumption

That reach-avoid probability certificates can be computed or approximated accurately enough during learning to serve as a reliable surrogate for the true probabilistic constraint in stochastic environments.

C3 one line summary

Introduces reach-avoid probability certificates (RAPCs) and a contraction Bellman operator that jointly enforce probabilistic reach-avoid constraints while minimizing expected costs in stochastic RL, with almost-sure convergence to local optima.

Formal links

2 machine-checked theorem links

Receipt and verification
First computed 2026-05-20T00:04:36.370095Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

02a7c7835fa949d0a47fc3c59ff73321548bc7f5ee6ec7f71478e8fe76a26526

Aliases

arxiv: 2605.11975 · arxiv_version: 2605.11975v2 · doi: 10.48550/arxiv.2605.11975 · pith_short_12: AKT4PA27VFE5 · pith_short_16: AKT4PA27VFE5BJD7 · pith_short_8: AKT4PA27
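The short aliases listed above are plain prefixes of the full 26-character identifier. The sketch below rebuilds them from the full identifier and also checks an observation about this particular record: the identifier coincides with the first 26 characters of the RFC 4648 base32 encoding of the canonical hash. The page does not document that relationship as a rule, so treat it as an assumption that holds for this record only. The helper names `short_aliases` and `base32_prefix` are hypothetical and not part of any Pith API.

```python
import base64

# Values copied from this record page.
CANONICAL_HASH = "02a7c7835fa949d0a47fc3c59ff73321548bc7f5ee6ec7f71478e8fe76a26526"
PITH_NUMBER = "AKT4PA27VFE5BJD7YPCZ75ZTEF"


def short_aliases(pith_number, lengths=(8, 12, 16)):
    """Build the pith_short_N aliases as prefixes of the full identifier."""
    return {f"pith_short_{n}": pith_number[:n] for n in lengths}


def base32_prefix(hex_digest, length=26):
    """Base32-encode a hex digest (RFC 4648 alphabet) and keep the first `length` characters.

    Assumption: this mirrors how the identifier relates to the hash on this
    record; the page itself does not state the derivation.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


aliases = short_aliases(PITH_NUMBER)
for name, value in aliases.items():
    print(name, value)

# Compare the observed identifier with the base32 prefix of the hash.
print("identifier matches base32 prefix:", base32_prefix(CANONICAL_HASH) == PITH_NUMBER)
```

If the comparison ever fails for another record, the prefix-based aliases still hold, since they depend only on the identifier string itself.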
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/AKT4PA27VFE5BJD7YPCZ75ZTEF \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 02a7c7835fa949d0a47fc3c59ff73321548bc7f5ee6ec7f71478e8fe76a26526
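The hashing step in the pipeline above can also be written as a small function, which is easier to reuse and to run offline. This is a minimal sketch using the same serialization settings as the one-liner (sorted keys, compact separators, non-ASCII characters kept, UTF-8 bytes). The function name `canonical_hash` is hypothetical, and the `record` literal is copied from the JSON displayed on this page; whether it reproduces the expected hash depends on the served `canonical_record` being identical to what the page shows, which this sketch does not check.

```python
import hashlib
import json


def canonical_hash(record):
    """Serialize a record deterministically and return its SHA-256 hex digest."""
    payload = json.dumps(
        record,
        sort_keys=True,          # key order must not affect the hash
        separators=(",", ":"),   # no whitespace between tokens
        ensure_ascii=False,      # keep non-ASCII characters unescaped
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Record copied from the "Canonical record JSON" shown on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "0a8a6fcc79b1dafa5a3fd8818da64a1986023d846c8114d99a5698cd87cbac34",
        "cross_cats_sorted": [],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.LG",
        "submitted_at": "2026-05-12T11:31:36Z",
        "title_canon_sha256": "66d720cff133972aa7823b5ebaac80077487d3c568a5519b8ddfdb598a6eb03a",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.11975", "kind": "arxiv", "version": 2},
}

digest = canonical_hash(record)
print(digest)  # compare by eye with the "# expect:" value above
```

Because the keys are sorted before hashing, two mirrors that store the same record with different key order should compute the same digest.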
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "0a8a6fcc79b1dafa5a3fd8818da64a1986023d846c8114d99a5698cd87cbac34",
    "cross_cats_sorted": [],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2026-05-12T11:31:36Z",
    "title_canon_sha256": "66d720cff133972aa7823b5ebaac80077487d3c568a5519b8ddfdb598a6eb03a"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.11975",
    "kind": "arxiv",
    "version": 2
  }
}