Pith Number

pith:RPVPFU2J

pith:2026:RPVPFU2JO7B23WCUNHIUIXCXTC
not attested · not anchored · not stored · refs resolved

Scalable Bi-causal Optimal Transport via KL Relaxation and Policy Gradients

Haoyang Cao, Jesse Hoekstra, Renyuan Xu, Ruixun Zhang, Yumin Xu

KL relaxation turns bi-causal optimal transport into a policy-gradient problem

arxiv:2605.17271 v1 · 2026-05-17 · math.OC · cs.LG

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{RPVPFU2JO7B23WCUNHIUIXCXTC}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
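
The page does not describe the merge algorithm itself, so the following is only a minimal sketch of what "deterministic merge" can mean in general: replaying events in a fixed total order so that every mirror reaches the same state regardless of the order in which it received them. The event fields (`id`, `ts`, `field`, `value`) and the function name `merge_events` are hypothetical and are not taken from the Pith schema; signature checking is omitted.

```python
from typing import Iterable


def merge_events(canonical: dict, events: Iterable[dict]) -> dict:
    """Replay events over a canonical record in a fixed total order.

    Hypothetical event shape: {"id": str, "ts": str, "field": str, "value": object}.
    A real bundle would also verify each event's signature; that step is left out.
    """
    state = dict(canonical)  # copy so the canonical record is never mutated
    seen = set()
    # Sort by (timestamp, id): the id breaks ties, so the order is total.
    for event in sorted(events, key=lambda e: (e["ts"], e["id"])):
        if event["id"] in seen:  # duplicated deliveries are ignored
            continue
        seen.add(event["id"])
        state[event["field"]] = event["value"]
    return state


if __name__ == "__main__":
    record = {"schema_version": "1.0"}
    events = [
        {"id": "e2", "ts": "2026-05-21T00:00:00Z", "field": "citations", "value": 1},
        {"id": "e1", "ts": "2026-05-20T00:00:00Z", "field": "citations", "value": 0},
    ]
    # Two mirrors that received the events in different orders agree on the result.
    print(merge_events(record, events) == merge_events(record, reversed(events)))
```

Sorting before replaying is what makes the result independent of arrival order; any scheme with a total order on events and idempotent application would serve the same purpose.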

Claims

C1 · strongest claim

We establish dynamic programming principles for both the original and relaxed formulations, prove that the relaxed problem converges to the original bi-causal OT problem as the penalty grows, and derive explicit policy-gradient representations for the relaxed objective. Building on these results, we propose a practical policy-gradient algorithm with unbiased mini-batch estimators, variance reduction, and nonasymptotic regret guarantees.

C2 · weakest assumption

The framework assumes that the KL relaxation preserves the recursive structure of the bi-causal problem sufficiently for dynamic programming and policy gradient methods to apply directly, and that marginal laws can be sampled to enable the stochastic optimization procedure described.

C3 · one-line summary

A KL-relaxed formulation of bi-causal optimal transport is solved via policy gradients with proven convergence to the original problem and nonasymptotic regret guarantees for the resulting algorithm.

References

23 extracted · 23 resolved · 0 Pith anchors


Formal links

2 machine-checked theorem links

Receipt and verification
First computed: 2026-05-20T00:03:49.075129Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05)
Schema: pith-number/v1.0

Canonical hash

8beaf2d34977c3add85469d1445c579894158e37994a26423efae65a99af3cf3

Aliases

arxiv: 2605.17271 · arxiv_version: 2605.17271v1 · doi: 10.48550/arxiv.2605.17271 · pith_short_12: RPVPFU2JO7B2 · pith_short_16: RPVPFU2JO7B23WCU · pith_short_8: RPVPFU2J
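
The values on this page appear consistent with a simple relationship: the 26-character Pith Number matches the first 26 characters of the RFC 4648 Base32 encoding of the canonical SHA-256 hash, and the `pith_short_*` aliases are prefixes of it. This is inferred from the values shown here, not from a published specification, so treat the sketch below as an observation rather than the official derivation. The function name `pith_number_from_hash` is hypothetical.

```python
import base64

# Canonical hash and Pith Number exactly as listed on this page.
CANONICAL_HASH = "8beaf2d34977c3add85469d1445c579894158e37994a26423efae65a99af3cf3"
PITH_NUMBER = "RPVPFU2JO7B23WCUNHIUIXCXTC"


def pith_number_from_hash(hex_digest: str, length: int = 26) -> str:
    """Base32-encode the raw digest bytes and keep the first `length` characters.

    Assumption: this mirrors how the identifier on this page relates to its hash;
    the page itself does not document the rule.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


if __name__ == "__main__":
    full = pith_number_from_hash(CANONICAL_HASH)
    print(full == PITH_NUMBER)
    # The short aliases are prefixes of the full identifier.
    for n in (8, 12, 16):
        print(f"pith_short_{n}: {full[:n]}")
```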
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/RPVPFU2JO7B23WCUNHIUIXCXTC \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 8beaf2d34977c3add85469d1445c579894158e37994a26423efae65a99af3cf3
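
The hashing step of the pipeline above can also be written as a standalone Python function, which may be easier to reuse in a script. This is a sketch of the same serialization the one-liner performs (sorted keys, compact separators, non-ASCII characters left unescaped, UTF-8 encoding); the function name `canonical_sha256` and the small demo record are illustrative, and fetching the record from the API is left to the caller.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Hash a record the way the shell one-liner above does.

    Keys are sorted and whitespace is removed, so two dicts with the same
    content but different key order produce the same digest.
    """
    payload = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


if __name__ == "__main__":
    # Illustrative record, not the real canonical record.
    a = {"source": {"kind": "arxiv", "id": "2605.17271"}, "schema_version": "1.0"}
    b = {"schema_version": "1.0", "source": {"id": "2605.17271", "kind": "arxiv"}}
    print(canonical_sha256(a) == canonical_sha256(b))  # key order does not matter
```

To check a real record, pass the parsed `canonical_record` object to `canonical_sha256` and compare the result with the canonical hash listed on this page.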
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "34079c9295c1885dc93323d765ffdc717e05761879e479d3b2d231944c3c02d6",
    "cross_cats_sorted": [
      "cs.LG"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "math.OC",
    "submitted_at": "2026-05-17T05:41:01Z",
    "title_canon_sha256": "6af38eec6ac56c9daa9e2771420cfc020187e38bd974263dee9327ac3819c828"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.17271",
    "kind": "arxiv",
    "version": 1
  }
}