Here, the first inequality follows from $J^{\pi}_{n+1} \ge V_{n+1}$ and the monotonicity of the operator $Q^{\pi}_{n+1}$, and the last inequality follows from (15).

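For any $\pi \in \Pi_{bc}$, we have $Q^{\pi}_{n+1} \in \Pi_{n+1}$ for all $n \in [N-1]_0$, and

$$J^{\pi}_n = c_n + Q^{\pi}_{n+1}[J^{\pi}_{n+1}] \ge c_n + Q^{\pi}_{n+1}[V_{n+1}] = \Gamma^{Q^{\pi}_{n+1}}_n[V_{n+1}] \ge \Gamma_n[V_{n+1}].$$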
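The chain of inequalities can be checked numerically on a toy problem. The sketch below is an illustration under stated assumptions, not the paper's setting: the state space is finite, each admissible set is replaced by a small finite list of row-stochastic matrices, $\Gamma_n$ is taken to be the pointwise minimum of $c_n + Q[\cdot]$ over that list, $V_n$ is assumed to satisfy the backward recursion $V_n = \Gamma_n[V_{n+1}]$ with $V_N = c_N$, and $[N-1]_0$ is read as $\{0, \dots, N-1\}$. All names (`apply_kernel`, `gamma_Q`, `gamma`, `policy`) are hypothetical.

```python
import random

def apply_kernel(Q, J):
    """Return Q[J](s) = sum_t Q[s][t] * J[t]; monotone because Q has nonnegative entries."""
    return [sum(q * j for q, j in zip(row, J)) for row in Q]

def random_kernel(rng, size):
    """Draw a row-stochastic matrix (each row is a probability vector)."""
    rows = []
    for _ in range(size):
        weights = [rng.random() for _ in range(size)]
        total = sum(weights)
        rows.append([w / total for w in weights])
    return rows

rng = random.Random(0)
S, N, K = 3, 4, 5  # states, horizon, kernels per stage (all hypothetical)

# cost[n][s] plays the role of c_n; kernels[n] is a finite stand-in for the
# admissible set at stage n (index 0 is unused).
cost = [[rng.random() for _ in range(S)] for _ in range(N + 1)]
kernels = [[random_kernel(rng, S) for _ in range(K)] for _ in range(N + 1)]

def gamma_Q(n, Q, V):
    """Gamma_n^Q[V] = c_n + Q[V] for one fixed kernel Q."""
    return [c + x for c, x in zip(cost[n], apply_kernel(Q, V))]

def gamma(n, V):
    """Gamma_n[V]: pointwise minimum of Gamma_n^Q[V] over the stage-(n+1) kernels."""
    candidates = [gamma_Q(n, Q, V) for Q in kernels[n + 1]]
    return [min(column) for column in zip(*candidates)]

# Assumed backward recursion: V_N = c_N and V_n = Gamma_n[V_{n+1}].
V = [None] * (N + 1)
V[N] = cost[N][:]
for n in range(N - 1, -1, -1):
    V[n] = gamma(n, V[n + 1])

# A fixed policy picks one kernel index per stage; J_n = c_n + Q_{n+1}[J_{n+1}].
policy = [rng.randrange(K) for _ in range(N + 1)]
J = [None] * (N + 1)
J[N] = cost[N][:]
for n in range(N - 1, -1, -1):
    J[n] = gamma_Q(n, kernels[n + 1][policy[n + 1]], J[n + 1])

# J_n >= Gamma_n[V_{n+1}] = V_n at every stage and state (small float tolerance).
dominates = all(
    J[n][s] >= V[n][s] - 1e-12 for n in range(N + 1) for s in range(S)
)
print(dominates)
```

The check relies only on two properties used in the argument: applying a kernel preserves the ordering of functions, and the operator for a fixed admissible kernel is bounded below by the minimum over all admissible kernels.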


representative citing papers

Scalable Bi-causal Optimal Transport via KL Relaxation and Policy Gradients

math.OC · 2026-05-17 · unverdicted · novelty 7.0

A KL-relaxed formulation of bi-causal optimal transport is solved via policy gradients with proven convergence to the original problem and nonasymptotic regret guarantees for the resulting algorithm.
