Please refer to Appendix C.2.3 for the proof

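Consider Algorithm 1. Then we have
\[
\sum_{t=1}^{T} f^{(t)}\big(c^{(t)}\big) - \min_{c \in \mathcal{C}_{\mathrm{TV}}} \sum_{t=1}^{T} f^{(t)}(c) \le 2H\sqrt{2|\mathcal{S}||\mathcal{A}|T},
\]
where
\[
f^{(t)}(c) = \sum_{h=1}^{H} \sum_{(s,a) \in \mathcal{S} \times \mathcal{A}} c_h(s,a)\Big(\hat{d}^{\pi^E}_h(s,a) - d^{\pi^{(t)}}_h(s,a)\Big).
\]

2012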
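The regret bound above can be checked numerically with a small sketch. The code below is not Algorithm 1 itself, which is not reproduced here. It assumes that the cost update is projected online gradient descent with a fixed step size, and that the cost class is the box of costs with entries in [-1, 1]. The function name `ogd_regret`, the random stand-in occupancy measures, and the step size are all hypothetical choices made for illustration.

```python
import numpy as np

def ogd_regret(H=3, S=4, A=2, T=200, seed=0):
    """Run projected online gradient descent on the linear losses
    f_t(c) = <c, d_expert - d_policy_t> and return (regret, bound).

    Assumptions (not taken from the source): the cost class is the box
    [-1, 1]^(H x S*A), and d_policy_t is drawn at random rather than
    produced by an actual policy update.
    """
    rng = np.random.default_rng(seed)
    # Stand-in for the estimated expert occupancy: one distribution per step h.
    d_expert = rng.dirichlet(np.ones(S * A), size=H)   # shape (H, S*A)

    D = 2.0 * np.sqrt(H * S * A)   # l2 diameter of the box [-1, 1]^(H*S*A)
    G = np.sqrt(2.0 * H)           # upper bound on the l2 norm of each gradient
    eta = D / (G * np.sqrt(T))     # fixed step size giving regret <= D * G * sqrt(T)

    c = np.zeros((H, S * A))
    total_loss = 0.0
    grad_sum = np.zeros_like(c)
    for _ in range(T):
        # Stand-in for the occupancy measure of the policy at iteration t.
        d_policy = rng.dirichlet(np.ones(S * A), size=H)
        grad = d_expert - d_policy            # gradient of the linear loss f_t
        total_loss += np.sum(c * grad)        # f_t evaluated at the current cost
        grad_sum += grad
        c = np.clip(c - eta * grad, -1.0, 1.0)  # gradient step, then projection

    # The best fixed cost in the box minimizes <c, grad_sum>, giving -||grad_sum||_1.
    best_fixed = -np.abs(grad_sum).sum()
    regret = total_loss - best_fixed
    bound = 2.0 * H * np.sqrt(2.0 * S * A * T)
    return regret, bound

regret, bound = ogd_regret()
print(f"regret = {regret:.3f}, bound = {bound:.3f}")
```

Under these assumptions the standard online gradient descent analysis gives regret at most D·G·√T, and with D = 2√(H|S||A|) and G = √(2H) this product equals 2H√(2|S||A|T), which is the right-hand side of the stated bound.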

1 Pith paper cites this work. Polarity classification is still indexing.


representative citing papers

Understanding Adversarial Imitation Learning in Small Sample Regime: A Stage-coupled Analysis

cs.LG · 2022-08-03 · unverdicted · novelty 7.0

TV-AIL achieves a horizon-independent imitation gap of O(min{1, sqrt(|S|/N)}) via stage-coupled dynamic programming analysis on locomotion-abstracted MDPs.
