Pith Number

pith:4O4D7MDD

pith:2025:4O4D7MDDU2DW6TUPCXOFBT3LFS
not attested · not anchored · not stored · refs resolved

Unifying Entropy Regularization in Optimal Control: From and Back to Classical Objectives via Iterated Soft Policies and Path Integral Solutions

Ajinkya Bhole, Guillaume Crevecoeur, Mohammad Mahmoudi Filabadi, Tom Lefebvre

A KL-regularized umbrella problem unifies optimal control formulations and recovers the classical objectives through iteration of soft-policy solutions.

arxiv:2512.06109 v3 · 2025-12-05 · math.OC · cs.LG · cs.RO · cs.SY · eess.SY

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{4O4D7MDDU2DW6TUPCXOFBT3LFS}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

iterating their solutions recovers the original objectives. We further identify a synchronized case of soft-policy RSOC where the policy and transition KL weights coincide, yielding a linear Bellman operator, path-integral solution, and compositionality -- extending these computationally favourable properties to a broad class of control problems.

C2 · weakest assumption

The soft-policy formulations majorize the original SOC and RSOC; thus, iterating their solutions recovers the original objectives.

C3 · one line summary

A KL-regularized optimal control umbrella recovers classical SOC and RSOC via iterated soft policies and yields linear Bellman operators with path-integral solutions when policy and transition weights coincide.

Receipt and verification
First computed 2026-05-18T02:44:32.219957Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

e3b83fb063a6876f4e8f15dc50cf6b2ca8023063510742fc32ab4da1bd610539

Aliases

arxiv: 2512.06109 · arxiv_version: 2512.06109v3 · doi: 10.48550/arxiv.2512.06109 · pith_short_12: 4O4D7MDDU2DW · pith_short_16: 4O4D7MDDU2DW6TUP · pith_short_8: 4O4D7MDD
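The identifiers printed on this page are consistent with one another in a way that can be checked offline. The sketch below is an observation about this record only, not a documented rule of the Pith schema: it assumes the 26-character Pith Number is the leading part of the RFC 4648 Base32 encoding of the canonical hash, and that each `pith_short_N` alias is the first N characters of the Pith Number. The helper names `base32_prefix` and `short_aliases` are hypothetical.

```python
import base64

# Values copied from this page.
CANONICAL_HASH = "e3b83fb063a6876f4e8f15dc50cf6b2ca8023063510742fc32ab4da1bd610539"
PITH_NUMBER = "4O4D7MDDU2DW6TUPCXOFBT3LFS"


def base32_prefix(hex_digest: str, length: int) -> str:
    """First `length` characters of the Base32 encoding of a hex digest.

    Assumption: standard RFC 4648 alphabet (A-Z, 2-7), padding stripped.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded.rstrip("=")[:length]


def short_aliases(pith_number: str, lengths=(8, 12, 16)) -> dict:
    """Build pith_short_N aliases as plain prefixes of the Pith Number."""
    return {f"pith_short_{n}": pith_number[:n] for n in lengths}


if __name__ == "__main__":
    derived = base32_prefix(CANONICAL_HASH, len(PITH_NUMBER))
    print("derived from hash:", derived)
    print("matches page     :", derived == PITH_NUMBER)
    for name, value in short_aliases(PITH_NUMBER).items():
        print(f"{name}: {value}")
```

If the prefix relation holds for other records as well, a short alias can be validated against a full Pith Number without any network access; that generalization is untested here.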
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/4O4D7MDDU2DW6TUPCXOFBT3LFS \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: e3b83fb063a6876f4e8f15dc50cf6b2ca8023063510742fc32ab4da1bd610539
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "a677cf1c1ee105540b917caf06675cb48473ee71cace79816dea7d49ef178457",
    "cross_cats_sorted": [
      "cs.LG",
      "cs.RO",
      "cs.SY",
      "eess.SY"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "math.OC",
    "submitted_at": "2025-12-05T19:31:39Z",
    "title_canon_sha256": "4172af09ab2c327b0112aa1a5324524a384e5fcd4c8348feebc3a8f8eafe8255"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2512.06109",
    "kind": "arxiv",
    "version": 3
  }
}
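The hashing step of the shell pipeline above can also be written as a small standalone script, which may be easier to audit than a one-liner. This is a minimal sketch that mirrors the serialization options used in that command (sorted keys, compact separators, `ensure_ascii=False`, UTF-8 encoding). The function names are hypothetical, and the `RECORD` literal is transcribed by hand from the JSON shown on this page; whether it reproduces the published hash depends on it matching the served `canonical_record` exactly, which is not asserted here.

```python
import hashlib
import json


def canonical_bytes(record: dict) -> bytes:
    """Serialize a record the way the page's verification command does:
    sorted keys, no whitespace between tokens, non-ASCII kept as-is, UTF-8."""
    text = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return text.encode("utf-8")


def canonical_sha256(record: dict) -> str:
    """Hex SHA-256 digest of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


# Transcribed from the "Canonical record JSON" section of this page.
RECORD = {
    "metadata": {
        "abstract_canon_sha256": "a677cf1c1ee105540b917caf06675cb48473ee71cace79816dea7d49ef178457",
        "cross_cats_sorted": ["cs.LG", "cs.RO", "cs.SY", "eess.SY"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "math.OC",
        "submitted_at": "2025-12-05T19:31:39Z",
        "title_canon_sha256": "4172af09ab2c327b0112aa1a5324524a384e5fcd4c8348feebc3a8f8eafe8255",
    },
    "schema_version": "1.0",
    "source": {"id": "2512.06109", "kind": "arxiv", "version": 3},
}

if __name__ == "__main__":
    # Compare this value with the "Canonical hash" printed on the page.
    print(canonical_sha256(RECORD))
```

Because keys are sorted before hashing, the digest does not depend on the order in which a mirror happens to store the fields, which is what makes the comparison against the published hash meaningful.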