Pith Number

pith:DTOJ6LA4

pith:2026:DTOJ6LA4GNU76LWDO7JI5CNJKP

not attested · not anchored · not stored · refs resolved

Beyond Safety Filtering: Control Barrier Function-Informed Reinforcement Learning for Connected and Automated Vehicles

Bassam Alrifaee, Jianye Xu

Converting Control Barrier Function constraints into rewards guides multi-agent reinforcement learning to higher performance with reduced hyperparameter sensitivity in connected vehicle intersections.

arxiv:2605.16894 v1 · 2026-05-16 · cs.RO · cs.SY · eess.SY


Add to your LaTeX paper

\usepackage{pith}
\pithnumber{DTOJ6LA4GNU76LWDO7JI5CNJKP}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp

2 Internet Archive

3 Author claim open

4 Citations open

5 Replications open

✓ Portable graph bundle live

The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

Results show that our method achieves the highest task performance and is less sensitive to reward hyperparameters, yielding consistently strong performance across the tested hyperparameter range.

C2 · weakest assumption

That converting CBF constraint values under joint MARL actions into a reward signal will reliably guide safe learning without introducing new instabilities or performance trade-offs in the multi-agent intersection setting.

C3 · one-line summary

CBF-informed rewards for multi-agent RL achieve higher task performance and lower sensitivity to hyperparameters than heuristic baselines in a simulated four-way intersection with connected automated vehicles.
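
The idea in C2, turning a Control Barrier Function (CBF) constraint value into a reward term, can be illustrated with a minimal sketch. This record does not reproduce the paper's formulation, so everything below is an assumption made for illustration: the distance-based barrier h, the discrete-time CBF condition h(x') - (1 - alpha) * h(x) >= 0, the parameters alpha and weight, and all function names are hypothetical.

```python
def barrier(p_i, p_j, d_min):
    """Assumed barrier: h = squared distance minus squared minimum distance.

    h >= 0 means the two vehicles are at least d_min apart.
    """
    dx, dy = p_i[0] - p_j[0], p_i[1] - p_j[1]
    return dx * dx + dy * dy - d_min * d_min


def cbf_constraint_value(h_now, h_next, alpha):
    """Discrete-time CBF condition value; non-negative means satisfied."""
    return h_next - (1.0 - alpha) * h_now


def cbf_reward(h_now, h_next, alpha=0.5, weight=1.0):
    """Reward term that is zero when the condition holds, negative otherwise."""
    return weight * min(0.0, cbf_constraint_value(h_now, h_next, alpha))


# Two vehicles approaching each other; d_min = 2.
h_now = barrier((0.0, 0.0), (5.0, 0.0), d_min=2.0)   # 25 - 4 = 21
h_fast = barrier((1.0, 0.0), (4.0, 0.0), d_min=2.0)  # 9 - 4 = 5
h_slow = barrier((0.5, 0.0), (4.5, 0.0), d_min=2.0)  # 16 - 4 = 12

r_fast = cbf_reward(h_now, h_fast)  # 5 - 10.5 = -5.5, penalised
r_slow = cbf_reward(h_now, h_slow)  # 12 - 10.5 = 1.5, so reward is 0.0
print(r_fast, r_slow)
```

In this sketch the fast approach is penalised even though both vehicles are still at a safe distance, because the barrier value shrinks faster than the assumed decay rate allows. That is the sense in which a CBF-derived reward can act earlier than a collision penalty.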

References

29 extracted · 29 resolved · 0 Pith anchors

[1] Deep reinforcement learning for autonomous driving: A survey, 2022

[2] Reward (mis)design for autonomous driving, 2023

[3] Model-free deep reinforcement learning for urban autonomous driving, 2019

[4] Interpretable end-to-end urban autonomous driving with latent deep reinforcement learning, 2022

[5] Formulation of deep reinforcement learning architecture toward autonomous driving for on-ramp merge, 2017

Formal links

2 machine-checked theorem links

Receipt and verification

First computed	2026-05-20T00:03:28.767960Z
Builder	pith-number-builder-2026-05-17-v1
Signature	Pith Ed25519 (`pith-v1-2026-05`) · public key
Schema	pith-number/v1.0

Canonical hash

1cdc9f2c1c3369ff2ec377d28e89a953e4d25a72954390a2d65591bbdb4e19f9

Aliases

arxiv: 2605.16894 · arxiv_version: 2605.16894v1 · doi: 10.48550/arxiv.2605.16894 · pith_short_12: DTOJ6LA4GNU7 · pith_short_16: DTOJ6LA4GNU76LWD · pith_short_8: DTOJ6LA4
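
On this record, the 26-character identifier matches the first 26 characters of the RFC 4648 base32 encoding of the canonical hash, and each pith_short_N alias is the first N characters of that identifier. The sketch below checks that relationship for this record only. It is an observation from the values shown here, not a documented derivation rule, and the function name pith_id_from_hash is hypothetical.

```python
import base64

# Canonical hash as printed on this record.
CANONICAL_HASH = "1cdc9f2c1c3369ff2ec377d28e89a953e4d25a72954390a2d65591bbdb4e19f9"


def pith_id_from_hash(hex_digest, length=26):
    """Base32-encode the raw digest bytes and keep the leading characters.

    Assumption: truncation to 26 characters, inferred from this record.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


full_id = pith_id_from_hash(CANONICAL_HASH)
aliases = {f"pith_short_{n}": full_id[:n] for n in (8, 12, 16)}

print(full_id)
for name, value in aliases.items():
    print(name, value)
```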

Agent API

Resolver JSON · Graph JSON · Events JSON · Schema · Signing key

Verify this Pith Number yourself

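# Fetch the record, extract the canonical record, re-serialize it with sorted keys
# and compact separators, then print its SHA-256 digest.
curl -sH 'Accept: application/ld+json' https://pith.science/pith/DTOJ6LA4GNU76LWDO7JI5CNJKP \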
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 1cdc9f2c1c3369ff2ec377d28e89a953e4d25a72954390a2d65591bbdb4e19f9

Canonical record JSON

{
  "metadata": {
    "abstract_canon_sha256": "5f85ffaa2ee6e319efaa1018677c86535ef4427802f65f93bc42a18383be1b65",
    "cross_cats_sorted": [
      "cs.SY",
      "eess.SY"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.RO",
    "submitted_at": "2026-05-16T09:12:59Z",
    "title_canon_sha256": "2a9e2e8c6745166ad0f680ab9e26e381104876c61ab53032a5784727f3070290"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.16894",
    "kind": "arxiv",
    "version": 1
  }
}
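
The serialization step of the shell one-liner under "Verify this Pith Number yourself" can also be run offline against the record shown above. The sketch below uses the same json.dumps arguments as that one-liner (sorted keys, compact separators, ensure_ascii=False) followed by SHA-256. It assumes the JSON displayed here is exactly what the resolver returns as canonical_record; the function name canonical_hash is hypothetical. Compare the printed digest with the canonical hash listed on this record.

```python
import hashlib
import json


def canonical_hash(record):
    """Serialize with sorted keys and compact separators, then SHA-256 the UTF-8 bytes."""
    payload = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Canonical record copied from this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "5f85ffaa2ee6e319efaa1018677c86535ef4427802f65f93bc42a18383be1b65",
        "cross_cats_sorted": ["cs.SY", "eess.SY"],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.RO",
        "submitted_at": "2026-05-16T09:12:59Z",
        "title_canon_sha256": "2a9e2e8c6745166ad0f680ab9e26e381104876c61ab53032a5784727f3070290",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.16894", "kind": "arxiv", "version": 1},
}

digest = canonical_hash(record)
print(digest)
```

Because keys are sorted before hashing, the order in which a mirror stores or transmits the fields does not change the digest. List order still matters, which is presumably why the categories field is stored pre-sorted as cross_cats_sorted.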