Integrity report for Coherent Off-Policy Improvement of Large Behavior Models with Learned Rewards

A machine-verified record of the checks Pith has run against this paper: detector runs, findings, signed bundle events, and canonical identifiers.

arXiv:2606.02194 · pith:2026:5XBO2DTG53YTFMHPAKQLN4CIUR

Critical: 0

Advisory: 0

Detectors run: 4

Last checked: 2026-06-03

Detector runs

claim_evidence completed v1.0.0 · findings 0 · 2026-06-03 03:48:08.362395+00:00

citation_quote_validity skipped v0.1.0 · findings 0 · 2026-06-02 11:50:52.854818+00:00

cited_work_retraction completed v1.0.0 · findings 0 · 2026-06-02 09:26:14.948510+00:00

ai_meta_artifact skipped v1.0.0 · findings 0 · 2026-06-02 03:35:16.117393+00:00
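
Each detector-run line above follows the same visible pattern: detector name, status, version, findings count, and a UTC timestamp. The sketch below parses those lines into structured records as they appear on this page. It is not Pith's own tooling, and it does not assume anything about the schema of integrity.json. The names DetectorRun, RUN_LINES, and parse_run are hypothetical and introduced only for illustration.

```python
import re
from datetime import datetime
from typing import NamedTuple


class DetectorRun(NamedTuple):
    """One detector-run line from the report (hypothetical record type)."""
    name: str
    status: str
    version: str
    findings: int
    checked_at: datetime


# The four lines exactly as they appear in the "Detector runs" section.
RUN_LINES = [
    "claim_evidence completed v1.0.0 · findings 0 · 2026-06-03 03:48:08.362395+00:00",
    "citation_quote_validity skipped v0.1.0 · findings 0 · 2026-06-02 11:50:52.854818+00:00",
    "cited_work_retraction completed v1.0.0 · findings 0 · 2026-06-02 09:26:14.948510+00:00",
    "ai_meta_artifact skipped v1.0.0 · findings 0 · 2026-06-02 03:35:16.117393+00:00",
]

# Assumed layout, inferred from the lines above:
#   <name> <status> v<version> · findings <count> · <timestamp>
_LINE = re.compile(
    r"^(?P<name>\w+) (?P<status>\w+) v(?P<version>[\d.]+)"
    r" · findings (?P<findings>\d+) · (?P<ts>.+)$"
)


def parse_run(line: str) -> DetectorRun:
    """Parse a single detector-run line, raising ValueError if it does not match."""
    m = _LINE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognized detector-run line: {line!r}")
    return DetectorRun(
        name=m["name"],
        status=m["status"],
        version=m["version"],
        findings=int(m["findings"]),
        checked_at=datetime.fromisoformat(m["ts"]),
    )


runs = [parse_run(line) for line in RUN_LINES]
skipped = [r.name for r in runs if r.status == "skipped"]
total_findings = sum(r.findings for r in runs)
latest = max(r.checked_at for r in runs)
```

Parsed this way, the summary figures can be cross-checked against the run list: four runs are recorded, two of them with status "skipped", and the most recent timestamp falls on the "Last checked" date.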

Findings

No public integrity findings for this paper.

Signed record

The machine-readable record for this paper lives at /pith/5XBO2DTG/integrity.json. Where a Pith Number exists, its bundle also includes signed pith.integrity.v1 events.