Pith Number

pith:R7TQ5SYZ

pith:2026:R7TQ5SYZNVYMD77VVGFDTLDRQR
not attested · not anchored · not stored · refs resolved

Quantifying Potential Observation Missingness in Inverse Reinforcement Learning

Abhishek Sharma, Alihan Huyuk, Finale Doshi-Velez, Leo Benac

Missing observations in IRL can be quantified by finding the minimal perturbations that make expert actions appear optimal.

arxiv:2605.12831 v1 · 2026-05-12 · cs.LG

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{R7TQ5SYZNVYMD77VVGFDTLDRQR}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

We identify the minimal perturbations to the recorded observations needed for the expert's actions to appear optimal. We develop a practical algorithm for this problem and demonstrate its utility for quantifying the possible extent of missing observations in behavioral datasets through extensive experiments on synthetic navigation tasks, a cancer treatment simulator, and ICU treatment data.

C2 · weakest assumption

That the minimal perturbations identified correspond to plausible observations that were available to the original decision-maker but went unrecorded, and that standard IRL optimality assumptions hold once those observations are restored.

C3 · one line summary

A practical algorithm quantifies potential missing observations in IRL by computing minimal perturbations to recorded data that render expert actions optimal.
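
The idea in the claims above can be illustrated with a toy example. This page does not describe the paper's actual algorithm, so the sketch below is only an assumption-laden stand-in: observations are binary feature vectors, each action is scored by a hypothetical linear function `q_weights[a] · obs`, and the "minimal perturbation" is the smallest set of bit flips (found by brute force) after which the expert's action becomes the greedy choice. The names `greedy_action`, `minimal_flip_set`, and `q_weights` are all invented for illustration.

```python
from itertools import combinations


def greedy_action(obs, q_weights):
    """Return the index of the action with the highest linear score.

    Ties go to the lowest action index (behaviour of max with a key).
    """
    scores = [sum(w * x for w, x in zip(weights, obs)) for weights in q_weights]
    return max(range(len(scores)), key=scores.__getitem__)


def minimal_flip_set(obs, expert_action, q_weights):
    """Find a smallest set of bit flips that makes expert_action greedy.

    Hypothetical toy setting, not the paper's method: tries flip sets of
    size 0, 1, 2, ... and returns (flipped_indices, perturbed_obs) for the
    first one that works, or None if no perturbation does.
    """
    n = len(obs)
    for k in range(n + 1):
        for idx in combinations(range(n), k):
            perturbed = list(obs)
            for i in idx:
                perturbed[i] = 1 - perturbed[i]  # flip one binary feature
            if greedy_action(perturbed, q_weights) == expert_action:
                return idx, perturbed
    return None


# Action 0 is scored by feature 0; action 1 by twice feature 1.
q_weights = [[1, 0, 0], [0, 2, 0]]
recorded_obs = [1, 0, 0]

result = minimal_flip_set(recorded_obs, expert_action=1, q_weights=q_weights)
print(result)
```

In this toy reading, the size of the returned flip set is the quantity of interest: zero flips means the recorded observation already explains the expert's action, while a larger set suggests that more information may have been missing from the record.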

Receipt and verification
First computed: 2026-05-18T03:09:12.089069Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05) · public key
Schema: pith-number/v1.0

Canonical hash

8fe70ecb196d70c1fff5a98a39ac71845e4fe48b19403bb1bdbf2a30e6d7aea4

Aliases

arxiv: 2605.12831 · arxiv_version: 2605.12831v1 · doi: 10.48550/arxiv.2605.12831 · pith_short_12: R7TQ5SYZNVYM · pith_short_16: R7TQ5SYZNVYMD77V · pith_short_8: R7TQ5SYZ
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/R7TQ5SYZNVYMD77VVGFDTLDRQR \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 8fe70ecb196d70c1fff5a98a39ac71845e4fe48b19403bb1bdbf2a30e6d7aea4
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "86184f0aed966e2c388423ef4badbd55bb65c46c86ee55a250cfc87277cb81de",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2026-05-12T23:56:36Z",
    "title_canon_sha256": "9c6a72a2b9ee1080f36adecfac989fb8c517bb054331f4bd3914f7d92fa7a6d2"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.12831",
    "kind": "arxiv",
    "version": 1
  }
}
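
The same check can be run offline against the record shown above. The following is a minimal sketch that mirrors the serialization used in the shell one-liner (sorted keys, compact separators, UTF-8, no ASCII escaping); the function name `canonical_sha256` is an assumption made for this example, and the sketch presumes that the JSON printed on this page is exactly the `canonical_record` the service returns.

```python
import hashlib
import json


def canonical_sha256(record):
    """Hash a record using the same canonical form as the shell one-liner."""
    payload = json.dumps(
        record,
        sort_keys=True,          # key order must not affect the hash
        separators=(",", ":"),   # compact form, no whitespace
        ensure_ascii=False,      # keep non-ASCII characters as UTF-8
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Canonical record copied from this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "86184f0aed966e2c388423ef4badbd55bb65c46c86ee55a250cfc87277cb81de",
        "cross_cats_sorted": [],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.LG",
        "submitted_at": "2026-05-12T23:56:36Z",
        "title_canon_sha256": "9c6a72a2b9ee1080f36adecfac989fb8c517bb054331f4bd3914f7d92fa7a6d2",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.12831", "kind": "arxiv", "version": 1},
}

# Hash published on this page under "Canonical hash".
EXPECTED = "8fe70ecb196d70c1fff5a98a39ac71845e4fe48b19403bb1bdbf2a30e6d7aea4"

digest = canonical_sha256(record)
print(digest)
print("match" if digest == EXPECTED else "mismatch")
```

Because the keys are sorted before hashing, the order in which the record's fields are written does not change the digest; a mismatch would point to a difference in content or in the serialization assumptions listed above.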