Pith Number

pith:MFMFMCOO

pith:2026:MFMFMCOOW6LMWSYDKBNN2HV4BW
not attested · not anchored · not stored · refs pending

Flow Matching for Offline Reinforcement Learning with Discrete Actions

Fairoz Nower Khan, Haibo Yang, Nabuat Zaman Nahim, Peizhong Ju, Ruiquan Huang

Flow matching with continuous-time Markov chains recovers the optimal policy for offline RL with discrete actions under idealized conditions.

arxiv:2602.06138 v2 · 2026-02-05 · cs.LG

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{MFMFMCOOW6LMWSYDKBNN2HV4BW}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

We theoretically show that, under idealized conditions, optimizing this objective recovers the optimal policy.

C2 · weakest assumption

The idealized conditions required for the theoretical recovery of the optimal policy via the Q-weighted flow matching objective on continuous-time Markov chains.

C3 · one line summary

Flow matching is adapted to discrete actions via continuous-time Markov chains and Q-weighted objectives, recovering optimal policies under idealized conditions and outperforming baselines in multi-agent and multi-objective offline RL experiments.

Cited by

2 papers in Pith

Receipt and verification
First computed 2026-05-18T03:09:23.801072Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

61585609ceb796cb4b03505add1ebc0db5caf821c4e12b7a9affbd03d2823f50

Aliases

arxiv: 2602.06138 · arxiv_version: 2602.06138v2 · doi: 10.48550/arxiv.2602.06138 · pith_short_12: MFMFMCOOW6LM · pith_short_16: MFMFMCOOW6LMWSYD · pith_short_8: MFMFMCOO
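The short aliases above are plain prefixes of the full Pith Number. For this record, the Pith Number also matches the first 26 characters of the Base32 (RFC 4648) encoding of the canonical SHA-256 hash. The page does not document that derivation, so the sketch below is an observation about the values shown here, not a stated rule. The function names `pith_number_from_hash` and `short_aliases` are hypothetical.

```python
import base64

# Values copied from this record.
CANONICAL_HASH = "61585609ceb796cb4b03505add1ebc0db5caf821c4e12b7a9affbd03d2823f50"
PITH_NUMBER = "MFMFMCOOW6LMWSYDKBNN2HV4BW"


def pith_number_from_hash(hex_digest: str, length: int = 26) -> str:
    """Assumed derivation: Base32-encode the digest bytes, keep a prefix."""
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


def short_aliases(pith_number: str, lengths=(8, 12, 16)) -> dict:
    """Assumed rule: each short alias is a prefix of the full number."""
    return {f"pith_short_{n}": pith_number[:n] for n in lengths}


if __name__ == "__main__":
    # Check the assumed derivation against the published identifier.
    print(pith_number_from_hash(CANONICAL_HASH) == PITH_NUMBER)
    print(short_aliases(PITH_NUMBER))
```

If the derivation holds for other records as well, a client could check an identifier against its canonical hash without a network call. That generalization is untested here.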
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/MFMFMCOOW6LMWSYDKBNN2HV4BW \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 61585609ceb796cb4b03505add1ebc0db5caf821c4e12b7a9affbd03d2823f50
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "7873ca2cb6c902caf3f985d1bb5ba63fdb961700bedac3b36377cb032ad74207",
    "cross_cats_sorted": [],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2026-02-05T19:13:44Z",
    "title_canon_sha256": "edd1c32da07207bd2adb1ae628d3824dd48dd46a905ebe0c5d9504ee225cd208"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2602.06138",
    "kind": "arxiv",
    "version": 2
  }
}
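The verification one-liner above can be unpacked into a small Python function. This is a sketch of the same canonicalization steps (sorted keys, compact separators, UTF-8 bytes, SHA-256) applied to a record already held in memory. The function name `canonical_sha256` is hypothetical, and the record literal is copied from the JSON shown above. Whether the digest equals the published canonical hash depends on the record being exactly what the service canonicalizes, so compare the output yourself; no result is asserted here.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Hash a record using the canonical form from the curl/jq/python command."""
    payload = json.dumps(
        record,
        sort_keys=True,          # key order must not affect the digest
        separators=(",", ":"),   # no whitespace between tokens
        ensure_ascii=False,      # keep non-ASCII characters as UTF-8
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Canonical record as displayed on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "7873ca2cb6c902caf3f985d1bb5ba63fdb961700bedac3b36377cb032ad74207",
        "cross_cats_sorted": [],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.LG",
        "submitted_at": "2026-02-05T19:13:44Z",
        "title_canon_sha256": "edd1c32da07207bd2adb1ae628d3824dd48dd46a905ebe0c5d9504ee225cd208",
    },
    "schema_version": "1.0",
    "source": {"id": "2602.06138", "kind": "arxiv", "version": 2},
}

digest = canonical_sha256(record)
print(digest)  # compare with the value listed under "Canonical hash"
```

Sorting keys and fixing the separators is what makes the digest independent of how the JSON was originally laid out, which is why a mirror can recompute it from any faithful copy of the record.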