Pith Number

pith:JDURUAVD

pith:2026:JDURUAVDSXWPBLCSDSGB7KZR2R

not attested · not anchored · not stored · refs resolved

KVPO: ODE-Native GRPO for Autoregressive Video Alignment via KV Semantic Exploration

Jun Zhou, Kaixi Cong, Ruicheng Zhang, Shuiyang Mao, Wei Liu, Xiu Li, Zhizhou Zhong, Zunnan Xu

KVPO aligns streaming autoregressive video generators by routing semantic variations through the KV cache and modeling policies via velocity energy in ODE space.

arxiv:2605.14278 v1 · 2026-05-14 · cs.CV

Add to your LaTeX paper

\usepackage{pith}
\pithnumber{JDURUAVDSXWPBLCSDSGB7KZR2R}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp

2 Internet Archive

3 Author claim open

4 Citations open

5 Replications open

✓ Portable graph bundle live

The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

KVPO introduces a causal-semantic exploration paradigm that relocates the source of variation from stochastic noise to the historical KV cache, constructing semantically diverse generation branches that remain strictly on the data manifold, and a velocity-field surrogate policy based on Trajectory Velocity Energy that yields a reward-weighted contrastive objective fully consistent with the native ODE formulation.

C2 · weakest assumption

That stochastically routing historical KV entries produces semantically diverse generation branches that remain strictly on the data manifold without introducing artifacts or deviating from the model's learned distribution.

C3 · one-line summary

KVPO aligns streaming autoregressive video generators with human preferences via ODE-native GRPO, using KV cache for semantic exploration and TVE for velocity-based policy modeling, yielding gains in quality and alignment.

References

33 extracted · 33 resolved · 12 Pith anchors

[1] 2025 · arXiv:2511.16955

[2] 2026 · arXiv:2603.17461

[3] 2026 · arXiv:2603.21299

[4] LoRA: Low-Rank Adaptation of Large Language Models 2021 · arXiv:2106.09685

[5] Self Forcing: Bridging the Train-Test Gap in Autoregressive Video Diffusion 2025 · arXiv:2506.08009

Cited by

1 paper in Pith

Embedding-perturbed Exploration Preference Optimization for Flow Models

Receipt and verification

First computed	2026-05-17T23:39:10.316265Z
Builder	pith-number-builder-2026-05-17-v1
Signature	Pith Ed25519 (`pith-v1-2026-05`) · public key
Schema	pith-number/v1.0

Canonical hash

48e91a02a395ecf0ac521c8c1fab31d451214957c417e859b466b8474f3f7860

Aliases

arxiv: 2605.14278 · arxiv_version: 2605.14278v1 · doi: 10.48550/arxiv.2605.14278 · pith_short_12: JDURUAVDSXWP · pith_short_16: JDURUAVDSXWPBLCS · pith_short_8: JDURUAVD
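
The identifiers on this page appear to be derived from the canonical hash by a simple encoding. Base32-encoding the raw bytes of the SHA-256 digest (RFC 4648 alphabet) and keeping the first 26 characters reproduces the Pith Number shown here, and the short aliases are its 8-, 12-, and 16-character prefixes. The sketch below illustrates that relationship. It is inferred from the values on this page rather than from a published specification, and the function names `pith_number_from_hash` and `short_aliases` are hypothetical.

```python
import base64


def pith_number_from_hash(hex_digest: str, length: int = 26) -> str:
    """Base32-encode the raw digest bytes and keep the first `length` characters.

    Assumption: this rule is inferred from the identifiers on this page.
    """
    raw = bytes.fromhex(hex_digest)
    return base64.b32encode(raw).decode("ascii")[:length]


def short_aliases(pith_number: str) -> dict:
    """Return the short aliases as plain prefixes of the full identifier."""
    return {f"pith_short_{n}": pith_number[:n] for n in (8, 12, 16)}


canonical_hash = "48e91a02a395ecf0ac521c8c1fab31d451214957c417e859b466b8474f3f7860"
number = pith_number_from_hash(canonical_hash)
print(number)  # JDURUAVDSXWPBLCSDSGB7KZR2R
print(short_aliases(number))
```

If the rule holds in general, any alias can be checked against the canonical hash without contacting the resolver.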

Verify this Pith Number yourself

curl -sH 'Accept: application/ld+json' https://pith.science/pith/JDURUAVDSXWPBLCSDSGB7KZR2R \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 48e91a02a395ecf0ac521c8c1fab31d451214957c417e859b466b8474f3f7860
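
The canonicalization step in the command above can also be run offline as a small Python function. This is a minimal sketch that mirrors the one-liner: keys sorted, compact separators, non-ASCII characters left unescaped, UTF-8 bytes hashed with SHA-256. The function name `canonical_sha256` is hypothetical, and the sample dictionaries are toy inputs rather than the full record on this page.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Hash a record after serializing it in a deterministic form."""
    canonical = json.dumps(
        record,
        sort_keys=True,          # key order no longer depends on the input
        separators=(",", ":"),   # no optional whitespace
        ensure_ascii=False,      # keep non-ASCII text as UTF-8, not \u escapes
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two dictionaries with the same content but different key order.
a = {"source": {"kind": "arxiv", "id": "2605.14278", "version": 1}, "schema_version": "1.0"}
b = {"schema_version": "1.0", "source": {"version": 1, "id": "2605.14278", "kind": "arxiv"}}

# Both serialize to the same bytes, so the digests agree.
print(canonical_sha256(a) == canonical_sha256(b))  # True
```

Because the serialization is deterministic, a mirror that holds the same canonical record should obtain the same digest regardless of how its JSON library orders keys.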

Canonical record JSON

{
  "metadata": {
    "abstract_canon_sha256": "593bffff0e2365eaac80c1b925b27ac35f20ae5568ddceb7175a0ec86eda6eb5",
    "cross_cats_sorted": [],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CV",
    "submitted_at": "2026-05-14T02:24:46Z",
    "title_canon_sha256": "b2a8e643e00a2a42fde144366c302edf531898314362e5fc7e91e727fb8a3b93"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.14278",
    "kind": "arxiv",
    "version": 1
  }
}