Pith Number

pith:EL2CHLBO

pith:2026:EL2CHLBORUPA4U3HRKJPRTM6RQ
not attested · not anchored · not stored · refs resolved

Latent Action Control for Reasoning-Guided Unified Image Generation

Fuxiang Zhai, Jianyu Lai, Lei Zhu, Shuaibo Li, Sixiang Chen, Tengjun Huang, Yingjin Li

Latent Action Control turns inferred reasoning into hidden continuous actions that guide image generation inside unified models.

arxiv:2605.16961 v1 · 2026-05-16 · cs.CV · cs.AI

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{EL2CHLBORUPA4U3HRKJPRTM6RQ}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1. Bitcoin timestamp
2. Internet Archive
3. Author claim: open
4. Citations: open
5. Replications: open
Portable graph bundle: live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
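The page does not specify the merge algorithm, so the following is only a minimal sketch of how a merge can be made order-independent. The event schema (`at`, `field`, `value`), the function names, and the last-writer-wins rule are all hypothetical and chosen for illustration; they are not taken from the Pith bundle format.

```python
import hashlib
import json
from typing import Any


def event_key(event: dict[str, Any]) -> tuple[str, str]:
    """Total order over events: timestamp first, content hash as tie-breaker.

    Hypothetical schema: each event has an ISO-8601 UTC timestamp in "at".
    """
    digest = hashlib.sha256(
        json.dumps(event, sort_keys=True, separators=(",", ":")).encode()
    ).hexdigest()
    return (event["at"], digest)


def merge(record: dict[str, Any], events: list[dict[str, Any]]) -> dict[str, Any]:
    """Fold events into a state in one fixed order, ignoring duplicates."""
    unique = {event_key(e): e for e in events}  # identical events collapse
    state: dict[str, Any] = {"record": record, "fields": {}}
    for key in sorted(unique):
        e = unique[key]
        # Assumed rule: the last writer in sorted order wins.
        state["fields"][e["field"]] = e["value"]
    return state


# Placeholder data, not real Pith events.
record = {"source": {"id": "2605.16961", "kind": "arxiv", "version": 1}}
events = [
    {"at": "2026-05-20T00:00:00Z", "field": "example_field", "value": "first"},
    {"at": "2026-05-21T00:00:00Z", "field": "example_field", "value": "second"},
]

# Delivery order and duplicates do not change the result.
print(merge(record, events) == merge(record, events[::-1] + events))
```

Because every mirror sorts the same set of events by the same key before folding, two mirrors that received the events in different orders, or received some of them twice, arrive at the same state.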

Claims

C1 · strongest claim

LAC consistently improves compositional and knowledge-grounded generation across GenEval, WISE, and T2I-CompBench, with the largest gains on spatial relations, attribute binding, and world-knowledge-sensitive prompts.

C2 · weakest assumption

The learned latent action trajectories are actually consumed by the generator and causally affect the output image. Ablations and latent interventions suggest this, but the provided description contains no explicit causal verification.

C3 · one-line summary

Latent Action Control learns unobserved action trajectories via variational alignment and GRPO to inject reasoning into flow-based image generation, yielding gains on compositional benchmarks.

References

49 extracted · 49 resolved · 20 Pith anchors

[1] Improving image generation with better captions. Computer Science 2023
[2] Training Diffusion Models with Reinforcement Learning 2023 · arXiv:2305.13301
[3] Flux. https://github.com/black-forest-labs/flux 2024
[4] Show, don’t tell: Morphing latent reasoning into image generation 2026
[5] PixArt-Σ: Weak-to-strong training of diffusion transformer for 4K text-to-image generation 2024

Formal links

2 machine-checked theorem links

Receipt and verification
First computed: 2026-05-20T00:03:33.074717Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05) · public key
Schema: pith-number/v1.0

Canonical hash

22f423ac2e8d1e0e53678a92f8cd9e8c0cafb3591de234aa7ccf156135e047d4

Aliases

arxiv: 2605.16961 · arxiv_version: 2605.16961v1 · doi: 10.48550/arxiv.2605.16961 · pith_short_12: EL2CHLBORUPA · pith_short_16: EL2CHLBORUPA4U3H · pith_short_8: EL2CHLBO
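The identifiers on this page look consistent with a simple derivation: the 26-character Pith Number matches the unpadded RFC 4648 base32 encoding of the first 16 bytes of the canonical hash, and the `pith_short_*` aliases are prefixes of it. This is an inference from the values shown here, not a documented rule of the schema; the function names below are hypothetical.

```python
import base64

# Canonical hash as listed on this page.
CANONICAL_HASH = "22f423ac2e8d1e0e53678a92f8cd9e8c0cafb3591de234aa7ccf156135e047d4"


def pith_number_from_hash(hex_digest: str) -> str:
    """Base32-encode the first 16 bytes of the digest and drop '=' padding.

    Assumption: inferred from the values on this page, not from a spec.
    """
    first_16 = bytes.fromhex(hex_digest)[:16]
    return base64.b32encode(first_16).decode("ascii").rstrip("=")


def short_aliases(pith_number: str) -> dict[str, str]:
    """Short aliases as plain prefixes of the full identifier."""
    return {f"pith_short_{n}": pith_number[:n] for n in (8, 12, 16)}


full = pith_number_from_hash(CANONICAL_HASH)
print(full)  # EL2CHLBORUPA4U3HRKJPRTM6RQ
print(short_aliases(full))
```

If the inference holds for other records, a short alias can be checked against a canonical hash offline, without contacting the site.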
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/EL2CHLBORUPA4U3HRKJPRTM6RQ \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 22f423ac2e8d1e0e53678a92f8cd9e8c0cafb3591de234aa7ccf156135e047d4
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "5ec198b59f23ad9e79124371a83f8a6cf298449501d02e5d8c1fa43817a71d7f",
    "cross_cats_sorted": [
      "cs.AI"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CV",
    "submitted_at": "2026-05-16T12:23:20Z",
    "title_canon_sha256": "6b82da9e0d5b618b50e4ee53329c479e1cd7f41aea61efcc8b31460a0a3cfd7f"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.16961",
    "kind": "arxiv",
    "version": 1
  }
}
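The verification one-liner above can also be written as a small script, which makes the canonical serialization easier to inspect. This sketch applies the same `json.dumps` settings as the one-liner (sorted keys, compact separators, `ensure_ascii=False`) to the record shown on this page; the function names are hypothetical, and it assumes the page's JSON is the complete canonical record.

```python
import hashlib
import json

# Canonical record JSON as shown on this page.
RECORD_JSON = """
{
  "metadata": {
    "abstract_canon_sha256": "5ec198b59f23ad9e79124371a83f8a6cf298449501d02e5d8c1fa43817a71d7f",
    "cross_cats_sorted": ["cs.AI"],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CV",
    "submitted_at": "2026-05-16T12:23:20Z",
    "title_canon_sha256": "6b82da9e0d5b618b50e4ee53329c479e1cd7f41aea61efcc8b31460a0a3cfd7f"
  },
  "schema_version": "1.0",
  "source": {"id": "2605.16961", "kind": "arxiv", "version": 1}
}
"""


def canonical_bytes(record: dict) -> bytes:
    """Serialize with sorted keys and no whitespace, as in the one-liner."""
    return json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode()


def canonical_hash(record: dict) -> str:
    """SHA-256 hex digest of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


record = json.loads(RECORD_JSON)
print(canonical_bytes(record).decode())
print(canonical_hash(record))
```

Compare the printed digest with the canonical hash listed on this page. Because keys are sorted before hashing, the key order and whitespace of the input JSON do not affect the result.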