pith:3WCZOPNL
RefDecoder: Enhancing Visual Generation with Conditional Video Decoding
RefDecoder adds reference-image conditioning to video VAE decoders through attention, yielding up to 2.1 dB PSNR gains and better consistency on I2V, editing, and style-transfer tasks.
arxiv:2605.15196 v1 · 2026-05-14 · cs.CV · cs.LG
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{3WCZOPNLJESLCIYOOTKNPG4SY5}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
We introduce RefDecoder, a reference-conditioned video VAE decoder that injects a high-fidelity reference image signal directly into the decoding process via reference attention... achieving up to +2.1 dB PSNR over the unconditional baselines on the Inter4K, WebVid, and Large Motion reconstruction benchmarks.
Assumes that equal conditioning of the decoder via reference attention is sufficient to preserve structural integrity without introducing new artifacts or requiring any fine-tuning of the rest of the pipeline.
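The claims above describe injecting a reference image into the decoder "via reference attention". This record does not specify the layer design, so the following is only a minimal sketch of one plausible reading: plain cross-attention in which decoder feature tokens act as queries and reference-image tokens act as keys and values, added back through a residual connection. The function name `reference_attention`, the tensor shapes, and the zero-initialized output projection are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def reference_attention(video_tokens, ref_tokens, w_q, w_k, w_v, w_o):
    """Hypothetical cross-attention from video features to a reference image.

    video_tokens: (T, N, C) decoder features, T frames with N tokens each.
    ref_tokens:   (M, C) features of the single reference image.
    w_q, w_k, w_v: (C, D) projection matrices; w_o: (D, C) output projection.
    """
    q = video_tokens @ w_q                      # (T, N, D) queries from video
    k = ref_tokens @ w_k                        # (M, D) keys from reference
    v = ref_tokens @ w_v                        # (M, D) values from reference
    scores = q @ k.T / np.sqrt(q.shape[-1])     # (T, N, M) scaled dot products
    weights = softmax(scores, axis=-1)          # each video token attends over M
    attended = weights @ v                      # (T, N, D) reference content
    return video_tokens + attended @ w_o        # residual add, (T, N, C)


# Toy shapes chosen only for the demonstration.
rng = np.random.default_rng(0)
T, N, M, C, D = 4, 16, 16, 8, 8
video = rng.standard_normal((T, N, C))
ref = rng.standard_normal((M, C))
w_q, w_k, w_v = (rng.standard_normal((C, D)) * 0.1 for _ in range(3))

# Assumption: a zero-initialized output projection makes the new branch a no-op
# at the start of training, so the pretrained decoder's behavior is unchanged.
w_o_zero = np.zeros((D, C))
out_init = reference_attention(video, ref, w_q, w_k, w_v, w_o_zero)

# With a non-zero projection, reference content is mixed into every frame.
w_o = rng.standard_normal((D, C)) * 0.1
out = reference_attention(video, ref, w_q, w_k, w_v, w_o)
```

Zero-initializing the output projection is a common way to attach a new branch to a pretrained network without disturbing it at the start of training. Whether RefDecoder does this is not stated in the record.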
Receipt and verification
| Field | Value |
|---|---|
| First computed | 2026-05-17T21:40:25.018212Z |
| Last reissued | 2026-05-17T21:57:18.416489Z |
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | unsigned_v0 |
| Schema | pith-number/v1.0 |
Canonical hash
dd85973dab4924b1230e74d4d79b92c766ae3cc8f3c755bb50c38597b8272a8d
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/3WCZOPNLJESLCIYOOTKNPG4SY5 \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: dd85973dab4924b1230e74d4d79b92c766ae3cc8f3c755bb50c38597b8272a8d
Canonical record JSON
{
"metadata": {
"abstract_canon_sha256": "df03884989f4ed1d6106661a2fcf979c3db31a6fc5086a6be40cf1ae8869b776",
"cross_cats_sorted": [
"cs.LG"
],
"license": "http://creativecommons.org/licenses/by-sa/4.0/",
"primary_cat": "cs.CV",
"submitted_at": "2026-05-14T17:59:52Z",
"title_canon_sha256": "877ed8b8f595c87067d2944d4d25e4feb3db485a9683b647abc3b1396daad233"
},
"schema_version": "1.0",
"source": {
"id": "2605.15196",
"kind": "arxiv",
"version": 1
}
}
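The verification command above re-serializes the canonical record with sorted keys and compact separators, then takes the SHA-256 of the UTF-8 bytes. The same steps can be sketched offline in Python using only the standard library. The helper name `canonical_sha256` is hypothetical, and the `record` literal simply copies the JSON shown above.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Hash a record the way the one-liner does: sorted keys, no whitespace."""
    canonical = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return hashlib.sha256(canonical.encode()).hexdigest()


# Copy of the canonical record displayed on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "df03884989f4ed1d6106661a2fcf979c3db31a6fc5086a6be40cf1ae8869b776",
        "cross_cats_sorted": ["cs.LG"],
        "license": "http://creativecommons.org/licenses/by-sa/4.0/",
        "primary_cat": "cs.CV",
        "submitted_at": "2026-05-14T17:59:52Z",
        "title_canon_sha256": "877ed8b8f595c87067d2944d4d25e4feb3db485a9683b647abc3b1396daad233",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.15196", "kind": "arxiv", "version": 1},
}

digest = canonical_sha256(record)
print(digest)
```

Provided the record is reproduced exactly, the printed digest is expected to equal the canonical hash listed on this page. Because keys are sorted before hashing, the order in which fields appear in the source JSON does not affect the result.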