pith:JP4GN6WR
Adding Conditional Control to Text-to-Image Diffusion Models
ControlNet adds spatial controls like edges, depth, and human poses to pretrained text-to-image diffusion models.
arxiv:2302.05543 v3 · 2023-02-10 · cs.CV · cs.AI · cs.GR · cs.HC · cs.MM
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{JP4GN6WR2H4HK2JEYMBDM2W7JL}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
We present ControlNet, a neural network architecture to add spatial conditioning controls to large, pretrained text-to-image diffusion models. ControlNet locks the production-ready large diffusion models, and reuses their deep and robust encoding layers pretrained with billions of images as a strong backbone to learn a diverse set of conditional controls.
The zero convolutions progressively grow parameters from zero and ensure that no harmful noise could affect the finetuning, allowing the pretrained backbone to remain intact while learning new controls.
ControlNet adds spatial conditioning controls to pretrained text-to-image diffusion models via zero convolutions for stable fine-tuning on small or large datasets.
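The zero-convolution claim above can be illustrated with a small sketch. It assumes the usual reading of the paper, in which a locked block F and a trainable copy are joined by two 1x1 convolutions whose weights and biases start at zero, so that y = F(x) + Z2(F_copy(x + Z1(c))). The helper names (`conv1x1`, `locked_block`, `trainable_copy`, `controlnet_block`) and the toy "pretrained" block are hypothetical stand-ins in pure Python, not the authors' implementation.

```python
def conv1x1(x, weight, bias):
    """1x1 convolution over x, a list of per-pixel channel vectors."""
    return [
        [sum(w * v for w, v in zip(row, pixel)) + b for row, b in zip(weight, bias)]
        for pixel in x
    ]


def zero_conv_params(n_out, n_in):
    """Weights and bias of a zero convolution: everything starts at 0."""
    return [[0.0] * n_in for _ in range(n_out)], [0.0] * n_out


def locked_block(x):
    # Hypothetical stand-in for a frozen pretrained block.
    return [[2.0 * v + 1.0 for v in pixel] for pixel in x]


def trainable_copy(x):
    # Hypothetical stand-in: the copy starts with the same weights as the locked block.
    return locked_block(x)


def controlnet_block(x, c, z1, z2):
    """y = F(x) + Z2(F_copy(x + Z1(c))), with Z1 and Z2 as 1x1 convolutions."""
    cond = conv1x1(c, *z1)
    mixed = [[a + b for a, b in zip(px, pc)] for px, pc in zip(x, cond)]
    residual = conv1x1(trainable_copy(mixed), *z2)
    base = locked_block(x)
    return [[a + b for a, b in zip(pb, pr)] for pb, pr in zip(base, residual)]


x = [[0.5, -1.0], [2.0, 3.0]]   # two pixels, two channels each
c = [[1.0, 1.0], [0.0, 4.0]]    # conditioning map of the same shape

z1 = zero_conv_params(2, 2)
z2 = zero_conv_params(2, 2)

# At initialization the control branch contributes exactly zero.
y_init = controlnet_block(x, c, z1, z2)

# After training has moved the second zero convolution away from zero,
# the control branch starts to change the output.
z2_trained = ([[0.1, 0.0], [0.0, 0.1]], [0.0, 0.0])
y_trained = controlnet_block(x, c, z1, z2_trained)
```

Here `y_init` equals `locked_block(x)`, which is the sense in which the pretrained backbone is untouched at the start of fine-tuning, while `y_trained` differs once the zero-convolution weights have grown.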
Receipt and verification
| First computed | 2026-05-17T23:38:46.350422Z |
|---|---|
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
4bf866fad1d1f8756924c302366adf4ad0805b0a40be1eb3b7ca51069681aa14
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/JP4GN6WR2H4HK2JEYMBDM2W7JL \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 4bf866fad1d1f8756924c302366adf4ad0805b0a40be1eb3b7ca51069681aa14
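The same check can be written as a standalone Python function. This is a minimal sketch that mirrors the one-liner above: it serializes the record with sorted keys and compact separators, then takes the SHA-256 of the UTF-8 bytes. The function name `canonical_sha256` is an assumption made for illustration, and the record literal is copied from the canonical record JSON shown on this page.

```python
import hashlib
import json


def canonical_sha256(record):
    """Hash a record the way the shell one-liner does (sorted keys, compact JSON)."""
    payload = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode()  # UTF-8 by default
    return hashlib.sha256(payload).hexdigest()


record = {
    "metadata": {
        "abstract_canon_sha256": "354cfba2220db73fe22bf3e76b365043fbef9ec456ed18ece6e21b9722bac2d5",
        "cross_cats_sorted": ["cs.AI", "cs.GR", "cs.HC", "cs.MM"],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.CV",
        "submitted_at": "2023-02-10T23:12:37Z",
        "title_canon_sha256": "c802ad03716e6f3260d24a90dcef2626bcb68f2f1371aaab7f3fa8d320eb8edb",
    },
    "schema_version": "1.0",
    "source": {"id": "2302.05543", "kind": "arxiv", "version": 3},
}

# Digest the page says to expect for this record.
EXPECTED = "4bf866fad1d1f8756924c302366adf4ad0805b0a40be1eb3b7ca51069681aa14"

digest = canonical_sha256(record)
print("match" if digest == EXPECTED else "mismatch", digest)
```

Because keys are sorted before hashing, the digest does not depend on the order in which the JSON object's keys arrive, only on their names and values.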
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "354cfba2220db73fe22bf3e76b365043fbef9ec456ed18ece6e21b9722bac2d5",
    "cross_cats_sorted": [
      "cs.AI",
      "cs.GR",
      "cs.HC",
      "cs.MM"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CV",
    "submitted_at": "2023-02-10T23:12:37Z",
    "title_canon_sha256": "c802ad03716e6f3260d24a90dcef2626bcb68f2f1371aaab7f3fa8d320eb8edb"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2302.05543",
    "kind": "arxiv",
    "version": 3
  }
}