pith:AT7SNXJ2
Prompt-to-Prompt Image Editing with Cross Attention Control
Cross-attention layers let users edit images by changing only the text prompt.
arxiv:2208.01626 v1 · 2022-08-02 · cs.CV · cs.CL · cs.GR · cs.LG
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{AT7SNXJ2YUBDO47YGBKZXAWQHS}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
the cross-attention layers are the key to controlling the relation between the spatial layout of the image to each word in the prompt. With this observation, we present several applications which monitor the image synthesis by editing the textual prompt only.
The claim rests on the premise that the cross-attention mechanism is the dominant, controllable factor in mapping words to image regions in the underlying generative model, and that targeted edits to these maps at inference time neither introduce artifacts nor require retraining the model.
Cross-attention control in text-conditioned models enables localized and global image edits by editing only the input text prompt.
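To make the claims above concrete, here is a minimal, dependency-free sketch of the idea that attention maps carry layout while token values carry content. It is a toy with two "pixels" and two prompt tokens, not the paper's implementation: the function names (`attention_maps`, `attend`), the hand-picked vectors, and the decision to reuse the source maps in full are all illustrative assumptions.

```python
import math

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention_maps(queries, keys):
    """One row per pixel: softmax(q . k / sqrt(d)) over the prompt tokens."""
    d = len(keys[0])
    return [
        softmax([sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys])
        for q in queries
    ]

def attend(maps, values):
    """For each pixel, the attention-weighted sum of the token value vectors."""
    dim = len(values[0])
    return [
        [sum(w * v[j] for w, v in zip(row, values)) for j in range(dim)]
        for row in maps
    ]

# Hypothetical toy data: pixel queries, plus keys/values for two prompts.
queries = [[1.0, 0.0], [0.0, 1.0]]
src_keys = [[2.0, 0.0], [0.0, 2.0]]    # source prompt: pixel 0 attends to token 0
tgt_keys = [[0.0, 2.0], [2.0, 0.0]]    # edited prompt alone would move that layout
tgt_values = [[1.0, 0.0], [0.0, 1.0]]  # content contributed by the edited prompt

source_maps = attention_maps(queries, src_keys)
target_maps = attention_maps(queries, tgt_keys)

# Without control: layout and content both come from the edited prompt.
plain = attend(target_maps, tgt_values)
# With control: reuse the source layout, take content from the edited prompt.
injected = attend(source_maps, tgt_values)

print("plain   :", plain[0])
print("injected:", injected[0])
```

In the injected case, pixel 0 keeps weighting token 0 most heavily, as it did under the source prompt, even though the values come from the edited prompt. When and for how many denoising steps to inject the maps is a separate design choice that this sketch leaves out.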
Receipt and verification
| First computed | 2026-07-05T04:45:35.578343Z |
|---|---|
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
04ff26dd3ac5023773f830559b82d03cbb9b046433b0bcf6f9402b0a74893087
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/AT7SNXJ2YUBDO47YGBKZXAWQHS \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 04ff26dd3ac5023773f830559b82d03cbb9b046433b0bcf6f9402b0a74893087
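The Python step of the one-liner above can be unpacked into a small script. The sketch below applies the same canonicalization (sorted keys, compact separators, unescaped Unicode, UTF-8) and SHA-256 to a record that is already loaded as a dictionary. The helper names (`canonical_bytes`, `canonical_hash`, `matches`) and the toy record are illustrative assumptions and not part of any Pith tooling.

```python
import hashlib
import json

def canonical_bytes(record):
    """Serialize with sorted keys, compact separators, and unescaped Unicode, as UTF-8."""
    text = json.dumps(record, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")

def canonical_hash(record):
    """Hex SHA-256 digest of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()

def matches(record, expected_hex):
    """Compare the computed digest with a published one, ignoring case and whitespace."""
    return canonical_hash(record) == expected_hex.strip().lower()

# Toy records (hypothetical): same content, different key order.
a = {"source": {"kind": "arxiv", "version": 1}, "schema_version": "1.0"}
b = {"schema_version": "1.0", "source": {"version": 1, "kind": "arxiv"}}

print(canonical_bytes(a).decode("utf-8"))
print(canonical_hash(a) == canonical_hash(b))
```

Because keys are sorted before hashing, the two toy records serialize to identical bytes and therefore share a digest. To check a real record, pass the parsed `canonical_record` object and the published canonical hash to `matches`.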
Canonical record JSON
{
"metadata": {
"abstract_canon_sha256": "21a051649f91096729f99e9a3accb638b7af913634eb3aa79930b28f7a40f2a6",
"cross_cats_sorted": [
"cs.CL",
"cs.GR",
"cs.LG"
],
"license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
"primary_cat": "cs.CV",
"submitted_at": "2022-08-02T17:55:41Z",
"title_canon_sha256": "6f045d47f77f9d62c080c9273555e654308291b07760a0742e4f3abcf0504773"
},
"schema_version": "1.0",
"source": {
"id": "2208.01626",
"kind": "arxiv",
"version": 1
}
}