pith:EXLCQCNE
Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models
Visual ChatGPT lets users chat with images by linking ChatGPT to visual foundation models through prompts.
arxiv:2303.04671 v1 · 2023-03-08 · cs.CV
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{EXLCQCNEKLTKZELHJSQHJDQ5YC}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
We build a system called Visual ChatGPT, incorporating different Visual Foundation Models, to enable the user to interact with ChatGPT by 1) sending and receiving not only languages but also images 2) providing complex visual questions or visual editing instructions that require the collaboration of multiple AI models with multi-steps.
Prompt-based injection of visual model capabilities into ChatGPT enables reliable multi-step collaboration, without frequent errors in task decomposition or model selection.
Visual ChatGPT integrates visual foundation models with ChatGPT via prompts to enable multi-step image understanding, generation, and editing in conversational interactions.
Receipt and verification
| Field | Value |
|---|---|
| First computed | 2026-05-18T04:00:28.484742Z |
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
25d62809a452e6ac91674ca0748e1dc094b385ad72923e3828febcfba8cf321f
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/EXLCQCNEKLTKZELHJSQHJDQ5YC \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 25d62809a452e6ac91674ca0748e1dc094b385ad72923e3828febcfba8cf321f
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "eee4f9578aa0ad14064cdf31d8f7c34541abf7367a11f22bab2cab92b014d4f3",
    "cross_cats_sorted": [],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CV",
    "submitted_at": "2023-03-08T15:50:02Z",
    "title_canon_sha256": "990bdc7e3e647b036962fbb157f950c841665d609e987e7d95d9fedb63867a44"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2303.04671",
    "kind": "arxiv",
    "version": 1
  }
}
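The verification one-liner above serializes the canonical record as compact, key-sorted JSON and hashes the UTF-8 bytes with SHA-256. The following is a minimal offline sketch of that same step in Python, applied to the record shown above. The function name `canonical_hash` is hypothetical and not part of any Pith tooling; the serialization settings are copied from the one-liner.

```python
import hashlib
import json


def canonical_hash(record: dict) -> str:
    """Hash a record the same way the shell one-liner does.

    `canonical_hash` is an illustrative name, not a Pith API.
    """
    # Compact, key-sorted JSON: no spaces after separators, non-ASCII kept as-is.
    canonical = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# The canonical record as listed on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "eee4f9578aa0ad14064cdf31d8f7c34541abf7367a11f22bab2cab92b014d4f3",
        "cross_cats_sorted": [],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.CV",
        "submitted_at": "2023-03-08T15:50:02Z",
        "title_canon_sha256": "990bdc7e3e647b036962fbb157f950c841665d609e987e7d95d9fedb63867a44",
    },
    "schema_version": "1.0",
    "source": {"id": "2303.04671", "kind": "arxiv", "version": 1},
}

# Compare this digest against the "Canonical hash" value listed above.
print(canonical_hash(record))
```

Because keys are sorted before hashing, the digest does not depend on the order in which the fields appear in the JSON, only on their names and values.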