pith:XYRW52VE
Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language
Pretrained models can be composed zero-shot through multimodal prompting to exchange information and gain new multimodal capabilities without finetuning.
arxiv:2204.00598 v2 · 2022-04-01 · cs.CV · cs.AI · cs.CL · cs.LG
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{XYRW52VEY4L236AVZCPU5VWAUR}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
Multiple pretrained models may be composed zero-shot, i.e., via multimodal-informed prompting, to exchange information with each other and capture new multimodal capabilities, without requiring finetuning.
This rests on the assumption that distinct capabilities stored in separately trained foundation models can be reliably accessed and combined through prompting alone, without finetuning or task-specific adaptation that would break the zero-shot property.
Socratic Models compose zero-shot multimodal reasoning by prompting pretrained language and vision models to exchange information and enable new capabilities without finetuning.
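The composition described in these claims can be illustrated with a minimal sketch. Everything named here (`toy_vlm`, `toy_lm`, `build_prompt`, `socratic_answer`, and the canned image labels) is a hypothetical stand-in invented for illustration, not the paper's code or any real model API. The only point is the data flow: one model's output is rendered as text, placed in a prompt, and handed to another model, with no weights updated.

```python
from typing import Callable, List

# Hypothetical stand-ins: a real system would call a pretrained
# vision-language model and a pretrained language model here.
def toy_vlm(image_id: str) -> List[str]:
    """Return text labels for an image (canned data, not a real model)."""
    canned = {"kitchen.jpg": ["person", "stove", "frying pan"]}
    return canned.get(image_id, [])

def toy_lm(prompt: str) -> str:
    """Return a completion for a prompt (trivial rule, not a real model)."""
    return "cooking" if "stove" in prompt else "unknown"

def build_prompt(labels: List[str], question: str) -> str:
    """Render visual detections as language so the LM can consume them."""
    seen = ", ".join(labels) if labels else "nothing recognizable"
    return f"I am looking at an image. I see: {seen}.\nQ: {question}\nA:"

def socratic_answer(
    image_id: str,
    question: str,
    vlm: Callable[[str], List[str]] = toy_vlm,
    lm: Callable[[str], str] = toy_lm,
) -> str:
    """Chain the two models through a text prompt; no finetuning step."""
    labels = vlm(image_id)                   # vision model -> text labels
    prompt = build_prompt(labels, question)  # labels -> language prompt
    return lm(prompt)                        # language model -> answer

if __name__ == "__main__":
    print(socratic_answer("kitchen.jpg", "What activity is likely happening?"))
```

Because the models only meet at the prompt, either stand-in could be swapped for a different pretrained model without touching the other, which is the property the claims call zero-shot composition.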
Receipt and verification
| Field | Value |
|---|---|
| First computed | 2026-05-17T23:38:48.286458Z |
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
be236eeaa4c717adf815c89f4ed6c0a4718b32bca204501fa8988d1d841daea7
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/XYRW52VEY4L236AVZCPU5VWAUR \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: be236eeaa4c717adf815c89f4ed6c0a4718b32bca204501fa8988d1d841daea7
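The hashing step of the command above can also be written as a small standalone Python function, which may be easier to reuse or test offline. This is a sketch that mirrors the one-liner's serialization settings (sorted keys, compact separators, `ensure_ascii=False`, UTF-8 encoding); the function names `canonical_sha256` and `verify` are illustrative and not part of any Pith tooling.

```python
import hashlib
import json

def canonical_sha256(record: dict) -> str:
    """Hash a record the same way the shell one-liner does.

    Keys are sorted and whitespace is removed so that two records with
    the same content always serialize to the same bytes.
    """
    canonical = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def verify(record: dict, expected_hex: str) -> bool:
    """Compare the computed digest with a published one (case-insensitive)."""
    return canonical_sha256(record) == expected_hex.strip().lower()

if __name__ == "__main__":
    # Toy record for illustration only; substitute the canonical record
    # JSON from the page and the published hash to perform a real check.
    toy = {"b": 1, "a": {"y": 2, "x": 3}}
    print(canonical_sha256(toy))
```

Sorting keys before hashing is what makes the digest independent of the order in which the JSON fields happen to be stored or transmitted.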
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "577a8a3b32c6a40980cc97108847a1f07fcfe408cc83e3df4277ff16066621c8",
    "cross_cats_sorted": [
      "cs.AI",
      "cs.CL",
      "cs.LG"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CV",
    "submitted_at": "2022-04-01T17:43:13Z",
    "title_canon_sha256": "6c6ab2fbb224c4add86a606e514200ffee80459270e2a6a79154866871bd8d58"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2204.00598",
    "kind": "arxiv",
    "version": 2
  }
}