pith:NUBOPVYY
MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning
MiniGPT-v2 uses unique task identifiers to let one large language model handle many vision-language tasks at once.
arxiv:2310.09478 v3 · 2023-10-14 · cs.CV
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{NUBOPVYYSPN4AUFUSGYPOFH2MT}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
After the three-stage training, the experimental results show that MiniGPT-v2 achieves strong performance on many visual question-answering and visual grounding benchmarks compared to other vision-language generalist models.
The paper assumes that assigning a unique identifier to each task lets the model distinguish instructions and learn each task more efficiently, without task interference or negative transfer. This assumption is stated in the abstract but is not quantified or ablated in the provided text.
MiniGPT-v2 adds unique task identifiers to a large language model so one system can perform image description, visual question answering, and visual grounding after three-stage training.
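The task-identifier idea in the claims above can be sketched in a few lines. This is a minimal illustration only: the identifier strings, the `<image>` placeholder, the prompt layout, and the `build_prompt` helper are all hypothetical and are not taken from this record.

```python
# Hypothetical identifier tokens, one per task (assumed for illustration).
TASK_IDENTIFIERS = {
    "caption": "[caption]",
    "vqa": "[vqa]",
    "grounding": "[grounding]",
}


def build_prompt(task: str, instruction: str) -> str:
    """Prefix an instruction with the identifier of its task.

    The "<image>" placeholder stands in for wherever image features would
    be spliced into the language model input (an assumption, not a spec).
    """
    if task not in TASK_IDENTIFIERS:
        raise ValueError(f"unknown task: {task!r}")
    return f"<image> {TASK_IDENTIFIERS[task]} {instruction}"


if __name__ == "__main__":
    print(build_prompt("vqa", "What color is the car?"))
    print(build_prompt("grounding", "Describe the image and locate each object."))
```

Because the identifier comes before the instruction, two tasks with similarly worded instructions still produce different inputs, which is the disambiguation the claims describe.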
Receipt and verification
| First computed | 2026-05-17T23:38:48.724066Z |
|---|---|
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
6d02e7d71893dbc050b491b0f714fa64ca0a86b45572b70df0b505457b86a0b5
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/NUBOPVYYSPN4AUFUSGYPOFH2MT \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 6d02e7d71893dbc050b491b0f714fa64ca0a86b45572b70df0b505457b86a0b5
Canonical record JSON
{
"metadata": {
"abstract_canon_sha256": "82576f7fa689351de5097ab015f4ba825f9c631338f4e910f1f3f7c8aa7cdc53",
"cross_cats_sorted": [],
"license": "http://creativecommons.org/licenses/by/4.0/",
"primary_cat": "cs.CV",
"submitted_at": "2023-10-14T03:22:07Z",
"title_canon_sha256": "1c8704a30ff6ec013f87f8385db67046decac01ecfb8bc5d6c6e8d8df56936a2"
},
"schema_version": "1.0",
"source": {
"id": "2310.09478",
"kind": "arxiv",
"version": 3
}
}
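
The same check can be run offline against the canonical record JSON shown above. The sketch below mirrors the canonicalization used in the shell one-liner (sorted keys, compact separators, UTF-8 without ASCII escaping). The function names `canonical_sha256` and `verify` are illustrative and are not part of any Pith tooling.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Hash a record using the same canonical form as the one-liner above."""
    canonical = json.dumps(
        record,
        sort_keys=True,           # key order must not affect the hash
        separators=(",", ":"),    # no whitespace between tokens
        ensure_ascii=False,       # keep non-ASCII characters as UTF-8
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def verify(record: dict, expected_hex: str) -> bool:
    """Return True if the record hashes to the expected hex digest."""
    return canonical_sha256(record) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Record copied from the "Canonical record JSON" section.
    record = {
        "metadata": {
            "abstract_canon_sha256": "82576f7fa689351de5097ab015f4ba825f9c631338f4e910f1f3f7c8aa7cdc53",
            "cross_cats_sorted": [],
            "license": "http://creativecommons.org/licenses/by/4.0/",
            "primary_cat": "cs.CV",
            "submitted_at": "2023-10-14T03:22:07Z",
            "title_canon_sha256": "1c8704a30ff6ec013f87f8385db67046decac01ecfb8bc5d6c6e8d8df56936a2",
        },
        "schema_version": "1.0",
        "source": {"id": "2310.09478", "kind": "arxiv", "version": 3},
    }
    expected = "6d02e7d71893dbc050b491b0f714fa64ca0a86b45572b70df0b505457b86a0b5"
    print("computed:", canonical_sha256(record))
    print("matches expected:", verify(record, expected))
```

Sorting keys before hashing means the digest depends only on the record's content, not on the order in which a server or client happens to serialize its fields.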