Pith Number

pith:IUXYGNYW

pith:2023:IUXYGNYWHVZMCDFSGC3FMPRMM7
Status: not attested · not anchored · not stored · refs pending

Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language Models

Fahad Shahbaz Khan, Hanoona Rasheed, Muhammad Maaz, Salman Khan

Video-ChatGPT combines a video-adapted visual encoder with a large language model to support detailed conversations about video content.

arxiv:2306.05424 v2 · 2023-06-08 · cs.CV

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{IUXYGNYWHVZMCDFSGC3FMPRMM7}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim: open
4 Citations: open
5 Replications: open
Portable graph bundle: live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
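
As a rough illustration of why a deterministic merge lets any mirror recompute the same state, the sketch below folds a set of events in a canonical order so that arrival order does not matter. The event fields (`id`, `ts`, `key`, `value`), the toy values, and the last-writer-wins rule are all hypothetical and are not taken from the Pith bundle format; signature verification is omitted.

```python
from typing import Any, Dict, Iterable

Event = Dict[str, Any]  # hypothetical event shape, not the Pith schema


def merge_events(base: Dict[str, Any], events: Iterable[Event]) -> Dict[str, Any]:
    """Fold events into a copy of `base`, independent of arrival order."""
    # Drop duplicate deliveries; assumes events sharing an id are identical.
    unique = {e["id"]: e for e in events}
    # Canonical order: timestamp first, event id as the tie-breaker.
    ordered = sorted(unique.values(), key=lambda e: (e["ts"], e["id"]))
    state = dict(base)
    for e in ordered:
        state[e["key"]] = e["value"]  # last writer wins (an assumption)
    return state


# Toy events, invented for this example only.
events = [
    {"id": "e1", "ts": "2026-05-17T23:40:00Z", "key": "x", "value": 1},
    {"id": "e2", "ts": "2026-05-17T23:45:00Z", "key": "x", "value": 2},
    {"id": "e3", "ts": "2026-05-17T23:41:00Z", "key": "y", "value": 0},
]

state_a = merge_events({"source": "2306.05424"}, events)
# A second mirror sees the same events reversed, with one delivered twice.
state_b = merge_events({"source": "2306.05424"}, list(reversed(events)) + events[:1])
print(state_a == state_b)
```

Sorting on a total order before folding is what makes the result reproducible; any rule that depends on arrival order would let two mirrors diverge.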

Claims

C1 · strongest claim

The resulting model is capable of understanding and generating detailed conversations about videos.

C2 · weakest assumption

The semi-automated pipeline for creating the 100,000 video-instruction pairs produces training data clean enough that label noise does not degrade the model's ability to generate accurate conversations.

C3 · one-line summary

Video-ChatGPT is a multimodal model that combines a video visual encoder with an LLM to understand and generate conversations about videos, trained on a new dataset of 100,000 video-instruction pairs.

Formal links

3 machine-checked theorem links

Cited by

56 papers in Pith

Receipt and verification
First computed: 2026-05-17T23:38:53.670465Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05)
Schema: pith-number/v1.0

Canonical hash

452f8337163d72c10cb230b6563e2c67d6aee88b03595c40c0d6f3600844b886

Aliases

arxiv: 2306.05424 · arxiv_version: 2306.05424v2 · doi: 10.48550/arxiv.2306.05424 · pith_short_12: IUXYGNYWHVZM · pith_short_16: IUXYGNYWHVZMCDFS · pith_short_8: IUXYGNYW
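
The identifiers on this page appear to be related to the canonical hash: the 26-character Pith Number matches the first 26 characters of the RFC 4648 base32 encoding of the SHA-256 digest, and the `pith_short_*` aliases are its 8-, 12-, and 16-character prefixes. That this is the official derivation is an assumption drawn from the values shown here, not from documentation. A minimal sketch of the check, where `base32_prefix` is a helper name introduced for illustration:

```python
import base64

# Values copied from this page.
CANONICAL_HASH = "452f8337163d72c10cb230b6563e2c67d6aee88b03595c40c0d6f3600844b886"
PITH_NUMBER = "IUXYGNYWHVZMCDFSGC3FMPRMM7"


def base32_prefix(hex_digest: str, length: int = 26) -> str:
    """Base32-encode the raw digest bytes and keep the first `length` characters."""
    raw = bytes.fromhex(hex_digest)
    return base64.b32encode(raw).decode("ascii")[:length]


full = base32_prefix(CANONICAL_HASH)
print(full == PITH_NUMBER)

# The short aliases are plain prefixes of the full identifier.
for n in (8, 12, 16):
    print(f"pith_short_{n}: {full[:n]}")
```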
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/IUXYGNYWHVZMCDFSGC3FMPRMM7 \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 452f8337163d72c10cb230b6563e2c67d6aee88b03595c40c0d6f3600844b886
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "5cb9fcb7415928f94a879a2872bd1094e96ff9a4802d49f2bb84d86f1481e1ed",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CV",
    "submitted_at": "2023-06-08T17:59:56Z",
    "title_canon_sha256": "469c0d15adbd96a58fa5881c8367f13e8fde0ef3d902a0e1c26f3d1b7edc308a"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2306.05424",
    "kind": "arxiv",
    "version": 2
  }
}
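
The verification step can also be run offline against the record printed above. The sketch below mirrors the canonicalization used in the shell one-liner (sorted keys, compact separators, UTF-8 bytes) and compares the digest with the canonical hash shown on this page. The function name `canonical_sha256` is introduced here for illustration, and the comparison result has not been verified in this document, so no expected output is stated.

```python
import hashlib
import json

EXPECTED = "452f8337163d72c10cb230b6563e2c67d6aee88b03595c40c0d6f3600844b886"

# The canonical record as shown on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "5cb9fcb7415928f94a879a2872bd1094e96ff9a4802d49f2bb84d86f1481e1ed",
        "cross_cats_sorted": [],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.CV",
        "submitted_at": "2023-06-08T17:59:56Z",
        "title_canon_sha256": "469c0d15adbd96a58fa5881c8367f13e8fde0ef3d902a0e1c26f3d1b7edc308a",
    },
    "schema_version": "1.0",
    "source": {"id": "2306.05424", "kind": "arxiv", "version": 2},
}


def canonical_sha256(obj: dict) -> str:
    """Hash the JSON form with sorted keys and no insignificant whitespace."""
    canon = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canon.encode("utf-8")).hexdigest()


digest = canonical_sha256(record)
print(digest)
print("matches page hash:", digest == EXPECTED)
```

Because keys are sorted before hashing, the order in which a mirror stores or transmits the fields does not affect the digest.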