pith:Q5BPFQLC
Hi-GaTA: Hierarchical Gated Temporal Aggregation Adapter for Surgical Video Report Generation
A hierarchical adapter compresses long surgical videos into tokens that let language models generate accurate procedure reports.
arxiv:2605.11208 v2 · 2026-05-11 · cs.CV
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{Q5BPFQLCHFXRRDFC5AQWWHLOGQ}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
Experiments show our approach achieves the best overall performance, with consistent gains over strong Multimodal Large Language Model (MLLM) baselines.
Assumes that the 214 simulated surgical videos paired with surgeon-authored reports are sufficiently representative of real clinical procedures, and that the Sur40k pretraining on public videos transfers to the target benchmark without major domain shift.
Hi-GaTA is a gated temporal pyramid adapter that aggregates multi-scale video features via text-conditioned cross-attention and gated fusion to enable LLM-based surgical report generation, backed by a new 214-video benchmark and Sur40k pretrained encoder.
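The last claim names three ingredients: multi-scale temporal features, text-conditioned cross-attention, and gated fusion. The following is a minimal pure-Python sketch of how such pieces could fit together; it is not the paper's implementation. The pooling strides, the single attention head, the sum-normalised sigmoid gates, and every function name here are assumptions made for illustration only.

```python
import math

def avg_pool(frames, stride):
    """Average consecutive groups of `stride` frame vectors (one pyramid level)."""
    pooled = []
    for i in range(0, len(frames), stride):
        chunk = frames[i:i + stride]
        pooled.append([sum(col) / len(chunk) for col in zip(*chunk)])
    return pooled

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attend(query, tokens):
    """Single-head scaled dot-product attention: one text query over video tokens."""
    d = len(query)
    scores = [sum(q * t for q, t in zip(query, tok)) / math.sqrt(d) for tok in tokens]
    weights = softmax(scores)
    return [sum(w * tok[j] for w, tok in zip(weights, tokens)) for j in range(d)]

def gated_fuse(summaries, gate_logits):
    """Blend per-level summaries with sigmoid gates, normalised to sum to 1 (assumed)."""
    gates = [1.0 / (1.0 + math.exp(-g)) for g in gate_logits]
    total = sum(gates)
    d = len(summaries[0])
    return [sum(g / total * s[j] for g, s in zip(gates, summaries)) for j in range(d)]

def aggregate(frames, text_query, strides=(1, 2, 4), gate_logits=(0.0, 0.0, 0.0)):
    """Pool at several temporal scales, attend with the text query, then fuse."""
    levels = [avg_pool(frames, s) for s in strides]
    summaries = [cross_attend(text_query, level) for level in levels]
    return gated_fuse(summaries, gate_logits)

# Toy input: 8 identical 2-d frame features and a 2-d text query.
frames = [[1.0, 2.0] for _ in range(8)]
fused = aggregate(frames, text_query=[0.5, -0.5])
```

In a trained model the gate logits and attention projections would be learned; here they are fixed constants so the data flow stays visible.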
Receipt and verification
| Field | Value |
|---|---|
| First computed | 2026-05-20T00:03:17.398055Z |
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
8742f2c162396f188ca2e8216b1d6e343a25d48a8f6c2113d03725524cbde68d
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/Q5BPFQLCHFXRRDFC5AQWWHLOGQ \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 8742f2c162396f188ca2e8216b1d6e343a25d48a8f6c2113d03725524cbde68d
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "018dd5e2d274d21743ee41e8226d3afc691ac1cd23f2def1a0b3a60204784ae0",
    "cross_cats_sorted": [],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CV",
    "submitted_at": "2026-05-11T20:21:34Z",
    "title_canon_sha256": "7e824eaacba1115d6278db4c95621346a6e9cb78e486bc73bafe31da13b7a9ca"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.11208",
    "kind": "arxiv",
    "version": 2
  }
}
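The verification one-liner above can be unpacked into a short offline script. The sketch below applies the same canonicalisation (sorted keys, compact separators, no ASCII escaping, UTF-8 encoding) to the record as printed on this page. The function name `canonical_sha256` is hypothetical, and whether this printed record is byte-for-byte what the service hashes is an assumption, so compare the result against the canonical hash yourself.

```python
import hashlib
import json

def canonical_sha256(record: dict) -> str:
    """Serialise with sorted keys and compact separators, then hash the UTF-8 bytes."""
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# The canonical record as shown on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "018dd5e2d274d21743ee41e8226d3afc691ac1cd23f2def1a0b3a60204784ae0",
        "cross_cats_sorted": [],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.CV",
        "submitted_at": "2026-05-11T20:21:34Z",
        "title_canon_sha256": "7e824eaacba1115d6278db4c95621346a6e9cb78e486bc73bafe31da13b7a9ca",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.11208", "kind": "arxiv", "version": 2},
}

digest = canonical_sha256(record)
print(digest)  # compare with the canonical hash listed above
```

Because keys are sorted before hashing, the digest does not depend on the order in which the fields appear in the JSON text.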