Pith Number

pith:BSD73E6C

pith:2026:BSD73E6C54RXWSJQ56JMW63TNQ
not attested · not anchored · not stored · refs resolved

Minerva-Ego: Spatiotemporal Hints for Egocentric Video Understanding

Arsha Nagrani, Bo Hu, Cordelia Schmid, David A Ross, Jasper Uijlings, Ramin Mehran, Shyamal Buch, Sudheendra Vijayanarasimhan, Tobias Weyand

Providing models with hints on where and when to look improves performance on complex egocentric video reasoning tasks.

arxiv:2605.15342 v1 · 2026-05-14 · cs.CV · cs.LG

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{BSD73E6C54RXWSJQ56JMW63TNQ}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
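The page does not describe the merge algorithm itself, so the following is only a generic sketch of what "deterministic merge" can mean: every mirror deduplicates the same events, sorts them by a total order, and folds them into a state. The event fields (`id`, `ts`, `field`, `value`) and the last-writer-wins rule are hypothetical and are not taken from the Pith bundle format.

```python
from typing import Any, Dict, Iterable

Event = Dict[str, Any]


def merge_events(events: Iterable[Event]) -> Dict[str, Any]:
    """Fold events into a state that does not depend on arrival order.

    Hypothetical event shape: {"id": str, "ts": int, "field": str, "value": Any}.
    Assumes an id uniquely identifies an event's content, so duplicates
    carrying the same id are identical.
    """
    # Deduplicate: a mirror may receive the same event more than once.
    unique = {event["id"]: event for event in events}
    # Total order: timestamp first, id as tie-breaker.
    ordered = sorted(unique.values(), key=lambda e: (e["ts"], e["id"]))
    state: Dict[str, Any] = {}
    for event in ordered:
        # Last writer wins per field (an assumption of this sketch).
        state[event["field"]] = event["value"]
    return state


if __name__ == "__main__":
    a = {"id": "e1", "ts": 1, "field": "author_claim", "value": "open"}
    b = {"id": "e2", "ts": 2, "field": "author_claim", "value": "claimed"}
    c = {"id": "e3", "ts": 2, "field": "citations", "value": 3}
    print(merge_events([a, b, c]))
    print(merge_events([c, b, a, a]))
```

Both calls print the same state because the result depends only on the set of events, not on the order or multiplicity in which a mirror receives them.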

Claims

C1 · strongest claim

Prompting frontier models with hints of 'where' and 'when' to look yields substantial improvements in performance.

C2 · weakest assumption

The spatiotemporally-dense human-annotated reasoning traces and object masks accurately capture the intermediate steps and visual elements required to solve the multi-step questions.

C3 · one line summary

Minerva-Ego is a new benchmark for egocentric visual reasoning with dense human-annotated traces and masks, showing that spatiotemporal hints substantially improve frontier model performance.

References

66 extracted · 66 resolved · 10 Pith anchors

[1] GPT-4 Technical Report 2023 · arXiv:2303.08774
[2] https://openai.com/index/learning-to-reason-with-llms 2025
[3] System Card: Claude Opus 4 & Claude Sonnet 4 2025
[4] Claude 3.5 Sonnet v2 2023
[5] InfiniBench: A comprehensive benchmark for large multimodal models in very long video understanding 2024 · arXiv:2406.19875
Receipt and verification
First computed 2026-05-20T00:00:53.468886Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

0c87fd93c2ef237b4930ef92cb7b736c3de1530cdd3f91a2d6a5149bd111338f

Aliases

arxiv: 2605.15342 · arxiv_version: 2605.15342v1 · doi: 10.48550/arxiv.2605.15342 · pith_short_12: BSD73E6C54RX · pith_short_16: BSD73E6C54RXWSJQ · pith_short_8: BSD73E6C
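The identifiers on this page appear to be related in a simple way: the 26-character Pith Number matches the unpadded Base32 encoding of the first 16 bytes of the canonical hash, and the short aliases are prefixes of it. This is inferred from the values shown here, not from a published specification, and the function names below are illustrative.

```python
import base64

# Values copied from this record page.
CANONICAL_HASH = "0c87fd93c2ef237b4930ef92cb7b736c3de1530cdd3f91a2d6a5149bd111338f"
PITH_NUMBER = "BSD73E6C54RXWSJQ56JMW63TNQ"


def pith_from_hash(hex_digest: str) -> str:
    """Base32-encode the first 16 bytes of a SHA-256 hex digest, unpadded.

    Assumption: this rule is inferred from the values on this page.
    """
    raw = bytes.fromhex(hex_digest)[:16]
    return base64.b32encode(raw).decode("ascii").rstrip("=")


def short_aliases(pith_number: str) -> dict:
    """Build the 8-, 12- and 16-character prefix aliases."""
    return {f"pith_short_{n}": pith_number[:n] for n in (8, 12, 16)}


if __name__ == "__main__":
    print(pith_from_hash(CANONICAL_HASH) == PITH_NUMBER)
    print(short_aliases(PITH_NUMBER))
```

If the inferred rule holds for other records too, a client could check that a Pith Number and a canonical hash belong together without a network call.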
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/BSD73E6C54RXWSJQ56JMW63TNQ \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 0c87fd93c2ef237b4930ef92cb7b736c3de1530cdd3f91a2d6a5149bd111338f
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "52b816aa18674263f1c32c9aefb0af114f5f3efe9a12b3f76aaa98495ba5e9d6",
    "cross_cats_sorted": [
      "cs.LG"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CV",
    "submitted_at": "2026-05-14T19:12:20Z",
    "title_canon_sha256": "58930be17e6972768f6b19177c4427f1535b66aae4840693dc3cee6c2099e079"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.15342",
    "kind": "arxiv",
    "version": 1
  }
}
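The verification one-liner above can also be run offline against the record shown here. The sketch below repeats the same canonicalization (sorted keys, compact separators, no ASCII escaping, UTF-8) and SHA-256 hashing; the function name `canonical_sha256` is illustrative. If the page's recipe is accurate, the printed digest should equal the canonical hash listed above, though that has not been checked here.

```python
import hashlib
import json

# Canonical record copied from this page.
RECORD = {
    "metadata": {
        "abstract_canon_sha256": "52b816aa18674263f1c32c9aefb0af114f5f3efe9a12b3f76aaa98495ba5e9d6",
        "cross_cats_sorted": ["cs.LG"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.CV",
        "submitted_at": "2026-05-14T19:12:20Z",
        "title_canon_sha256": "58930be17e6972768f6b19177c4427f1535b66aae4840693dc3cee6c2099e079",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.15342", "kind": "arxiv", "version": 1},
}


def canonical_sha256(record: dict) -> str:
    """Serialize with sorted keys and compact separators, then hash."""
    payload = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


if __name__ == "__main__":
    print(canonical_sha256(RECORD))
```

Because keys are sorted before hashing, the digest does not depend on the order in which a client happens to store the fields.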