Pith Number

pith:LLKKIOXE

pith:2026:LLKKIOXEJ6GLDOLV5J5M6EUH26
not attested · not anchored · not stored · refs resolved

MMCL-Bench: Multimodal Context Learning from Visual Rules, Procedures, and Evidence

Fei Yin, Qingyan Bai, Yifan Chen, Yujiu Yang, Zicheng Lin

Current multimodal models solve fewer than one-third of tasks that require learning rules and procedures from visual examples.

arxiv:2605.12703 v1 · 2026-05-12 · cs.CV · cs.AI

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{LLKKIOXEJ6GLDOLV5J5M6EUH26}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
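
The page does not specify the merge algorithm, so the following is only a minimal sketch of one way a merge can be made deterministic: take the union of events by identifier, then apply them in sorted-identifier order so the result does not depend on which mirror delivered which event first. The event fields (`event_id`, `field`, `value`) and the function name are hypothetical, and the sketch assumes an identifier uniquely determines an event's content (for example, a content hash).

```python
def merge_events(*event_logs):
    """Hypothetical deterministic merge, not the Pith algorithm.

    Assumes each event is a dict with 'event_id', 'field', and 'value',
    and that equal event_ids always carry identical content.
    """
    # Union of all events, keyed by id, so duplicates across mirrors collapse.
    unique = {}
    for log in event_logs:
        for event in log:
            unique[event["event_id"]] = event

    # Fold in sorted-id order: the arrival order of logs no longer matters.
    state = {}
    for event_id in sorted(unique):
        event = unique[event_id]
        state[event["field"]] = event["value"]
    return state


if __name__ == "__main__":
    mirror_a = [{"event_id": "01", "field": "citations", "value": "open"}]
    mirror_b = [
        {"event_id": "02", "field": "replications", "value": "open"},
        {"event_id": "01", "field": "citations", "value": "open"},
    ]
    print(merge_events(mirror_a, mirror_b) == merge_events(mirror_b, mirror_a))
```

Sorting by identifier is a design choice made here for simplicity; a real system might order by signed timestamps or causal links instead.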

Claims

C1 · strongest claim

current systems remain far from robust multimodal context learning, with even the strongest model solving fewer than one-third of tasks under strict evaluation

C2 · weakest assumption

The 102 tasks and rubric-based scoring faithfully isolate multimodal context learning without introducing unintended biases in task selection or evaluation criteria.

C3 · one-line summary

MMCL-Bench shows that even the strongest frontier multimodal models solve fewer than one-third of tasks requiring recovery and application of visual rules, procedures, and empirical patterns.

References

25 extracted · 25 resolved · 3 Pith anchors

[1] CL-bench: A benchmark for context learning. 2026
[2] CL-bench Team. CL-bench leaderboard. https://www.clbench.com/, 2026. Accessed April 4, 2026
[3] LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding. 2024 · arXiv:2308.14508
[4] Long-context LLMs struggle with long in-context learning. Computing Research Repository, abs/2404.02060, 2024
[5] MMLongBench: Benchmarking long-context vision-language models effectively and thoroughly. 2025

Receipt and verification
First computed 2026-05-18T03:09:49.656656Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

5ad4a43ae44f8cb1b975ea7acf1287d7ae0e3c8719c679e9811c6cd22ca90621

Aliases

arxiv: 2605.12703 · arxiv_version: 2605.12703v1 · doi: 10.48550/arxiv.2605.12703 · pith_short_12: LLKKIOXEJ6GL · pith_short_16: LLKKIOXEJ6GLDOLV · pith_short_8: LLKKIOXE
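
The `pith_short_*` aliases above are plain prefixes of the full Pith Number. Comparing the values on this page also suggests that the Pith Number itself equals the first 26 characters of the RFC 4648 base32 encoding of the canonical hash. That relationship is inferred from this one record, not stated by the schema, so the sketch below is an observation under that assumption rather than a specification. The function names are hypothetical.

```python
import base64

# Values copied from this record.
CANONICAL_HASH = "5ad4a43ae44f8cb1b975ea7acf1287d7ae0e3c8719c679e9811c6cd22ca90621"
PITH_NUMBER = "LLKKIOXEJ6GLDOLV5J5M6EUH26"


def pith_from_hash(hex_digest: str, length: int = 26) -> str:
    """Assumption: base32-encode the raw digest and keep the first `length` chars."""
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


def short_aliases(pith_number: str) -> dict[str, str]:
    """Build the 8-, 12-, and 16-character prefix aliases listed on the page."""
    return {f"pith_short_{n}": pith_number[:n] for n in (8, 12, 16)}


if __name__ == "__main__":
    print(pith_from_hash(CANONICAL_HASH) == PITH_NUMBER)
    for name, value in short_aliases(PITH_NUMBER).items():
        print(name, value)
```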
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/LLKKIOXEJ6GLDOLV5J5M6EUH26 \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 5ad4a43ae44f8cb1b975ea7acf1287d7ae0e3c8719c679e9811c6cd22ca90621
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "557fa282993801dcca8e4cb76313c41065b849ea3b5e210f1729aa6a9e7ffc60",
    "cross_cats_sorted": [
      "cs.AI"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CV",
    "submitted_at": "2026-05-12T19:57:37Z",
    "title_canon_sha256": "328a1b69fe26d5903f4fb876edd09f204b796c67846000a7a963e21444ecf45e"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.12703",
    "kind": "arxiv",
    "version": 1
  }
}
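
The one-liner under "Verify this Pith Number yourself" can be unpacked into a small script. The sketch below applies the same canonicalization as that command (sorted keys, compact separators, no ASCII escaping, UTF-8 bytes) to the record shown above, copied here as a literal. Whether the digest matches the published canonical hash depends on the served `canonical_record` being byte-for-byte equivalent to this excerpt, so the result should be checked against a live fetch. The function names are illustrative.

```python
import hashlib
import json

# Canonical record copied from this page.
RECORD = {
    "metadata": {
        "abstract_canon_sha256": "557fa282993801dcca8e4cb76313c41065b849ea3b5e210f1729aa6a9e7ffc60",
        "cross_cats_sorted": ["cs.AI"],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.CV",
        "submitted_at": "2026-05-12T19:57:37Z",
        "title_canon_sha256": "328a1b69fe26d5903f4fb876edd09f204b796c67846000a7a963e21444ecf45e",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.12703", "kind": "arxiv", "version": 1},
}


def canonical_json(record: dict) -> bytes:
    """Serialize with the same options as the page's verification command."""
    text = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return text.encode()  # UTF-8 by default


def canonical_sha256(record: dict) -> str:
    """Hex SHA-256 digest of the canonical serialization."""
    return hashlib.sha256(canonical_json(record)).hexdigest()


if __name__ == "__main__":
    # Compare this value with the "Canonical hash" field of the record.
    print(canonical_sha256(RECORD))
```

Because keys are sorted before hashing, the digest does not depend on the key order in which a mirror happens to store the record.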