Pith Number

pith:ML5KRKZ2

pith:2024:ML5KRKZ2U3SKXDRGO4CBBASGI4

not attested · not anchored · not stored · refs resolved

Capabilities of Gemini Models in Medicine

Aishwarya Kamath, Alan Karthikesalingam, Albert Webson, Anil Palepu, Basil Mustafa, Ben Caine, Bradley Green, Cathy Cheung, Charles Lau, Christopher Semturs, Chunjong Park, Claire Cui, Dale Webster, Daniel McDuff, David G.T. Barrett, David Stutz, Demis Hassabis, Ehud Rivlin, Elahe Vedadi, Ellery Wulczyn, Ewa Dominowska, Fan Zhang, Greg Corrado, James Manyika, Jan Freyberg, Jean-Baptiste Alayrac, Jeff Dean, Jeremy Lai, Jesper Anderson, Jian Lu, Joelle Barral, Jonas Kemp, Jonathan Krause, Jonathon Shlens, Juanma Zambrano Chaves, Juraj Gottweis, Katherine Chou, Kavita Kulkarni, Khaled Saab, Kimberly Kanada, Koray Kavukcuoglu, Le Hou, Luheng He, Luyang Liu, Melvin Johnson, Mike Schaekermann, Natasha Latysheva, Neil Houlsby, Nenad Tomasev, Oriol Vinyals, Philip Mansfield, Renee Wong, Ruoxi Sun, Ryutaro Tanno, Shekoofeh Azizi, Siamak Shakeri, SiWai Man, S. M. Ali Eslami, S. Sara Mahdavi, Szu-Yeu Hu, Tao Tu, Tim Strother, Tomer Golany, Vivek Natarajan, Wei-Hung Weng, Yong Cheng, Yossi Matias

Med-Gemini models reach 91.1% accuracy on MedQA (USMLE) and surpass the GPT-4 model family on every medical benchmark where direct comparison is viable.

arxiv:2404.18416 v2 · 2024-04-29 · cs.AI · cs.CL · cs.CV · cs.LG


Add to your LaTeX paper

\usepackage{pith}
\pithnumber{ML5KRKZ2U3SKXDRGO4CBBASGI4}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp

2 Internet Archive

3 Author claim open

4 Citations open

5 Replications open

✓ Portable graph bundle live

The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
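The page does not specify the merge algorithm, so the following is only a minimal sketch of one common way to make a merge order-independent: sort events by a stable key, drop duplicates, and fold them into a state. The event fields (`id`, `at`, `field`, `value`), the sample values, and the function name `merge_events` are all hypothetical, and signature checking is left out.

```python
from typing import Any


def merge_events(events: list[dict[str, Any]]) -> dict[str, Any]:
    """Fold events into a state without depending on arrival order.

    Hypothetical event shape (not taken from the Pith schema):
    {"id": str, "at": ISO-8601 UTC string, "field": str, "value": Any}
    """
    state: dict[str, Any] = {}
    seen: set[str] = set()
    # Sorting by (timestamp, id) gives every mirror the same application order.
    # Assumes all timestamps share one format so string order equals time order.
    for ev in sorted(events, key=lambda e: (e["at"], e["id"])):
        if ev["id"] in seen:
            continue  # the same event fetched from two mirrors is applied once
        seen.add(ev["id"])
        state[ev["field"]] = ev["value"]  # last writer wins per field
    return state


# Hypothetical events, for illustration only.
events = [
    {"id": "e1", "at": "2026-05-17T23:38:50Z", "field": "author_claim", "value": "open"},
    {"id": "e2", "at": "2026-05-18T09:00:00Z", "field": "author_claim", "value": "claimed"},
]
print(merge_events(events) == merge_events(events[::-1]))
```

Under these assumptions, two mirrors that hold the same set of events compute the same state regardless of the order in which the events arrived.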

Claims

C1 strongest claim

Our best-performing Med-Gemini model achieves SoTA performance of 91.1% accuracy on MedQA (USMLE) using a novel uncertainty-guided search strategy, surpasses the GPT-4 model family on every benchmark where direct comparison is viable, and improves over GPT-4V by an average relative margin of 44.5% on 7 multimodal benchmarks.

C2 weakest assumption

That benchmark accuracy on curated medical datasets (MedQA, NEJM Image Challenges, MMMU health subset, etc.) will translate to reliable performance and safety in real clinical workflows with noisy, incomplete, or out-of-distribution patient data.

C3 one line summary

Med-Gemini sets new records on 10 of 14 medical benchmarks including 91.1% on MedQA (USMLE), improves over GPT-4V by an average relative margin of 44.5% on 7 multimodal benchmarks, and surpasses human experts on medical text summarization.
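The 44.5% figure in C1 is stated as an average relative margin, which is different from an absolute gap in accuracy points. A small sketch of that calculation follows, assuming the usual reading (the mean over benchmarks of the per-benchmark relative improvement). The function name is hypothetical and the scores are made up for illustration; they are not results from the paper.

```python
def average_relative_margin(model: list[float], baseline: list[float]) -> float:
    """Mean of (model - baseline) / baseline over paired scores, in percent."""
    if len(model) != len(baseline) or not model:
        raise ValueError("need two non-empty score lists of equal length")
    margins = [(m - b) / b for m, b in zip(model, baseline)]
    return 100.0 * sum(margins) / len(margins)


# Made-up scores for two imaginary benchmarks, for illustration only.
toy_model = [60.0, 90.0]
toy_baseline = [40.0, 60.0]
print(average_relative_margin(toy_model, toy_baseline))  # 50.0
```

In the toy case the absolute gaps are 20 and 30 points, but each is a 50% relative gain over its baseline, so the average relative margin is 50%.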

References

269 extracted · 269 resolved · 16 Pith anchors

[1] M. D. Abràmoff, M. E. Tarver, N. Loyo-Berrios, S. Trujillo, D. Char, Z. Obermeyer, M. B. Eydelman, F. P. of Ophthalmic Imaging, D. Algorithmic Interpretation Working Group of the Collaborative Com 2023

[2] GPT-4 Technical Report 2023 · arXiv:2303.08774

[3] J.-B. Alayrac, J. Donahue, P. Luc, A. Miech, I. Barr, Y. Hasson, K. Lenc, A. Mensch, K. Millican, M. Reynolds, et al. Flamingo: a visual language model for few-shot learning. Advances in neural inform 2022

[4] PaLM 2 Technical Report 2023 · arXiv:2305.10403

[5] F. Antaki, D. Milad, M. A. Chia, C.-É. Giguère, S. Touma, J. El-Khoury, P. A. Keane, and R. Duval. Capabilities of GPT-4 in ophthalmology: an analysis of model entropy and progress towards hum 2023

Cited by

31 papers in Pith

DDX-TRACE: A Benchmark for Medical Diagnostic Trajectories in VLMs

Data-Centric Foundation Models in Computational Healthcare: A Survey

NVILA: Efficient Frontier Visual Language Models

Image-to-Text for Medical Reports Using Adaptive Co-Attention and Triple-LSTM Module

QM-ToT: A Medical Tree of Thoughts Reasoning Framework for Quantized Model

Receipt and verification

First computed	2026-05-17T23:38:50.766478Z
Builder	pith-number-builder-2026-05-17-v1
Signature	Pith Ed25519 (`pith-v1-2026-05`) · public key
Schema	pith-number/v1.0

Canonical hash

62faa8ab3aa6e4ab8e2677041082464732608b331e66262eae1e44c5fcb3f97a

Aliases

arxiv: 2404.18416 · arxiv_version: 2404.18416v2 · doi: 10.48550/arxiv.2404.18416 · pith_short_12: ML5KRKZ2U3SK · pith_short_16: ML5KRKZ2U3SKXDRG · pith_short_8: ML5KRKZ2
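The short aliases listed here are prefixes of the full identifier. For this record, the full identifier also coincides with the unpadded Base32 encoding of the first 16 bytes of the canonical hash. The sketch below checks both relationships against the values shown on this page. It is an observation about this one record, not a documented derivation rule, and the function names `pith_from_hash` and `short_alias` are hypothetical.

```python
import base64

CANONICAL_HASH = "62faa8ab3aa6e4ab8e2677041082464732608b331e66262eae1e44c5fcb3f97a"
PITH_NUMBER = "ML5KRKZ2U3SKXDRGO4CBBASGI4"


def pith_from_hash(hex_digest: str, n_bytes: int = 16) -> str:
    """Base32-encode the first n_bytes of a hex digest and drop '=' padding.

    Assumption: this mapping is inferred from the values on this page only.
    """
    raw = bytes.fromhex(hex_digest)[:n_bytes]
    return base64.b32encode(raw).decode("ascii").rstrip("=")


def short_alias(pith_number: str, length: int) -> str:
    """Return the leading `length` characters, as in pith_short_8/12/16."""
    return pith_number[:length]


print(pith_from_hash(CANONICAL_HASH) == PITH_NUMBER)
for n in (8, 12, 16):
    print(n, short_alias(PITH_NUMBER, n))
```

A shorter alias is more convenient to quote but collides more easily, so the full 26-character form is the safer key when looking a record up.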

Agent API

Resolver JSON Graph JSON Events JSON Schema Signing key

Verify this Pith Number yourself

curl -sH 'Accept: application/ld+json' https://pith.science/pith/ML5KRKZ2U3SKXDRGO4CBBASGI4 \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 62faa8ab3aa6e4ab8e2677041082464732608b331e66262eae1e44c5fcb3f97a

Canonical record JSON

{
  "metadata": {
    "abstract_canon_sha256": "caee117c34f877923c1935488b44d1956325b13f0bab70c31f5478b87fe76cc7",
    "cross_cats_sorted": [
      "cs.CL",
      "cs.CV",
      "cs.LG"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.AI",
    "submitted_at": "2024-04-29T04:11:28Z",
    "title_canon_sha256": "cb50dbe911bb83a2d23526805ba180095681824fbf212e77b5fbe93c1962eff6"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2404.18416",
    "kind": "arxiv",
    "version": 2
  }
}
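The record above can also be hashed offline, without `curl` or `jq`, using the same serialization as the verification one-liner: sorted keys, compact separators, and non-ASCII characters kept as UTF-8. This is a sketch with a hypothetical function name, `canonical_sha256`. Whether the digest equals the published hash depends on this copy of the record being identical to what the resolver serves, so the script reports the comparison rather than assuming it.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """SHA-256 of the record serialized with sorted keys and compact separators."""
    blob = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


# The canonical record as displayed on this page.
RECORD = {
    "metadata": {
        "abstract_canon_sha256": "caee117c34f877923c1935488b44d1956325b13f0bab70c31f5478b87fe76cc7",
        "cross_cats_sorted": ["cs.CL", "cs.CV", "cs.LG"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.AI",
        "submitted_at": "2024-04-29T04:11:28Z",
        "title_canon_sha256": "cb50dbe911bb83a2d23526805ba180095681824fbf212e77b5fbe93c1962eff6",
    },
    "schema_version": "1.0",
    "source": {"id": "2404.18416", "kind": "arxiv", "version": 2},
}

PUBLISHED_HASH = "62faa8ab3aa6e4ab8e2677041082464732608b331e66262eae1e44c5fcb3f97a"

digest = canonical_sha256(RECORD)
print(digest)
print("matches published hash:", digest == PUBLISHED_HASH)
```

Because keys are sorted before hashing, the order in which fields appear in the input does not change the digest. The order of list elements such as `cross_cats_sorted` does.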