Pith Number

pith:ML5KRKZ2

pith:2024:ML5KRKZ2U3SKXDRGO4CBBASGI4
not attested · not anchored · not stored · refs resolved

Capabilities of Gemini Models in Medicine

Aishwarya Kamath, Alan Karthikesalingam, Albert Webson, Anil Palepu, Basil Mustafa, Ben Caine, Bradley Green, Cathy Cheung, Charles Lau, Christopher Semturs, Chunjong Park, Claire Cui, Dale Webster, Daniel McDuff, David G.T. Barrett, David Stutz, Demis Hassabis, Ehud Rivlin, Elahe Vedadi, Ellery Wulczyn, Ewa Dominowska, Fan Zhang, Greg Corrado, James Manyika, Jan Freyberg, Jean-Baptiste Alayrac, Jeff Dean, Jeremy Lai, Jesper Anderson, Jian Lu, Joelle Barral, Jonas Kemp, Jonathan Krause, Jonathon Shlens, Juanma Zambrano Chaves, Juraj Gottweis, Katherine Chou, Kavita Kulkarni, Khaled Saab, Kimberly Kanada, Koray Kavukcuoglu, Le Hou, Luheng He, Luyang Liu, Melvin Johnson, Mike Schaekermann, Natasha Latysheva, Neil Houlsby, Nenad Tomasev, Oriol Vinyals, Philip Mansfield, Renee Wong, Ruoxi Sun, Ryutaro Tanno, Shekoofeh Azizi, Siamak Shakeri, SiWai Man, S. M. Ali Eslami, S. Sara Mahdavi, Szu-Yeu Hu, Tao Tu, Tim Strother, Tomer Golany, Vivek Natarajan, Wei-Hung Weng, Yong Cheng, Yossi Matias

Med-Gemini models reach 91.1 percent accuracy on USMLE medical questions and surpass GPT-4 on medical benchmarks.

arxiv:2404.18416 v2 · 2024-04-29 · cs.AI · cs.CL · cs.CV · cs.LG

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{ML5KRKZ2U3SKXDRGO4CBBASGI4}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

Our best-performing Med-Gemini model achieves SoTA performance of 91.1% accuracy on MedQA (USMLE) using a novel uncertainty-guided search strategy, surpasses the GPT-4 model family on every benchmark where direct comparison is viable, and improves over GPT-4V by an average relative margin of 44.5% on 7 multimodal benchmarks.

C2 · weakest assumption

That benchmark accuracy on curated medical datasets (MedQA, NEJM Image Challenges, MMMU health subset, etc.) will translate to reliable performance and safety in real clinical workflows with noisy, incomplete, or out-of-distribution patient data.

C3 · one-line summary

Med-Gemini sets new records on 10 of 14 medical benchmarks, including 91.1% on MedQA (USMLE), improves over GPT-4V by an average relative margin of 44.5% on 7 multimodal benchmarks, and surpasses humans on medical text summarization.

References

269 extracted · 269 resolved · 16 Pith anchors

[1] M. D. Abràmoff, M. E. Tarver, N. Loyo-Berrios, S. Trujillo, D. Char, Z. Obermeyer, M. B. Eydelman, F. P. of Ophthalmic Imaging, D. Algorithmic Interpretation Working Group of the Collaborative Com 2023
[2] GPT-4 Technical Report 2023 · arXiv:2303.08774
[3] J.-B. Alayrac, J. Donahue, P. Luc, A. Miech, I. Barr, Y. Hasson, K. Lenc, A. Mensch, K. Millican, M. Reynolds, et al. Flamingo: a visual language model for few-shot learning. Advances in neural inform 2022
[4] PaLM 2 Technical Report 2023 · arXiv:2305.10403
[5] F. Antaki, D. Milad, M. A. Chia, C.-É. Giguère, S. Touma, J. El-Khoury, P. A. Keane, and R. Duval. Capabilities of GPT-4 in ophthalmology: an analysis of model entropy and progress towards hum 2023

Cited by

31 papers in Pith

Receipt and verification
First computed 2026-05-17T23:38:50.766478Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

62faa8ab3aa6e4ab8e2677041082464732608b331e66262eae1e44c5fcb3f97a

Aliases

arxiv: 2404.18416 · arxiv_version: 2404.18416v2 · doi: 10.48550/arxiv.2404.18416 · pith_short_12: ML5KRKZ2U3SK · pith_short_16: ML5KRKZ2U3SKXDRG · pith_short_8: ML5KRKZ2
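The identifiers on this page appear to be related: the full Pith Number matches the unpadded Base32 encoding of the first 16 bytes of the canonical hash, and each `pith_short_*` alias is a prefix of that number. This is inferred from the values shown here, not from a published specification, so treat the sketch below as an assumption. The function names `pith_number_from_hash` and `short_aliases` are hypothetical.

```python
import base64

# Canonical hash as listed on this page.
CANONICAL_HASH = "62faa8ab3aa6e4ab8e2677041082464732608b331e66262eae1e44c5fcb3f97a"


def pith_number_from_hash(hex_digest: str) -> str:
    """Assumed derivation: RFC 4648 Base32 of the first 16 digest bytes, '=' padding stripped."""
    head = bytes.fromhex(hex_digest)[:16]
    return base64.b32encode(head).decode("ascii").rstrip("=")


def short_aliases(pith_number: str) -> dict:
    """Assumed derivation: each short alias is a fixed-length prefix of the full number."""
    return {f"pith_short_{n}": pith_number[:n] for n in (8, 12, 16)}


pith = pith_number_from_hash(CANONICAL_HASH)
print(pith)  # ML5KRKZ2U3SKXDRGO4CBBASGI4
print(short_aliases(pith))
```

For this record the result agrees with the Pith Number and the three short aliases listed above. Whether the same rule holds for other records is not confirmed by this page.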
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/ML5KRKZ2U3SKXDRGO4CBBASGI4 \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 62faa8ab3aa6e4ab8e2677041082464732608b331e66262eae1e44c5fcb3f97a
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "caee117c34f877923c1935488b44d1956325b13f0bab70c31f5478b87fe76cc7",
    "cross_cats_sorted": [
      "cs.CL",
      "cs.CV",
      "cs.LG"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.AI",
    "submitted_at": "2024-04-29T04:11:28Z",
    "title_canon_sha256": "cb50dbe911bb83a2d23526805ba180095681824fbf212e77b5fbe93c1962eff6"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2404.18416",
    "kind": "arxiv",
    "version": 2
  }
}
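The verification one-liner above can also be run offline against the record as printed on this page. The sketch below applies the same canonicalization (sorted keys, compact separators, UTF-8 without ASCII escaping) and compares the digest to the listed canonical hash. It assumes the JSON shown here is byte-for-byte equivalent to the `canonical_record` field the API returns. The name `canonical_sha256` is hypothetical.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Hash a record the way the page's one-liner does: sorted keys, no whitespace, UTF-8."""
    canon = json.dumps(record, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canon.encode("utf-8")).hexdigest()


# Record copied from the "Canonical record JSON" section of this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "caee117c34f877923c1935488b44d1956325b13f0bab70c31f5478b87fe76cc7",
        "cross_cats_sorted": ["cs.CL", "cs.CV", "cs.LG"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.AI",
        "submitted_at": "2024-04-29T04:11:28Z",
        "title_canon_sha256": "cb50dbe911bb83a2d23526805ba180095681824fbf212e77b5fbe93c1962eff6",
    },
    "schema_version": "1.0",
    "source": {"id": "2404.18416", "kind": "arxiv", "version": 2},
}

EXPECTED = "62faa8ab3aa6e4ab8e2677041082464732608b331e66262eae1e44c5fcb3f97a"

digest = canonical_sha256(record)
print(digest)
print("matches listed canonical hash:", digest == EXPECTED)
```

Because keys are sorted and whitespace is removed before hashing, the digest does not depend on key order or indentation in the input, which is what lets a mirror recompute it from any faithful copy of the record.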