Gemini 2.x: Pushing the frontier with advanced reasoning, multimodality, long context, and next-generation agentic capabilities

Gemini Team, Google · 2025

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Measuring Competency, Not Performance: Item-Aware Evaluation Across Medical Benchmarks

cs.CL · 2025-09-29 · conditional · novelty 5.0

MedIRT applies Item Response Theory to medical LLM benchmarks to separate latent competency from item difficulty and discrimination, yielding rankings that are more stable than those from accuracy alone and revealing domain heterogeneity that accuracy alone does not show.

citing papers explorer

Measuring Competency, Not Performance: Item-Aware Evaluation Across Medical Benchmarks

cs.CL · 2025-09-29 · conditional · none · ref 6
