
Designing disaggregated evaluations of AI systems: Choices, considerations, and tradeoffs

1 Pith paper cites this work. Polarity classification is still indexing.

1 Pith paper citing it

fields

cs.AI 1

years

2025 1

verdicts

UNVERDICTED 1

representative citing papers

Phi-4-reasoning Technical Report

cs.AI · 2025-04-30 · unverdicted · novelty 4.0

A 14B reasoning model trained via supervised fine-tuning on selected prompts and o3-mini traces, plus outcome RL, outperforms larger open models like DeepSeek-R1-Distill-Llama-70B on math, coding, planning and related benchmarks.

citing papers explorer

Showing 1 of 1 citing paper.

  • Phi-4-reasoning Technical Report cs.AI · 2025-04-30 · unverdicted · none · ref 12
