Pith Number

pith:3JQ23PTR

pith:2024:3JQ23PTRKKM6QP4AAHQTDCNUMY
not attested · not anchored · not stored · refs resolved

HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs

Benyou Wang, Jianye Hou, Junying Chen, Ke Ji, Rongsheng Wang, Wanlong Liu, Xidong Wang, Zhenyang Cai

HuatuoGPT-o1 reaches complex medical reasoning through verifier-guided training on 40,000 problems.

arxiv:2412.18925 v1 · 2024-12-25 · cs.CL · cs.AI · cs.LG

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{3JQ23PTRKKM6QP4AAHQTDCNUMY}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 strongest claim

HuatuoGPT-o1, trained with a two-stage approach of verifier-guided search for fine-tuning followed by RL with verifier rewards on only 40K verifiable medical problems, outperforms both general and medical-specific baselines.

C2 weakest assumption

A medical verifier can reliably and automatically determine the correctness of complex, multi-step reasoning outputs in medicine, even though the abstract notes that verifying medical reasoning, unlike mathematical reasoning, is inherently challenging.

C3 one line summary

HuatuoGPT-o1 achieves superior medical complex reasoning by using a verifier to curate reasoning trajectories for fine-tuning and then applying RL with verifier-based rewards.

References

101 extracted · 101 resolved · 17 Pith anchors

Formal links

1 machine-checked theorem link

Cited by

33 papers in Pith

Receipt and verification
First computed 2026-05-17T23:38:52.539220Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

da61adbe715299e83f8001e13189b466283e97db6d2daa6af7e1278d03481ae3

Aliases

arxiv: 2412.18925 · arxiv_version: 2412.18925v1 · doi: 10.48550/arxiv.2412.18925 · pith_short_12: 3JQ23PTRKKM6 · pith_short_16: 3JQ23PTRKKM6QP4A · pith_short_8: 3JQ23PTR
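
The short aliases above are plain prefixes of the full 26-character Pith Number. For this record, the full number also coincides with the unpadded Base32 (RFC 4648) encoding of the first 16 bytes of the canonical hash. The page does not document that derivation, so the Python sketch below is an inference from the values shown here, not the service's specification. The function names `pith_number_from_hash` and `short_aliases` are hypothetical.

```python
import base64

# Canonical hash as printed on this record page.
CANONICAL_HASH = "da61adbe715299e83f8001e13189b466283e97db6d2daa6af7e1278d03481ae3"


def pith_number_from_hash(hex_digest: str) -> str:
    """Base32-encode the first 16 bytes of a SHA-256 hex digest, without padding.

    Assumption: this rule is inferred from this single record, not taken
    from a published Pith specification.
    """
    raw = bytes.fromhex(hex_digest)[:16]
    return base64.b32encode(raw).decode("ascii").rstrip("=")


def short_aliases(pith_number: str) -> dict:
    """Return the 8-, 12- and 16-character prefix aliases of a Pith Number."""
    return {f"pith_short_{n}": pith_number[:n] for n in (8, 12, 16)}


full = pith_number_from_hash(CANONICAL_HASH)
aliases = short_aliases(full)
print(full)     # 3JQ23PTRKKM6QP4AAHQTDCNUMY
print(aliases)
```

If the rule holds for other records, a mirror could derive the identifier from the hash alone. That is untested beyond this one entry.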
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/3JQ23PTRKKM6QP4AAHQTDCNUMY \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: da61adbe715299e83f8001e13189b466283e97db6d2daa6af7e1278d03481ae3
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "c46f04f60a11f5dcd4b5def012472fa6b68dbba2a111c6289943dd5677855648",
    "cross_cats_sorted": [
      "cs.AI",
      "cs.LG"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2024-12-25T15:12:34Z",
    "title_canon_sha256": "54548b45310d2363c1c0194fb07e47578af818bd72ebe2cb5374f6f1e47c4354"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2412.18925",
    "kind": "arxiv",
    "version": 1
  }
}
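
The verification one-liner above canonicalizes the record by re-serializing it with sorted keys, compact separators, and non-escaped Unicode, then hashing the UTF-8 bytes with SHA-256. As a rough offline sketch of the same steps, the Python below applies them to the record as transcribed on this page. The helper names `canonical_bytes` and `canonical_hash` are hypothetical. The digest only matches the published hash if this transcription is byte-for-byte faithful to the served record, so the comparison is printed instead of assumed.

```python
import hashlib
import json


def canonical_bytes(record: dict) -> bytes:
    """Serialize with sorted keys and compact separators, as in the one-liner."""
    text = json.dumps(record, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def canonical_hash(record: dict) -> str:
    """SHA-256 hex digest of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


# Record copied from the "Canonical record JSON" shown on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "c46f04f60a11f5dcd4b5def012472fa6b68dbba2a111c6289943dd5677855648",
        "cross_cats_sorted": ["cs.AI", "cs.LG"],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.CL",
        "submitted_at": "2024-12-25T15:12:34Z",
        "title_canon_sha256": "54548b45310d2363c1c0194fb07e47578af818bd72ebe2cb5374f6f1e47c4354",
    },
    "schema_version": "1.0",
    "source": {"id": "2412.18925", "kind": "arxiv", "version": 1},
}

EXPECTED = "da61adbe715299e83f8001e13189b466283e97db6d2daa6af7e1278d03481ae3"
digest = canonical_hash(record)
print(digest)
print("matches published hash:", digest == EXPECTED)
```

Because keys are sorted before hashing, the digest does not depend on the key order of the input. That property is what lets independent mirrors reproduce the same value.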