{"paper":{"title":"GraphWalker: Patient Analogy Meets Information Gain for Clinical Reasoning with Large Language Models","license":"http://creativecommons.org/licenses/by-nc-sa/4.0/","headline":"GraphWalker selects in-context demonstrations for EHR reasoning by building graphs that combine patient clinical data with LLM estimates of information gain, then using cohort discovery and lazy greedy search to reduce redundancy and local-","cross_cats":[],"primary_cat":"cs.LG","authors_text":"Hongxin Ding, Jiaran Gao, Jinyang Zhang, Junfeng Zhao, Liantao Ma, Weibin Liao, Xinke Jiang, Yasha Wang, Yue Fang, Yuxin Guo, Zhibang Yang","submitted_at":"2026-04-08T04:59:49Z","abstract_excerpt":"Clinical reasoning over electronic health records (EHRs) is a fundamental yet challenging task in modern healthcare. While large language models (LLMs) offer a promising paradigm via in-context demonstrations that requires no task-specific parameter updates, existing methods for reasoning by patient analogy in EHR settings suffer from three core limitations: (1) Perspective Limitation, where data-driven similarity misaligns with LLM reasoning needs while model-driven signals are constrained by limited clinical competence; (2) Cohort Awareness, as demonstrations are selected independently witho"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"GraphWalker consistently outperforms state-of-the-art ICL baselines, yielding substantial improvements in clinical reasoning performance.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"That jointly modeling patient clinical information and LLM-estimated information gain via graphs, combined with cohort discovery and lazy greedy search, will reliably overcome perspective limitation, cohort awareness, and information aggregation issues on real EHR data without introducing new selection biases or overfitting to the tested benchmarks.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"GraphWalker improves LLM clinical reasoning on EHRs by graph-guided selection of in-context examples that jointly uses data similarity and model signals, plus cohort-level structure and greedy aggregation to reduce redundancy.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"GraphWalker selects in-context demonstrations for EHR reasoning by building graphs that combine patient clinical data with LLM estimates of information gain, then using cohort discovery and lazy greedy search to reduce redundancy and local-","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"ba0b09bc7ed25b9b07781d14e3c0809a2bdb15e5969e288ad1b8091695d0f135"},"source":{"id":"2604.06684","kind":"arxiv","version":2},"verdict":{"id":"ab533593-a22d-423d-909e-2fdd1e69492f","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-10T18:49:10.701714Z","strongest_claim":"GraphWalker consistently outperforms state-of-the-art ICL baselines, yielding substantial improvements in clinical reasoning performance.","one_line_summary":"GraphWalker improves LLM clinical reasoning on EHRs by graph-guided selection of in-context examples that jointly uses data similarity and model signals, plus cohort-level structure and greedy aggregation to reduce redundancy.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"That jointly modeling patient clinical information and LLM-estimated information gain via graphs, combined with cohort discovery and lazy greedy search, will reliably overcome perspective limitation, cohort awareness, and information aggregation issues on real EHR data without introducing new selection biases or overfitting to the tested benchmarks.","pith_extraction_headline":"GraphWalker selects in-context demonstrations for EHR reasoning by building graphs that combine patient clinical data with LLM estimates of information gain, then using cohort discovery and lazy greedy search to reduce redundancy and local-"},"integrity":{"clean":true,"summary":{"advisory":0,"critical":0,"by_detector":{},"informational":0},"endpoint":"/pith/2604.06684/integrity.json","findings":[],"available":true,"detectors_run":[],"snapshot_sha256":"c28c3603d3b5d939e8dc4c7e95fa8dfce3d595e45f758748cecf8e644a296938"},"references":{"count":0,"sample":[],"resolved_work":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57","internal_anchors":0},"formal_canon":{"evidence_count":2,"snapshot_sha256":"b478ee49f7359113e900860cf7e1214bc73f670edbb2c64e848afb35808a46fb"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}