3 Pith papers cite this work.
citing papers explorer
- MURPHY: Feedback-Aware GRPO with Retrospective Credit Assignment for Multi-Turn Code Generation
  MURPHY improves code generation pass rates by up to 6% through retrospective credit assignment on multi-turn feedback trees using max or mean reward propagation.
- Verify Before You Commit: Towards Faithful Reasoning in LLM Agents via Self-Auditing
  SAVeR adds self-auditing of internal beliefs in LLM agents via persona-based candidates and constraint-guided repairs, improving faithfulness on six benchmarks without hurting task performance.
- Infherno: End-to-end Agent-based FHIR Resource Synthesis from Free-form Clinical Notes
  Infherno deploys LLM agents with code execution and terminology tools to synthesize FHIR resources from unstructured clinical notes, matching human baseline performance on synthetic and real datasets.