arXiv preprint arXiv:2406.05646
2 Pith papers cite this work.
citing papers explorer
- CodeClinic: Evaluating Automation of Coding Skills for Clinical Reasoning Agents
  CodeClinic benchmark demonstrates that LLM-generated Python skill libraries from clinical guidelines enhance consistency and reduce token consumption by up to 40% compared to zero-shot approaches on MIMIC-IV based tasks.
- Gymnasium: A Standard Interface for Reinforcement Learning Environments
  Gymnasium establishes a standardized API for RL environments to improve interoperability, reproducibility, and ease of development in reinforcement learning.