Releases mamabench (25,949 QA items from seven expert sources) and mamaretrieval (3,185 graded queries over 63,650 chunks) to evaluate RAG in maternal, neonatal, and reproductive health.
2 Pith papers cite this work. Polarity classification is still indexing.
Citing papers by year: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers:
Contrastive Reflection identifies error-anchored slices in agent traces, adds contrastive successes, and uses a Teacher LLM to generate prompt edits that are accepted only if they improve validation performance, raising HotpotQA exact-match from 51.4% to 60.4%.