MedConclusion is a 5.7M-instance benchmark dataset for generating biomedical conclusions from structured PubMed abstracts, with LLM evaluations showing conclusion writing differs from summarization and that judge choice affects scores.
Inferring which medical treatments work from reports of clinical trials
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.CL 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
MedConclusion: A Benchmark for Biomedical Conclusion Generation from Structured Abstracts
MedConclusion is a 5.7M-instance benchmark dataset for generating biomedical conclusions from structured PubMed abstracts, with LLM evaluations showing conclusion writing differs from summarization and that judge choice affects scores.