pith. machine review for the scientific record.

arxiv: 2604.02678 · v1 · submitted 2026-04-03 · 📊 stat.ME · cs.AI · stat.AP

Recognition: 2 theorem links · Lean Theorem

Eligibility-Aware Evidence Synthesis: An Agentic Framework for Clinical Trial Meta-Analysis

Yanxun Xu, Yao Zhao, Zhiyue Zhang

Authors on Pith · no claims yet

Pith reviewed 2026-05-13 18:55 UTC · model grok-4.3

classification 📊 stat.ME · cs.AI · stat.AP
keywords meta-analysis · clinical trials · eligibility criteria · agentic framework · evidence synthesis · LLM · risk ratio · precision medicine

The pith

EligMeta adjusts meta-analysis weights using eligibility criteria alignment between trials and target populations instead of statistical precision alone.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces EligMeta, an agentic framework that turns natural-language queries into reproducible trial selection and then incorporates eligibility alignment into study weighting for meta-analysis. Conventional methods weight studies only by precision and ignore differences in patient populations defined by eligibility criteria. EligMeta uses LLMs to generate interpretable rules and parse metadata while keeping all weighting, filtering, and statistical pooling deterministic for reproducibility. This produces cohort-specific pooled estimates that better match the populations of interest. The approach matters for precision medicine because it quantifies how eligibility differences affect evidence synthesis, as shown by a shift in the olaparib adverse-events risk ratio from 2.18 to 1.97.

Core claim

EligMeta translates natural-language queries into reproducible trial selection by generating interpretable rules with LLMs and performing schema-constrained parsing of trial metadata, then structures eligibility criteria to compute similarity-based study weights reflecting population alignment between target and comparator trials, leading to adjusted pooled estimates in meta-analysis.

What carries the argument

Eligibility-aware weighting that computes similarity-based study weights from structured eligibility criteria to reflect population alignment.
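The page does not reproduce the paper's similarity function or its combination rule. As a minimal sketch of the general idea only, the following assumes a Jaccard overlap between structured criteria sets and a multiplicative blend with precision weights; both choices are illustrative assumptions, not the authors' method.

```python
# Hypothetical sketch of eligibility-aware weighting. The Jaccard measure and
# the multiplicative blend with precision weights are illustrative assumptions;
# the paper's actual similarity function is not given in this review.

def jaccard(criteria_a: set, criteria_b: set) -> float:
    """Overlap between two structured eligibility-criteria sets."""
    if not criteria_a and not criteria_b:
        return 1.0
    return len(criteria_a & criteria_b) / len(criteria_a | criteria_b)

def eligibility_weights(target: set, trials: dict, precision: dict) -> dict:
    """Blend precision weights with eligibility similarity, then renormalize."""
    raw = {t: precision[t] * jaccard(target, crit) for t, crit in trials.items()}
    total = sum(raw.values())
    return {t: w / total for t, w in raw.items()}

# Invented criteria and weights, purely for illustration.
target = {"age>=18", "ECOG 0-1", "BRCA-mutated"}
trials = {
    "A": {"age>=18", "ECOG 0-1", "BRCA-mutated"},  # fully aligned
    "B": {"age>=18", "ECOG 0-2"},                  # partially aligned
}
w = eligibility_weights(target, trials, precision={"A": 10.0, "B": 10.0})
```

Under equal precision, the fully aligned trial A ends up with the larger normalized weight, which is the qualitative behavior the framework claims.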

If this is right

  • Reduces 4,044 candidate trials to 39 clinically relevant studies in a gastric cancer landscape analysis while recovering all 13 guideline-cited trials.
  • Shifts the pooled risk ratio for olaparib adverse events across four trials from 2.18 (95% CI 1.71-2.79) under conventional Mantel-Haenszel estimation to 1.97 (95% CI 1.76-2.20).
  • Enables scalable, reproducible evidence synthesis that accounts for clinical compatibility in addition to statistical precision.
  • Produces cohort-specific pooled estimates rather than general ones that ignore population differences.
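The conventional baseline named above is standard fixed-effect Mantel-Haenszel estimation. A minimal sketch of the MH pooled risk ratio follows; the 2x2 counts are invented for illustration and are not the paper's olaparib data.

```python
# Standard Mantel-Haenszel pooled risk ratio (fixed-effect). The 2x2 counts
# below are invented for illustration; they are NOT the paper's olaparib data.

def mh_risk_ratio(tables: list) -> float:
    """Each table is (events_trt, n_trt, events_ctl, n_ctl)."""
    num = den = 0.0
    for a, n1, c, n0 in tables:
        big_n = n1 + n0
        num += a * n0 / big_n  # treated events scaled by control arm size
        den += c * n1 / big_n  # control events scaled by treated arm size
    return num / den

tables = [
    (30, 100, 15, 100),  # hypothetical trial 1
    (20, 80, 10, 80),    # hypothetical trial 2
]
rr = mh_risk_ratio(tables)
```

The eligibility-aware variant described in the paper changes the weighting, not this pooling identity, which is what makes the 2.18 to 1.97 shift attributable to alignment rather than to a different estimator.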

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same hybrid architecture could be tested on non-oncology meta-analyses to check whether eligibility weighting changes conclusions in other disease areas.
  • Integration with existing guideline-development workflows could allow automatic flagging of trials whose eligibility criteria diverge from the target population.
  • Further validation against larger trial registries would show whether the reduction from thousands of candidates to dozens scales without loss of relevant studies.

Load-bearing premise

LLM-generated rules and schema-constrained parsing correctly identify all clinically relevant trials while similarity-based weights accurately capture population alignment without introducing new bias.

What would settle it

A side-by-side manual expert review of the same queries that selects a different set of trials or applies different weights and produces a statistically different pooled estimate from the EligMeta result.

Figures

Figures reproduced from arXiv: 2604.02678 by Yanxun Xu, Yao Zhao, Zhiyue Zhang.

Figure 1. Overview of the EligMeta framework for clinical trial evidence synthesis.
Figure 2. Trial selection and structuring workflow.
Figure 3. Eligibility-aware meta-analysis workflow.
Figure 4. Stepwise filtering results for the gastric cancer use case.
Figure 5. Comparison of pooled risk ratio estimates for all-grade vomiting.
Original abstract

Clinical evidence synthesis requires identifying relevant trials from large registries and aggregating results that account for population differences. While recent LLM-based approaches have automated components of systematic review, they do not support end-to-end evidence synthesis. Moreover, conventional meta-analysis weights studies by statistical precision without considering clinical compatibility reflected in eligibility criteria. We propose EligMeta, an agentic framework that integrates automated trial discovery with eligibility-aware meta-analysis, translating natural-language queries into reproducible trial selection and incorporating eligibility alignment into study weighting to produce cohort-specific pooled estimates. EligMeta employs a hybrid architecture separating LLM-based reasoning from deterministic execution: LLMs generate interpretable rules from natural-language queries and perform schema-constrained parsing of trial metadata, while all logical operations, weight computations, and statistical pooling are executed deterministically to ensure reproducibility. The framework structures eligibility criteria and computes similarity-based study weights reflecting population alignment between target and comparator trials. In a gastric cancer landscape analysis, EligMeta reduced 4,044 candidate trials to 39 clinically relevant studies through rule-based filtering, recovering all 13 guideline-cited trials. In an olaparib adverse events meta-analysis across four trials, eligibility-aware weighting shifted the pooled risk ratio from 2.18 (95% CI: 1.71-2.79) under conventional Mantel-Haenszel estimation to 1.97 (95% CI: 1.76-2.20), demonstrating quantifiable impact of incorporating eligibility alignment. EligMeta bridges automated trial discovery with eligibility-aware meta-analysis, providing a scalable and reproducible framework for evidence synthesis in precision medicine.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper introduces EligMeta, a hybrid agentic framework for clinical trial meta-analysis that uses LLMs to translate natural-language queries into interpretable eligibility rules and perform schema-constrained parsing of trial metadata, while executing all weighting, similarity computations, and statistical pooling (e.g., Mantel-Haenszel) deterministically. It demonstrates the approach in two cases: filtering 4,044 gastric cancer trials down to 39 relevant studies while recovering all 13 guideline-cited trials, and reweighting an olaparib adverse-events meta-analysis across four trials to shift the pooled risk ratio from 2.18 (95% CI 1.71-2.79) under conventional estimation to 1.97 (95% CI 1.76-2.20) under eligibility-aware weights.

Significance. If the LLM-driven eligibility alignment step can be shown to produce weights that accurately reflect population compatibility without introducing new bias, the framework would offer a scalable, reproducible way to incorporate clinical eligibility criteria into meta-analytic weighting, moving beyond precision-only weights. The gastric-cancer filtering result and the concrete numeric shift in the olaparib example illustrate potential impact for precision-medicine evidence synthesis.

major comments (3)
  1. [olaparib adverse events meta-analysis demonstration] The central demonstration (olaparib pooled RR moving from 2.18 [1.71-2.79] to 1.97 [1.76-2.20]) rests on the claim that LLM-generated rules and schema-constrained parsing produce similarity scores that correctly measure population alignment. No expert review, inter-rater reliability assessment, or comparison of the parsed eligibility criteria against manual gold-standard annotations is reported, so it remains possible that the narrower CI and point-estimate change arise from systematic LLM parsing artifacts rather than true eligibility alignment.
  2. [methods for eligibility similarity computation] The eligibility similarity function and resulting study weights are presented as capturing population compatibility, yet the manuscript provides neither a sensitivity analysis (e.g., varying the similarity threshold or LLM prompt) nor a validation against independent expert-assigned weights. Without such checks, the quantitative impact attributed to eligibility-aware weighting cannot be isolated from potential biases in the LLM rule-generation step.
  3. [framework architecture and reproducibility claims] Reproducibility is asserted via the hybrid architecture (LLM reasoning separated from deterministic execution), but no code, data, or parsed eligibility schemas are released, and no quantification of variability across repeated LLM calls is supplied. This leaves the reported trial-selection and weighting results difficult to verify independently.
minor comments (2)
  1. The abstract and methods would benefit from an explicit equation or pseudocode for the eligibility similarity weight computation, including how the deterministic formula combines parsed criteria.
  2. [gastric cancer landscape analysis] The gastric-cancer landscape analysis reports recovery of all 13 guideline-cited trials but does not state the total number of guideline-cited trials that existed in the registry or provide a confusion matrix for the rule-based filter.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their thoughtful review and valuable suggestions. We have carefully considered each major comment and provide point-by-point responses below. Where appropriate, we will revise the manuscript to incorporate additional analyses and materials to address the concerns raised.

Point-by-point responses
  1. Referee: [olaparib adverse events meta-analysis demonstration] The central demonstration (olaparib pooled RR moving from 2.18 [1.71-2.79] to 1.97 [1.76-2.20]) rests on the claim that LLM-generated rules and schema-constrained parsing produce similarity scores that correctly measure population alignment. No expert review, inter-rater reliability assessment, or comparison of the parsed eligibility criteria against manual gold-standard annotations is reported, so it remains possible that the narrower CI and point-estimate change arise from systematic LLM parsing artifacts rather than true eligibility alignment.

    Authors: We agree that expert validation would provide stronger evidence for the accuracy of the eligibility similarity scores. The current manuscript presents the olaparib example as a demonstration of the framework's potential impact rather than a definitive validation study. The hybrid architecture is designed to mitigate parsing artifacts by using schema-constrained extraction, which produces structured, inspectable criteria. In the revised manuscript, we will add a limitations section explicitly discussing the absence of expert review and the possibility of LLM-induced biases. We will also include a small-scale comparison of parsed criteria for the four trials against manual annotations by the authors to illustrate the process. revision: partial

  2. Referee: [methods for eligibility similarity computation] The eligibility similarity function and resulting study weights are presented as capturing population compatibility, yet the manuscript provides neither a sensitivity analysis (e.g., varying the similarity threshold or LLM prompt) nor a validation against independent expert-assigned weights. Without such checks, the quantitative impact attributed to eligibility-aware weighting cannot be isolated from potential biases in the LLM rule-generation step.

    Authors: We acknowledge the lack of sensitivity analyses in the original submission. To address this, we will perform and report sensitivity analyses by varying the similarity threshold (e.g., 0.5, 0.7, 0.9) and different LLM prompts for rule generation, showing the stability of the pooled estimates. For validation against expert-assigned weights, this would require additional resources and is beyond the scope of the current work; however, we will add it as a key direction for future research in the discussion. These additions will help isolate the effect of eligibility-aware weighting. revision: partial
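The threshold sweep the authors commit to could be sketched as follows. The similarity scores, per-trial weights, and the hard-threshold inclusion rule are invented for illustration; the manuscript does not specify how a threshold gates trials, and the weighted mean below stands in for the full pooling step.

```python
# Hypothetical sensitivity sweep over the eligibility-similarity threshold:
# recompute a pooled estimate on the trials that pass each cutoff. All scores,
# weights, and effects are invented for illustration.

def pooled_estimate(weights: list, effects: list) -> float:
    """Weighted mean of per-trial effects (stand-in for full MH pooling)."""
    total = sum(weights)
    return sum(w * e for w, e in zip(weights, effects)) / total

trials = [
    {"similarity": 0.95, "weight": 10.0, "log_rr": 0.70},
    {"similarity": 0.72, "weight": 8.0,  "log_rr": 0.60},
    {"similarity": 0.55, "weight": 12.0, "log_rr": 0.90},
    {"similarity": 0.40, "weight": 6.0,  "log_rr": 1.10},
]

sweep = {}
for threshold in (0.5, 0.7, 0.9):
    kept = [t for t in trials if t["similarity"] >= threshold]
    sweep[threshold] = pooled_estimate([t["weight"] for t in kept],
                                       [t["log_rr"] for t in kept])
```

Reporting how much the pooled estimate moves across such cutoffs is what would let readers judge the stability the rebuttal promises.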

  3. Referee: [framework architecture and reproducibility claims] Reproducibility is asserted via the hybrid architecture (LLM reasoning separated from deterministic execution), but no code, data, or parsed eligibility schemas are released, and no quantification of variability across repeated LLM calls is supplied. This leaves the reported trial-selection and weighting results difficult to verify independently.

    Authors: We take the reproducibility concern seriously. Although the manuscript emphasizes the hybrid design to promote reproducibility, we agree that open release of materials is necessary for independent verification. In the revised version, we will include a link to a public GitHub repository containing the full code, the parsed eligibility schemas for the case studies, and the trial metadata used. Additionally, we will quantify variability by repeating the LLM calls (e.g., 5 runs with temperature 0.1) for the olaparib and gastric cancer cases and report the range of similarity scores and resulting pooled estimates. revision: yes
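The variability quantification proposed above (repeated LLM calls, reporting the range of similarity scores) could be summarized along these lines; the trial labels and per-run scores below are invented placeholders.

```python
# Hypothetical variability summary across repeated LLM parsing runs: collect
# each trial's similarity score from every run and report mean and range.
# Trial labels and scores are invented placeholders.
from statistics import mean

runs = {  # trial -> similarity score from each of 5 repeated runs
    "trial_1": [0.91, 0.93, 0.90, 0.92, 0.91],
    "trial_2": [0.78, 0.80, 0.77, 0.79, 0.81],
}

summary = {trial: {"mean": mean(scores), "range": max(scores) - min(scores)}
           for trial, scores in runs.items()}
```

A tight range across runs would support the reproducibility claim; a wide one would push the variability into the reported confidence intervals.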

Circularity Check

0 steps flagged

Eligibility similarity weights derived from external trial metadata via deterministic formulas; no reduction to fitted parameters or self-citation chains

full rationale

The paper's derivation chain separates LLM-based rule generation and schema-constrained parsing from deterministic weight computation and statistical pooling. Eligibility-aware weights are computed from parsed criteria using similarity measures applied to external trial metadata, then fed into standard Mantel-Haenszel pooling. The reported RR shift (2.18 to 1.97) is produced by these fixed operations rather than any equation that redefines the output in terms of the same fitted inputs. No self-citation is load-bearing for the central claim, and no ansatz or uniqueness theorem is smuggled in. The framework remains self-contained against external benchmarks, warranting only a minor score for possible incidental self-citation.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The framework rests on the assumption that LLM rule generation and eligibility similarity computation add value beyond standard methods; no explicit free parameters are named, but the similarity function itself is an implicit modeling choice.

free parameters (1)
  • eligibility similarity function parameters
    The weighting scheme that converts eligibility overlap into study weights is not specified as parameter-free in the abstract.
axioms (1)
  • domain assumption LLM-generated rules from natural-language queries are sufficiently accurate and complete for trial filtering
The entire pipeline depends on this step to reduce 4,044 trials to 39 without missing guideline-cited studies.

pith-pipeline@v0.9.0 · 5593 in / 1273 out tokens · 36170 ms · 2026-05-13T18:55:42.854620+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches — The paper's claim is directly supported by a theorem in the formal canon.
supports — The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends — The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses — The paper appears to rely on the theorem as machinery.
contradicts — The paper's claim conflicts with a theorem or certificate in the canon.
unclear — Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
