Remote, bivariate expert elicitation to determine the prior probability distribution for sample size calculation in a Bayesian non-inferiority multicenter randomized controlled trial (Croup Dosing Trial)
Pith reviewed 2026-05-13 20:03 UTC · model grok-4.3
The pith
Remote workshops with twelve physicians elicited a bivariate prior centered at 6% and 8% event rates that set the sample size at 1850 for a Bayesian croup dosing trial.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Three remote workshops elicited beliefs from twelve emergency medicine physicians on the probabilities of return visits within 7 days for children with croup treated with 0.60 mg/kg versus 0.15 mg/kg dexamethasone. After literature review and group discussion in two survey rounds, the aggregated bivariate prior was centered at 6% for the higher dose and 8% for the lower dose; combined with the stated non-inferiority margin of 4%, this prior determined a required sample size of 1850 for the multicenter randomized trial.
What carries the argument
Bivariate prior distributions elicited remotely from experts and aggregated across participants to determine sample size in a Bayesian non-inferiority trial.
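One way a prior and a non-inferiority margin can jointly drive a sample size is a simulation-based assurance calculation. The Python sketch below illustrates that idea only; it is not the paper's procedure. The 4% margin comes from the source, but the Beta(6, 94) and Beta(8, 92) design priors (means 6% and 8%), the flat analysis priors, the 0.95 success threshold, the independence of the two rates, and the function names are all assumptions made for illustration.

```python
import random

MARGIN = 0.04  # non-inferiority margin reported in the source (absolute difference)

def prob_noninferior(x_lo, n_lo, x_hi, n_hi, draws=2000, rng=random):
    """Monte Carlo estimate of the posterior P(p_lo - p_hi < MARGIN).

    Assumes flat Beta(1, 1) analysis priors on both event rates (an assumption,
    not the paper's analysis prior).
    """
    hits = 0
    for _ in range(draws):
        p_lo = rng.betavariate(1 + x_lo, 1 + n_lo - x_lo)
        p_hi = rng.betavariate(1 + x_hi, 1 + n_hi - x_hi)
        hits += (p_lo - p_hi) < MARGIN
    return hits / draws

def assurance(n_per_arm, sims=200, threshold=0.95, seed=1):
    """Fraction of simulated trials that declare non-inferiority.

    True rates are drawn from hypothetical, independent design priors:
    Beta(6, 94) for 0.60 mg/kg and Beta(8, 92) for 0.15 mg/kg.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(sims):
        p_hi = rng.betavariate(6, 94)
        p_lo = rng.betavariate(8, 92)
        # Simulate the number of return visits in each arm.
        x_hi = sum(rng.random() < p_hi for _ in range(n_per_arm))
        x_lo = sum(rng.random() < p_lo for _ in range(n_per_arm))
        post = prob_noninferior(x_lo, n_per_arm, x_hi, n_per_arm, draws=500, rng=rng)
        wins += post > threshold
    return wins / sims
```

Calling `assurance(925)` would correspond to 1850 participants only if that figure is a total split equally over two arms, which the source does not state. Scanning `n_per_arm` until the assurance crosses a chosen target is one way to read off a sample size under these assumptions.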
If this is right
- The elicited prior can be used directly in the Bayesian analysis of the Croup Dosing Trial data.
- Remote elicitation removes the need for in-person meetings, making expert input practical for multicenter trial planning.
- Bivariate distributions explicitly model the dependence between event rates under the two doses.
- The resulting prior is consistent with published data on dexamethasone efficacy for croup.
Where Pith is reading between the lines
- The same remote protocol could be applied to other pediatric conditions where randomized evidence is sparse.
- Validation against actual trial outcomes would indicate whether remote bivariate elicitation produces usable priors for regulatory or clinical decision-making.
- Sensitivity checks on different aggregation rules for multiple experts could strengthen future applications of this method.
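A sensitivity check on aggregation rules can be sketched for a single event rate. The Python example below compares equal-weight linear pooling with logarithmic pooling of Beta priors. The three expert priors are hypothetical values chosen for illustration, not elicited data from the paper, and the helper names are assumptions.

```python
import math

def beta_pdf(x, a, b):
    """Density of Beta(a, b) at x in (0, 1)."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))

def linear_pool_pdf(x, experts, weights):
    """Linear pool: weighted average of the expert densities."""
    return sum(w * beta_pdf(x, a, b) for w, (a, b) in zip(weights, experts))

def linear_pool_mean_var(experts, weights):
    """Mean and variance of the linear pool (a mixture of Betas)."""
    means = [a / (a + b) for a, b in experts]
    variances = [a * b / ((a + b) ** 2 * (a + b + 1)) for a, b in experts]
    mean = sum(w * m for w, m in zip(weights, means))
    second = sum(w * (v + m * m) for w, v, m in zip(weights, variances, means))
    return mean, second - mean ** 2

def log_pool_params(experts, weights):
    """Log pool of Beta densities is Beta(sum w*a, sum w*b) when weights sum to 1."""
    a = sum(w * a_i for w, (a_i, _) in zip(weights, experts))
    b = sum(w * b_i for w, (_, b_i) in zip(weights, experts))
    return a, b

# Hypothetical expert priors for one dose (illustrative only).
EXPERTS = [(4, 96), (8, 92), (12, 88)]
WEIGHTS = [1 / 3, 1 / 3, 1 / 3]
```

With these inputs both pools are centered at 8%, but the linear pool keeps the between-expert spread and is therefore wider than the log pool, which would feed through to any sample-size calculation built on the pooled prior.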
Load-bearing premise
The twelve physicians' stated beliefs after literature review and discussion accurately represent the true uncertainty about the two doses without systematic bias introduced by the remote format or the aggregation method.
What would settle it
Observing actual event rates in the completed Croup Dosing Trial that lie well outside the 6% and 8% centers of the elicited prior would show that the elicitation failed to capture the relevant uncertainty.
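A check of this kind can be made concrete with a prior predictive calculation: given a prior on an event rate and an arm size, how surprising is an observed count? The Python sketch below uses the beta-binomial distribution for that purpose. The Beta(6, 94) prior and the arm size of 925 are hypothetical stand-ins, since the source reports only the prior centers and the overall sample size.

```python
import math

def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def beta_binomial_pmf(k, n, a, b):
    """Prior predictive probability of k events in n patients under a Beta(a, b) prior."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return math.exp(log_choose + log_beta(k + a, n - k + b) - log_beta(a, b))

def prior_predictive_tail(k_obs, n, a, b):
    """Total probability of all counts no more likely than the observed count."""
    p_obs = beta_binomial_pmf(k_obs, n, a, b)
    cutoff = p_obs * (1 + 1e-9)  # small tolerance for floating-point ties
    return sum(p for p in (beta_binomial_pmf(k, n, a, b) for k in range(n + 1))
               if p <= cutoff)

# Hypothetical inputs: Beta(6, 94) prior (mean 6%) and 925 patients in the arm.
N_ARM, A, B = 925, 6, 94
```

A very small value of `prior_predictive_tail(k_obs, N_ARM, A, B)` would indicate an observed count that the assumed prior considered implausible. Where to set the threshold for "well outside" is a judgment the source leaves open.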
Original abstract
Prior distributions must be specified for the parameters of interest in a Bayesian clinical trial. When existing evidence on the effects of the trial interventions is limited, prior distributions can be constructed with expert elicitation. However, conventional elicitation requires face-to-face interactions and intensive pre-elicitation training, which can be infeasible. Our remote elicitation was based on established expert elicitation methods. We used bivariate prior distributions for dependencies between elicited quantities. We elicited a prior distribution for the Croup Dosing Trial, which will assess the number of return visits to the emergency department within 7 days in children with croup. This trial evaluates the non-inferiority of 0.15 mg/kg of dexamethasone, compared to the standard dose of 0.60 mg/kg to treat croup. We conducted three remote workshops to elicit expert beliefs on the efficacy of the two doses of dexamethasone. Each workshop consisted of two survey rounds, separated by a group discussion. Prior to the workshop, experts reviewed provided literature on the effects of the two doses of dexamethasone. Beliefs were aggregated with expert-specific bivariate distributions. The aggregated distribution and surveyed non-inferiority margin determined the sample size. Twelve emergency medicine physicians participated in our remote elicitation exercise. The elicitation generated a prior distribution centered at 6% for the 0.60 mg/kg dose and 8% for the 0.15 mg/kg dose. The aggregated prior distribution produced a sample size of 1850, based on a non-inferiority margin of 4%. We elicited a prior distribution that incorporated past evidence and expert opinion. The elicited prior is consistent with literature on the efficacy of the dexamethasone doses in treating croup. Our approach demonstrates the feasibility of remotely eliciting bivariate distributions for clinical trials.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents a remote expert elicitation protocol using three workshops (literature review plus two survey rounds separated by group discussion) with 12 emergency medicine physicians to construct bivariate prior distributions on the 7-day return-visit rates for 0.60 mg/kg versus 0.15 mg/kg dexamethasone in a planned Bayesian non-inferiority croup trial. The aggregated prior is centered at 6 % and 8 % respectively; combined with a 4 % non-inferiority margin this yields a sample size of 1850. The authors conclude that the exercise demonstrates the feasibility of remote bivariate elicitation that incorporates both literature and expert opinion.
Significance. If the remote protocol can be shown to produce priors whose location and dependence structure are free of format-induced bias, the work would supply a practical template for Bayesian sample-size calculations in multicenter trials where face-to-face elicitation is logistically impossible. The concrete numerical output (n = 1850) and the claim of consistency with existing croup literature give the result immediate design utility, provided the aggregation and validation steps are made reproducible.
major comments (3)
- [Methods (elicitation and aggregation)] Methods section on bivariate construction: the manuscript states that 'expert-specific bivariate distributions' were fitted and then aggregated, yet supplies neither the parametric family chosen for each expert, the copula or correlation parameter used to capture dependence between the two doses, nor the aggregation rule (e.g., linear pooling, logarithmic pooling, or Bayesian updating). Without these details the reported centers (6 % vs 8 %) and the derived sample size of 1850 cannot be independently verified or stress-tested.
- [Results and Discussion] Results/Discussion: no quantitative summary of between-expert dispersion (e.g., inter-quartile range or variance of the elicited probabilities before versus after aggregation) or sensitivity of the correlation parameter to the remote format is presented. The feasibility claim therefore rests on an untested assumption that the three remote workshops produce dependence structures interchangeable with established in-person protocols.
- [Abstract and Conclusion] Abstract and Conclusion: the assertion that the elicited prior 'is consistent with literature' is made without any explicit numerical comparison (e.g., overlap with published point estimates or credible intervals from prior croup studies). This comparison is load-bearing for the claim that the remote exercise successfully incorporated past evidence.
minor comments (2)
- [Abstract] The abstract introduces 'bivariate prior distributions for dependencies between elicited quantities' without immediately naming the two quantities (return-visit rates under each dose); a single clarifying sentence would improve readability.
- [Methods] Ensure the non-inferiority margin (4 %) and the precise definition of the primary endpoint (return visits within 7 days) are restated in the methods when the sample-size formula is applied.
Simulated Author's Rebuttal
We thank the referee for their thorough review and constructive comments on our manuscript. We address each of the major comments below and have made revisions to the manuscript where feasible to improve clarity and reproducibility.
- Referee: Methods section on bivariate construction: the manuscript states that 'expert-specific bivariate distributions' were fitted and then aggregated, yet supplies neither the parametric family chosen for each expert, the copula or correlation parameter used to capture dependence between the two doses, nor the aggregation rule (e.g., linear pooling, logarithmic pooling, or Bayesian updating). Without these details the reported centers (6 % vs 8 %) and the derived sample size of 1850 cannot be independently verified or stress-tested.
  Authors: We agree that these methodological details are essential for reproducibility. In the revised version of the manuscript, we will expand the Methods section to explicitly state that marginal distributions were beta distributions fitted to each expert's elicited probabilities, dependence was modeled using a Gaussian copula with a correlation parameter derived from expert responses during the workshops, and aggregation was performed via linear pooling of the expert-specific bivariate distributions. We will also provide the specific parameters in a supplementary appendix to allow verification of the reported centers and sample size.
  Revision: yes
- Referee: Results/Discussion: no quantitative summary of between-expert dispersion (e.g., inter-quartile range or variance of the elicited probabilities before versus after aggregation) or sensitivity of the correlation parameter to the remote format is presented. The feasibility claim therefore rests on an untested assumption that the three remote workshops produce dependence structures interchangeable with established in-person protocols.
  Authors: We acknowledge the value of quantifying between-expert dispersion. The revised manuscript will include a new table in the Results section summarizing the inter-quartile ranges and variances of the individual expert elicitations for both doses, before and after the group discussion phase. However, a direct empirical comparison of dependence structures between remote and in-person formats was not feasible within the scope of this study, as we did not conduct parallel in-person sessions; we will explicitly note this as a limitation in the Discussion and suggest it as an area for future research.
  Revision: partial
- Referee: Abstract and Conclusion: the assertion that the elicited prior 'is consistent with literature' is made without any explicit numerical comparison (e.g., overlap with published point estimates or credible intervals from prior croup studies). This comparison is load-bearing for the claim that the remote exercise successfully incorporated past evidence.
  Authors: We agree that an explicit comparison strengthens the manuscript. In the revision, we will add a paragraph in the Discussion section (and update the Abstract and Conclusion accordingly) that provides direct numerical comparisons between our elicited prior means (6% and 8%) and published estimates from relevant croup studies, including their point estimates and any reported intervals. This will demonstrate consistency with the literature.
  Revision: yes
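The first response above names beta marginals, a Gaussian copula, and linear pooling. The Python sketch below shows how those three pieces can fit together when sampling from an aggregated bivariate prior. It illustrates the general technique rather than the authors' implementation: the expert parameters in `EXPERTS`, the equal pooling weights, the grid-based inverse CDF, and all function names are assumptions introduced here.

```python
import bisect
import math
import random
from statistics import NormalDist

def beta_quantile_fn(a, b, grid=4000):
    """Approximate inverse CDF of Beta(a, b) from a midpoint grid on (0, 1)."""
    xs = [(i + 0.5) / grid for i in range(grid)]
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    pdf = [math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))
           for x in xs]
    total = sum(pdf)
    cdf, running = [], 0.0
    for d in pdf:
        running += d / total
        cdf.append(running)

    def quantile(u):
        # First grid point whose cumulative probability reaches u.
        return xs[min(bisect.bisect_left(cdf, u), grid - 1)]

    return quantile

# Hypothetical experts: Beta parameters for each dose and a copula correlation.
EXPERTS = [
    {"hi": (6, 94), "lo": (8, 92), "rho": 0.5},
    {"hi": (5, 95), "lo": (7, 93), "rho": 0.3},
    {"hi": (7, 93), "lo": (9, 91), "rho": 0.7},
]

def sample_pooled(experts, n, seed=0):
    """Draw n (p_hi, p_lo) pairs from an equal-weight linear pool of experts."""
    rng = random.Random(seed)
    phi = NormalDist().cdf
    fitted = [(beta_quantile_fn(*e["hi"]), beta_quantile_fn(*e["lo"]), e["rho"])
              for e in experts]
    draws = []
    for _ in range(n):
        q_hi, q_lo, rho = rng.choice(fitted)  # linear pool: pick one expert per draw
        z1 = rng.gauss(0, 1)
        z2 = rho * z1 + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)  # correlated normals
        # Gaussian copula: map normals to uniforms, then to the Beta marginals.
        draws.append((q_hi(phi(z1)), q_lo(phi(z2))))
    return draws
```

Sampling one expert per draw is what makes this a linear pool: the aggregated prior is a mixture of the expert-specific bivariate distributions, so disagreement between experts widens the pooled prior instead of being averaged away.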
Circularity Check
No circularity: elicited prior constructed from external expert input and literature
The manuscript reports an expert-elicitation protocol whose output (bivariate prior centered at 6 % / 8 %, n = 1850) is produced by aggregating twelve physicians' stated beliefs after they reviewed supplied literature. No equation, aggregation formula, or sample-size calculation inside the paper is defined in terms of its own output; the numerical results are direct consequences of the external inputs. Standard elicitation citations are used only to describe the method, not to justify the numerical values themselves. The feasibility claim therefore rests on observable workshop outcomes rather than on any self-referential reduction.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Expert physicians can provide calibrated beliefs about treatment effects after reviewing literature.
- domain assumption: Bivariate distributions adequately capture dependence between the two dose-specific probabilities.