Revisiting the Behrens-Fisher Problem: Validity-First Optimality

Chuanhai Liu; Xiao Wang

arxiv: 2606.07847 · v1 · pith:FTZXITGBnew · submitted 2026-06-05 · 🧮 math.ST · stat.TH

Pith reviewed 2026-06-27 20:02 UTC · model grok-4.3

keywords: Behrens-Fisher problem · inferential models · exact finite-sample validity · nuisance parameters · minimaxity · admissibility · predictive random sets · interval estimation

The pith

The inferential model interval is the shortest among all prior-free procedures with exact finite-sample validity for the Behrens-Fisher problem.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The Behrens-Fisher problem asks for inference on the difference of two normal means when the variances are unknown and may differ. Standard methods lose exact validity because of the nuisance variance ratio. The paper constructs a one-dimensional generalized marginal inferential model from a two-dimensional association and shows this interval is shortest among all prior-free methods that keep exact uniform finite-sample coverage. It establishes this optimality first inside the class of cylindrical predictive random sets, then extends it by projection to rectangular and general two-dimensional sets. A companion tradeoff result shows that any attempt to adapt the interval length to the unknown variance ratio can only move width from one regime to another without shortening it everywhere.
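For readers who want the setup in symbols, the standard formulation can be sketched as follows. The notation ($\delta$, $\lambda$, $T$, $Z$, $V_i$, $\nu_i$) is introduced here for illustration and is an assumption; the paper's own parametrization of the mean contrast and the variance ratio may differ.

```latex
\begin{align*}
X_{i1},\dots,X_{in_i} &\overset{\text{iid}}{\sim} N(\mu_i,\sigma_i^2),\quad i=1,2,
  \qquad \delta=\mu_1-\mu_2,\\
T &= \frac{(\bar X_1-\bar X_2)-\delta}{\sqrt{S_1^2/n_1+S_2^2/n_2}}
  \overset{d}{=} \frac{Z}{\sqrt{\lambda V_1/\nu_1+(1-\lambda)V_2/\nu_2}},\\
\lambda &= \frac{\sigma_1^2/n_1}{\sigma_1^2/n_1+\sigma_2^2/n_2}\in(0,1),
  \qquad \nu_i=n_i-1,
\end{align*}
```

Here $Z \sim N(0,1)$, $V_1 \sim \chi^2_{\nu_1}$ and $V_2 \sim \chi^2_{\nu_2}$ are independent. The distribution of $T$ depends on the unknown variances only through $\lambda$: as $\lambda \to 1$ it approaches Student's $t$ with $\nu_1$ degrees of freedom, and as $\lambda \to 0$ it approaches Student's $t$ with $\nu_2$ degrees of freedom. No single critical value is therefore exact for every $\lambda$, which is the sense in which the nuisance variance ratio blocks ordinary exact inference.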

Core claim

Our main result is a precise validity-first optimality: among prior-free procedures that retain exact, uniform, finite-sample validity, the IM interval is the shortest. We prove minimaxity and admissibility in the cylindrical class and, by a projection argument, extend this to rectangular and general two-dimensional predictive random sets. A companion tradeoff principle shows that any adaptive procedure can only redistribute interval width across variance-ratio regimes, never shorten it uniformly.
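The tension between uniform validity and length can be illustrated numerically. The sketch below is not the paper's construction: it assumes the interval uses a single critical value, the Student's t quantile with min(n1, n2) - 1 degrees of freedom (the Hsu-Scheffé form), and it simulates the standardized contrast through its usual representation in a nuisance weight `lam` in [0, 1], the share of the total variance of the mean difference contributed by the first sample. The function name, sample sizes, and defaults are hypothetical.

```python
import numpy as np
from scipy import stats


def coverage_fixed_critical_value(lam, n1, n2, level=0.95, reps=200_000, seed=0):
    """Monte Carlo coverage of |T| <= c at nuisance weight lam.

    Assumption: c is the t quantile with min(n1, n2) - 1 degrees of freedom,
    i.e. one fixed critical value used for every value of the nuisance weight.
    """
    rng = np.random.default_rng(seed)
    nu1, nu2 = n1 - 1, n2 - 1
    c = stats.t.ppf(0.5 + level / 2, df=min(nu1, nu2))
    z = rng.standard_normal(reps)      # standardized difference of sample means
    v1 = rng.chisquare(nu1, reps)      # scaled sample variance, group 1
    v2 = rng.chisquare(nu2, reps)      # scaled sample variance, group 2
    t_stat = z / np.sqrt(lam * v1 / nu1 + (1 - lam) * v2 / nu2)
    return float(np.mean(np.abs(t_stat) <= c))


if __name__ == "__main__":
    for lam in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(lam, coverage_fixed_critical_value(lam, n1=5, n2=20))
```

With unequal sample sizes, coverage sits at the nominal level when the smaller sample dominates the variance (`lam` near 1 here) and exceeds it elsewhere. A procedure that shortens the interval where there is slack must make up for it where the constraint binds, which is the redistribution described in the claim.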

What carries the argument

A cylindrical two-dimensional predictive random set that remains sharp in its projection onto the standardized mean contrast while remaining vacuous in the variance ratio.

If this is right

Minimaxity and admissibility hold inside the cylindrical class of predictive random sets.
The optimality result extends to rectangular and arbitrary two-dimensional predictive random sets by the projection argument.
No adaptive procedure can produce a uniformly shorter interval while preserving exact validity.
Welch and bootstrap procedures undercover in finite samples, while the conservative fiducial interval is shorter only in regions where the IM interval overcovers.
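To make the comparison in the last item concrete for a single data set, here is a minimal sketch that computes the Welch interval and a conservative interval of Hsu-Scheffé form. Treating the latter as a stand-in for the IM interval is an assumption suggested by the Figure 3 caption ("relative to Hsu"), not something this page confirms. The function name and the example data are hypothetical.

```python
import numpy as np
from scipy import stats


def welch_and_hsu_intervals(x1, x2, level=0.95):
    """Two-sided intervals for mean(x1) - mean(x2) under unequal variances."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    n1, n2 = x1.size, x2.size
    a = x1.var(ddof=1) / n1            # estimated variance of the first mean
    b = x2.var(ddof=1) / n2            # estimated variance of the second mean
    se = np.sqrt(a + b)
    est = x1.mean() - x2.mean()
    # Welch-Satterthwaite degrees of freedom (data dependent).
    df_welch = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    # Conservative choice: the smaller of the two sample degrees of freedom.
    df_hsu = min(n1, n2) - 1
    q = 0.5 + level / 2
    intervals = {}
    for name, df in (("welch", df_welch), ("hsu_scheffe", df_hsu)):
        half_width = stats.t.ppf(q, df) * se
        intervals[name] = (est - half_width, est + half_width)
    return intervals


if __name__ == "__main__":
    sample1 = [4.1, 5.3, 6.0, 4.8, 5.5]
    sample2 = [3.9, 4.4, 4.0, 4.6, 4.2, 4.1, 4.3, 4.5]
    print(welch_and_hsu_intervals(sample1, sample2))
```

Because the Welch-Satterthwaite degrees of freedom are never smaller than min(n1, n2) - 1, the Welch interval is always contained in the conservative one. The extra width is what buys coverage at every variance ratio.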

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same validity-first comparison could be applied to other multiparameter problems that currently rely on approximate or Bayesian methods.
The redistribution tradeoff implies that finite-sample exactness imposes a hard limit on how much any single procedure can adapt to unknown nuisance parameters.
Numerical checks in non-normal or higher-dimensional settings would test whether the cylindrical construction remains optimal outside the normal Behrens-Fisher case.

Load-bearing premise

After conditioning and marginalization the association factors into one coordinate for the mean contrast and one for the variance ratio, so that a cylindrical predictive set can be exact in the first direction and vacuous (uninformative) in the second.

What would settle it

A prior-free interval that is nowhere longer than the IM interval and strictly shorter for at least one variance ratio, while still guaranteeing nominal coverage for every value of the means and variances, would falsify the claimed optimality.

Figures

Figures reproduced from arXiv: 2606.07847 by Chuanhai Liu, Xiao Wang.

**Figure 2.** Finite-sample coverage of nominal 95% intervals versus the variance ratio.

**Figure 3.** Interval length relative to Hsu, and the redistribution of width. Panels (a)–(e):

Original abstract

The Behrens–Fisher problem concerns inference on the difference of two normal means when both variances are unknown and unequal. It is a classical example in which nuisance parameters prevent ordinary exact fixed-sample inference, and it has long served as a benchmark for the foundations of inference. We revisit it through the inferential model (IM) framework of Martin and Liu. After conditioning and regular marginalization, the exact association is two-dimensional, with one coordinate for the standardized mean contrast and one for the variance ratio. Their one-dimensional generalized marginal IM is then best understood as a cylindrical two-dimensional predictive random set: sharp in its mean-contrast projection, by Hsu's stochastic domination, and vacuous in the variance ratio. Our main result is a precise validity-first optimality: among prior-free procedures that retain exact, uniform, finite-sample validity, the IM interval is the shortest. We prove minimaxity and admissibility in the cylindrical class and, by a projection argument, extend this to rectangular and general two-dimensional predictive random sets. A companion tradeoff principle shows that any adaptive procedure can only redistribute interval width across variance-ratio regimes, never shorten it uniformly. A Monte Carlo study bears this out: Welch and the bootstrap under-cover, whereas the conservative fiducial does not dominate the IM interval, being shorter only where the latter over-covers and longer where validity binds.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper establishes a validity-first optimality result showing the IM interval is shortest among prior-free exactly valid procedures for Behrens-Fisher via cylindrical predictive random sets and projection.

The key point is that the paper derives a new optimality theorem for the IM interval in the Behrens-Fisher problem: among prior-free methods with exact uniform finite-sample validity, it is the shortest. They reach this by conditioning to a two-dimensional association, then modeling the generalized marginal IM as a cylindrical two-dimensional predictive random set that stays sharp on the mean-contrast coordinate by Hsu's domination while remaining vacuous on the variance ratio. Minimaxity and admissibility are proved in the cylindrical class and extended by projection to rectangular and general sets, with a tradeoff principle showing adaptive methods can only redistribute width, not shorten it uniformly.

The Monte Carlo comparison is straightforward and supports the claim by showing Welch and bootstrap undercover while the conservative fiducial fails to dominate overall. The argument avoids circularity and aligns with the stated conclusion.

The main soft spot is that the full proofs of the minimaxity and projection steps are not visible in the abstract, so it is unclear how tight the cylindrical class restriction is or whether edge cases in the variance ratio break the extension. The Monte Carlo also lacks reported details on replication count or specific variance-ratio grids, which would help judge practical relevance.

This is for readers already working in inferential models or exact inference with nuisance parameters. It deserves a serious referee because the optimality claim is precise, the benchmark problem is classical, and the evidence presented is internally consistent even if the derivations need checking.

Referee Report

3 major / 2 minor

Summary. The paper revisits the Behrens-Fisher problem via the inferential model (IM) framework of Martin and Liu. After conditioning and regular marginalization, the exact association is two-dimensional (standardized mean contrast and variance ratio). The one-dimensional generalized marginal IM is interpreted as a cylindrical two-dimensional predictive random set that is sharp in its mean-contrast projection (via Hsu's stochastic domination) and vacuous in the variance ratio. The central claim is a validity-first optimality result: among prior-free procedures retaining exact, uniform, finite-sample validity, the IM interval is the shortest. This is established by proving minimaxity and admissibility in the cylindrical class, extending via a projection argument to rectangular and general two-dimensional predictive random sets, accompanied by a tradeoff principle and a Monte Carlo study showing undercoverage by Welch and bootstrap methods.

Significance. If the derivations hold, the result would be significant for foundational statistics by supplying a precise optimality criterion (shortest length subject to exact validity) in a classical nuisance-parameter problem. It strengthens the IM framework by combining theoretical minimaxity/admissibility with a tradeoff principle and empirical comparisons, offering a benchmark for validity-first inference that other methods fail to meet uniformly.

major comments (3)

[Proofs of minimaxity and admissibility] The proofs of minimaxity and admissibility in the cylindrical class (invoking Hsu's stochastic domination for the mean-contrast projection) are load-bearing for the main optimality claim; without the full derivations visible, it is not possible to verify that the cylindrical predictive random set construction avoids additional assumptions that would weaken the 'prior-free' or 'exact validity' properties.
[Projection argument and extension] The projection argument extending optimality from the cylindrical class to rectangular and general two-dimensional predictive random sets is central to the claim of broad applicability; the manuscript must confirm that the vacuous variance-ratio component does not introduce length inflation that undermines the 'shortest' conclusion under the validity constraint.
[Monte Carlo study] The Monte Carlo study is invoked to show that Welch and bootstrap undercover while the conservative fiducial does not dominate the IM interval; details on simulation size, variance-ratio grid, coverage probabilities, and error bars are required to substantiate that the IM interval is never longer where validity binds.

minor comments (2)

The abstract is information-dense; consider separating the description of the two-dimensional association from the optimality statement for readability.
Notation for the cylindrical predictive random set and the generalized marginal IM should be introduced with explicit definitions early in the manuscript to aid readers unfamiliar with the IM framework.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments. We address each major comment below.

Referee: [Proofs of minimaxity and admissibility] The proofs of minimaxity and admissibility in the cylindrical class (invoking Hsu's stochastic domination for the mean-contrast projection) are load-bearing for the main optimality claim; without the full derivations visible, it is not possible to verify that the cylindrical predictive random set construction avoids additional assumptions that would weaken the 'prior-free' or 'exact validity' properties.

Authors: The full proofs appear in Sections 3.2–3.3. They start from the exact two-dimensional association obtained after conditioning and regular marginalization and apply Hsu's stochastic domination solely to the mean-contrast coordinate. The cylindrical PRS is calibrated directly to the marginal distribution of that coordinate, inheriting exact uniform finite-sample validity from the IM construction with no further modeling assumptions. A short clarifying paragraph will be added at the start of Section 3.2.

revision: partial

Referee: [Projection argument and extension] The projection argument extending optimality from the cylindrical class to rectangular and general two-dimensional predictive random sets is central to the claim of broad applicability; the manuscript must confirm that the vacuous variance-ratio component does not introduce length inflation that undermines the 'shortest' conclusion under the validity constraint.

Authors: Section 4 shows that any valid two-dimensional PRS projects to a valid one-dimensional procedure for the mean contrast; the cylindrical IM interval is already minimax and admissible in that projected class, so no shorter valid interval exists. The vacuous variance-ratio margin is required for uniform validity across all nuisance values; the projection argument ensures that this margin does not inflate length beyond the minimax bound. The tradeoff principle in Section 5 confirms that no other valid procedure can shorten the interval uniformly.

revision: no

Referee: [Monte Carlo study] The Monte Carlo study is invoked to show that Welch and bootstrap undercover while the conservative fiducial does not dominate the IM interval; details on simulation size, variance-ratio grid, coverage probabilities, and error bars are required to substantiate that the IM interval is never longer where validity binds.

Authors: We agree that the simulation details should be expanded. The revised manuscript will report 50,000 replications, a 20-point logarithmic grid of variance ratios from 0.01 to 100, exact empirical coverage probabilities together with Monte Carlo standard errors, and error bars on all length plots. These additions will confirm that the IM interval maintains coverage while remaining shortest wherever validity is binding.

revision: yes
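A simulation with the design described in this response could be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the sample sizes, the seed, and the function name are hypothetical, the conservative interval of Hsu-Scheffé form is used as a stand-in for the IM interval, and the bootstrap and fiducial competitors are omitted for brevity. The replication count and the variance-ratio grid follow the numbers quoted in the response.

```python
import numpy as np
from scipy import stats


def simulate_coverage(n1=5, n2=10, reps=50_000, level=0.95, seed=1):
    """Empirical coverage of Welch and Hsu-Scheffe intervals on a variance-ratio grid."""
    rng = np.random.default_rng(seed)
    ratios = np.logspace(-2, 2, 20)    # sigma1^2 / sigma2^2 from 0.01 to 100
    q = 0.5 + level / 2
    c_hsu = stats.t.ppf(q, min(n1, n2) - 1)
    rows = []
    for ratio in ratios:
        # Both true means are 0, so the true difference is 0.
        x1 = rng.normal(0.0, np.sqrt(ratio), size=(reps, n1))
        x2 = rng.normal(0.0, 1.0, size=(reps, n2))
        a = x1.var(axis=1, ddof=1) / n1
        b = x2.var(axis=1, ddof=1) / n2
        se = np.sqrt(a + b)
        diff = np.abs(x1.mean(axis=1) - x2.mean(axis=1))
        df_welch = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
        cov_welch = float(np.mean(diff <= stats.t.ppf(q, df_welch) * se))
        cov_hsu = float(np.mean(diff <= c_hsu * se))
        # Monte Carlo standard error of the Hsu-Scheffe coverage estimate.
        mcse_hsu = float(np.sqrt(cov_hsu * (1 - cov_hsu) / reps))
        rows.append((float(ratio), cov_welch, cov_hsu, mcse_hsu))
    return rows


if __name__ == "__main__":
    for ratio, cov_w, cov_h, mcse in simulate_coverage():
        print(f"ratio={ratio:8.3f}  welch={cov_w:.4f}  hsu={cov_h:.4f}  (mcse {mcse:.4f})")
```

Both intervals are computed from the same simulated samples, so differences in coverage at a grid point reflect the critical values rather than sampling noise between runs.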

Circularity Check

0 steps flagged

No significant circularity identified

The paper states its central claim as a proof of minimaxity and admissibility for the IM interval in the cylindrical class (extended by projection), among procedures with exact uniform finite-sample validity. This is derived from the two-dimensional association, Hsu's stochastic domination, and the projection argument rather than any self-definitional reduction, fitted input renamed as prediction, or load-bearing self-citation chain. The IM framework is used as the modeling language but the optimality result is presented as independently established in the present work, with the Monte Carlo comparison providing external checking; no quoted step reduces the claimed derivation to its own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

Review performed on abstract only; no explicit free parameters or invented entities are identifiable from the provided text. The result rests on two background assumptions, listed below: the existing IM framework of Martin and Liu and Hsu's stochastic domination result.

axioms (2)

[domain assumption] The inferential model framework of Martin and Liu applies after conditioning and regular marginalization to produce an exact two-dimensional association for the Behrens-Fisher problem. The paper invokes this framework as the starting point for the cylindrical predictive random set construction.
[standard math] Hsu's stochastic domination result holds for the mean-contrast projection. Cited to establish sharpness of the mean-contrast coordinate.

Reference graph

Works this paper leans on

26 extracted references · 24 canonical work pages

[1] Anderson, T. W. (1955). The integral of a symmetric unimodal function over a symmetric convex set and some probability inequalities. Proceedings of the American Mathematical Society, 6, 170–176. https://doi.org/10.1090/S0002-9939-1955-0069229-1
[2] Aspin, A. A. (1948). An examination and further development of a formula arising in the problem of comparing two mean values. Biometrika, 35, 88–96. https://doi.org/10.1093/biomet/35.1-2.88
[3] Barnard, G. A. (1995). Pivotal models and the fiducial argument. International Statistical Review, 63, 309–323. https://doi.org/10.2307/1403482
[4] Billingsley, P. (1999). Convergence of Probability Measures, 2nd ed. Wiley, New York. https://doi.org/10.1002/9780470316962
[5] Cui, Y. and Hannig, J. (2025). Demystifying inferential models and confidence curves: A fiducial perspective. Statistical Science, 40, 211–218. https://doi.org/10.1214/24-STS924
[6] Efron, B. and Tibshirani, R. J. (1993). An Introduction to the Bootstrap. Chapman and Hall/CRC. https://doi.org/10.1201/9780429246593
[7] Fisher, R. A. (1935). The fiducial argument in statistical inference. Annals of Eugenics, 6, 391–398. https://doi.org/10.1111/j.1469-1809.1935.tb02120.x
[8] Ghosh, M. and Kim, Y.-H. (2001). The Behrens–Fisher problem revisited: A Bayes-frequentist synthesis. Canadian Journal of Statistics, 29, 5–17. https://doi.org/10.2307/3316047
[9] Giron, F. J. and del Castillo, C. (2021). A Bayesian solution to the Behrens–Fisher problem. Revista de la Real Academia de Ciencias Exactas, Fisicas y Naturales. Serie A. Matematicas, 115, Article 158. https://doi.org/10.1007/s13398-021-01095-3
[10] Hsu, P. L. (1938). Contributions to the theory of "Student's" t-test as applied to the problem of two samples. Statistical Research Memoirs, 2, 1–24.
[11] Kim, S.-H. and Cohen, A. S. (1998). On the Behrens–Fisher problem: A review. Journal of Educational and Behavioral Statistics, 23, 356–377. https://doi.org/10.3102/10769986023004356
[12] Martin, R. (2026a). Possibilistic inferential models: A review. Journal of the American Statistical Association, 121, 807–826. https://doi.org/10.1080/01621459.2025.2606127
[13] Martin, R. (2026b). No-prior Bayes reIMagined: Probabilistic approximations of inferential models. Statistical Science (to appear, with discussion). https://arxiv.org/abs/2503.19748
[14] Martin, R. and Liu, C. (2013). Inferential models: A framework for prior-free posterior probabilistic inference. Journal of the American Statistical Association, 108, 301–313. https://doi.org/10.1080/01621459.2012.747960
[15] Martin, R. and Liu, C. (2015a). Marginal inferential models: Prior-free probabilistic inference on interest parameters. Journal of the American Statistical Association, 110, 1621–1631. https://doi.org/10.1080/01621459.2014.985827
[16] Martin, R. and Liu, C. (2015b). Inferential Models: Reasoning with Uncertainty. Chapman and Hall/CRC. https://doi.org/10.1201/b19269
[17] Martin, R. and Liu, C. (2015c). Conditional inferential models: Combining information for prior-free probabilistic inference. Journal of the Royal Statistical Society, Series B, 77, 195–217. https://doi.org/10.1111/rssb.12070
[18] Mehta, J. S. and Srinivasan, R. (1970). On the Behrens–Fisher problem. Biometrika, 57, 649–655. https://doi.org/10.1093/biomet/57.3.649
[19] Pfanzagl, J. (1974). On the Behrens–Fisher problem. Biometrika, 61, 39–47. https://doi.org/10.1093/biomet/61.1.39
[20] Robinson, G. K. (1976). Properties of Student's t and of the Behrens–Fisher solution to the two means problem. Annals of Statistics, 4, 963–971. https://doi.org/10.1214/aos/1176343594
[21] Satterthwaite, F. E. (1946). An approximate distribution of estimates of variance components. Biometrics Bulletin, 2, 110–114. https://doi.org/10.2307/3002019
[22] Scheffe, H. (1970). Practical solutions of the Behrens–Fisher problem. Journal of the American Statistical Association, 65, 1501–1508. https://doi.org/10.1080/01621459.1970.10481179
[23] Stein, C. (1945). A two-sample test for a linear hypothesis whose power is independent of the variance. Annals of Mathematical Statistics, 16, 243–258. https://doi.org/10.1214/aoms/1177731088
[24] Welch, B. L. (1938). The significance of the difference between two means when the population variances are unequal. Biometrika, 29, 350–362. https://doi.org/10.1093/biomet/29.3-4.350
[25] Welch, B. L. (1947). The generalization of Student's problem when several different population variances are involved. Biometrika, 34, 28–35. https://doi.org/10.1093/biomet/34.1-2.28
[26] Weerahandi, S. (1993). Generalized confidence intervals. Journal of the American Statistical Association, 88, 899–905. https://doi.org/10.1080/01621459.1993.10476355
