When is p-hacking detectable?
Pith reviewed 2026-05-19 07:45 UTC · model grok-4.3
The pith
A projection test detects every form of p-hacking visible in the reported t-statistics.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that the distance between the smoothed empirical t-curve and the set of all possible honest distributions yields a sharp test for selective reporting. Any form of p-hacking that moves the observed curve away from every honest distribution will produce a positive test statistic, and the test cannot be evaded by a reporting strategy that still satisfies the observable restrictions on the t-curve.
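In symbols, the claim can be written roughly as follows. The notation is an assumption made here for illustration and is not taken from the paper: \hat f_n is the smoothed empirical t-curve, \mathcal{H} is the set of honest distributions, and d is whichever distance the authors use, which this summary does not specify.

```latex
T_n \;=\; \inf_{g \in \mathcal{H}} d\bigl(\hat f_n,\, g\bigr),
\qquad
\inf_{g \in \mathcal{H}} d(f, g) \;>\; 0
\;\iff\;
f \notin \overline{\mathcal{H}} .
```

Read this way, the population distance is positive exactly when the t-curve f lies outside the closure of the honest set, which is the sense in which a distance-based statistic responds to every curve that is inconsistent with honest reporting.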
What carries the argument
The projection test that finds the minimum distance from the smoothed empirical distribution of reported t-statistics to the set of all distributions consistent with honest reporting.
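A minimal numerical sketch of such a projection is given here under strong simplifying assumptions that are not taken from the paper: the honest set is replaced by non-negative mixtures of N(h, 1) densities with h on a finite grid, the smoothing kernel is Gaussian, and the distance is L2 on a grid. The function names `smoothed_curve` and `projection_distance` are hypothetical, and the sketch returns a distance only, with no critical values.

```python
import numpy as np
from scipy.optimize import lsq_linear
from scipy.stats import norm


def smoothed_curve(t_stats, grid, bandwidth):
    """Gaussian-kernel density estimate of the t-curve, evaluated on `grid`."""
    z = (grid[:, None] - t_stats[None, :]) / bandwidth
    return norm.pdf(z).mean(axis=1) / bandwidth


def projection_distance(t_stats, bandwidth=0.3, grid=None, means=None):
    """Approximate L2 distance from the smoothed t-curve to a stand-in honest set.

    ASSUMPTION (not from the paper): honest curves are non-negative mixtures
    of N(h, 1) densities with h on a finite grid; distance is L2 on `grid`.
    """
    t_stats = np.asarray(t_stats, dtype=float)
    grid = np.linspace(-12.0, 12.0, 481) if grid is None else grid
    means = np.linspace(-8.0, 8.0, 33) if means is None else means
    f_hat = smoothed_curve(t_stats, grid, bandwidth)
    # Smoothing an N(h, 1) density with a Gaussian kernel of width `bandwidth`
    # gives an N(h, 1 + bandwidth^2) density, so the basis uses that scale.
    scale = np.sqrt(1.0 + bandwidth**2)
    basis = norm.pdf(grid[:, None], loc=means[None, :], scale=scale)
    # Bounded least squares: closest non-negative combination of basis curves.
    fit = lsq_linear(basis, f_hat, bounds=(0.0, np.inf), max_iter=500)
    residual = basis @ fit.x - f_hat
    step = grid[1] - grid[0]
    return float(np.sqrt(np.sum(residual**2) * step)), fit.x
```

The mixture weights are not forced to sum to one, and a usable test would still need critical values (for example from a bootstrap), so the returned number is best read as a descriptive distance rather than a test decision.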
If this is right
- Histograms of t-statistics and p-values miss some detectable forms of selective reporting.
- The new test has power against every distortion that any valid test of the t-curve restrictions can detect.
- Application to the Brodeur et al. (2020) meta-dataset shows statistically significant excess distortion in the t-curves of RCTs and IVs.
- Any evasion strategy must also evade every other valid test based on the same reported statistics.
Where Pith is reading between the lines
- Meta-analysts could apply the test routinely to flag research designs whose reported statistics are inconsistent with honest selection.
- The method could be extended to incorporate additional sources of benign distortion once they are formally characterized.
- Because the test is sharp, it sets a benchmark for what any future t-curve-based detector can hope to achieve.
Load-bearing premise
The set of possible honest distributions can be fully characterized from the reported t-statistics alone without additional information on the underlying data-generating process or the precise form of any benign distortions.
What would settle it
A selective reporting rule that produces an empirical t-curve lying strictly outside the honest set yet yields a projection distance of zero.
Original abstract
We show that some forms of p-hacking cannot be detected by examining the histogram of t-statistics or their p-values. Even when p-hacking is detectable, standard tests may lack power. We propose a novel test that detects every form of selective reporting that is detectable from the distribution of reported t-statistics. Our test statistic is the distance between the smoothed empirical t-curve and the set of possible honest distributions. This projection test is sharp and can only be evaded by selective reporting that also evades all other valid tests of restrictions on the t-curve. We also show how to avoid spurious rejections caused by some benign distortions in the t-curve. Applying the test to the Brodeur et al. (2020) meta-dataset, we find that the t-curves for RCTs and IVs are more distorted than could arise by chance, (de)rounding, or the Student-t approximation.
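To see why rounding counts as a benign distortion, consider a t-statistic recovered from a rounded coefficient and a rounded standard error. The sketch below computes the range of t-values consistent with the rounded inputs. It assumes rounding to the nearest value at a common number of decimals, and the function name `t_stat_bounds` is hypothetical; the abstract does not describe the paper's own (de)rounding procedure, so this is only an illustration of the problem.

```python
def t_stat_bounds(coef, se, decimals):
    """Range of t = coef / se consistent with inputs rounded to `decimals` places."""
    half = 0.5 * 10.0 ** (-decimals)   # half-width of the rounding interval
    se_lo = max(se - half, 1e-12)      # guard: a standard error must stay positive
    se_hi = se + half
    # t is monotone in each argument when se > 0, so extremes sit at the corners.
    corners = [c / s for c in (coef - half, coef + half) for s in (se_lo, se_hi)]
    return min(corners), max(corners)


# A reported coefficient of 0.02 with standard error 0.01, both to 2 decimals.
low, high = t_stat_bounds(0.02, 0.01, 2)
print(round(low, 6), round(high, 6))
```

In this example the reported ratio is 2, yet the unrounded t-statistic could lie anywhere between roughly 1 and 5, so coarse rounding alone can move mass around the t-curve without any selective reporting.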
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that some forms of p-hacking cannot be detected by examining the histogram of t-statistics or their p-values, and that even when detectable, standard tests may lack power. It proposes a novel projection test whose statistic is the distance between the smoothed empirical t-curve and the set of possible honest distributions; this test is asserted to be sharp in the sense that it detects every form of selective reporting that is detectable from the distribution of reported t-statistics. The authors also show how to avoid spurious rejections from benign distortions such as rounding and apply the test to the Brodeur et al. (2020) meta-dataset, concluding that t-curves for RCTs and IVs exhibit more distortion than can be explained by chance, (de)rounding, or the Student-t approximation.
Significance. If the sharpness claim and the characterization of the honest set hold, the paper would represent a meaningful methodological advance in the detection of selective reporting in empirical economics. The projection approach offers a complete test for all detectable manipulations of the t-curve, going beyond existing histogram-based methods, and the application to a large existing meta-dataset demonstrates practical utility. The explicit treatment of benign distortions is a useful practical contribution.
major comments (1)
- [Theoretical section defining the honest distribution set and the projection test] The central sharpness claim (abstract and theoretical development of the projection test) rests on correctly identifying the set of all possible honest t-curves from the reported t-statistics alone. Honest distributions are parameterized by degrees of freedom (hence sample size) and the precise form of continuous or discrete distortions; these parameters are not encoded in the t-values themselves. The manuscript must provide an explicit construction, algorithm, or proof showing how the honest set is recovered or bounded without additional DGP information. If this step relies on assumptions that are not recoverable from the t-statistics, the distance statistic can misclassify honest curves, undermining both the sharpness property and the claim that the test detects every detectable form of selective reporting.
minor comments (2)
- [Abstract] The abstract refers to a 'smoothed empirical t-curve' without specifying the smoothing kernel, bandwidth selection rule, or robustness checks; these details are needed for reproducibility and should be stated explicitly.
- [Empirical application section] The empirical application would benefit from Monte Carlo simulations or power calculations under controlled selective-reporting scenarios to illustrate finite-sample behavior of the test.
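A Monte Carlo exercise of the kind requested in the second minor comment could be set up along these lines. Everything in the sketch is an assumption for illustration: the selective-reporting rule (insignificant results are re-drawn with probability `retry_prob`), the normal effect distribution, and the use of a simple caliper-style binomial test as a stand-in detector. It is not the paper's projection test, and the caliper test is only valid when the honest t-curve is decreasing around the cutoff.

```python
import numpy as np
from scipy.stats import binomtest


def simulate_t_stats(n, rng, effect_sd=1.5, retry_prob=0.0, max_retries=4):
    """Draw n reported t-statistics; insignificant results may be re-drawn."""
    h = rng.normal(0.0, effect_sd, n)   # true standardized effects (assumed normal)
    t = h + rng.normal(size=n)          # honest first attempt
    for _ in range(max_retries):
        retry = (np.abs(t) < 1.96) & (rng.random(n) < retry_prob)
        t = np.where(retry, h + rng.normal(size=n), t)
    return t


def caliper_pvalue(t, width=0.2, cutoff=1.96):
    """One-sided binomial test: are there too many |t| just above the cutoff?"""
    a = np.abs(t)
    above = int(np.sum((a >= cutoff) & (a < cutoff + width)))
    below = int(np.sum((a >= cutoff - width) & (a < cutoff)))
    if above + below == 0:
        return 1.0
    return binomtest(above, above + below, 0.5, alternative="greater").pvalue


def rejection_rate(retry_prob, n=2000, reps=50, alpha=0.05, seed=0):
    """Share of simulated datasets in which the stand-in test rejects."""
    rng = np.random.default_rng(seed)
    rejections = [
        caliper_pvalue(simulate_t_stats(n, rng, retry_prob=retry_prob)) < alpha
        for _ in range(reps)
    ]
    return float(np.mean(rejections))
```

Calling `rejection_rate(0.0)` estimates size under honest reporting and `rejection_rate(0.8)` estimates power under heavy re-drawing; swapping `caliper_pvalue` for another detector would allow the same design to compare finite-sample behavior across tests.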
Simulated Author's Rebuttal
We thank the referee for the careful reading and constructive comments. The referee correctly identifies that the sharpness of the projection test depends on a precise definition of the honest set. We address this point below and will revise the manuscript to improve clarity on the construction.
Point-by-point responses
- Referee: [Theoretical section defining the honest distribution set and the projection test] The central sharpness claim (abstract and theoretical development of the projection test) rests on correctly identifying the set of all possible honest t-curves from the reported t-statistics alone. Honest distributions are parameterized by degrees of freedom (hence sample size) and the precise form of continuous or discrete distortions; these parameters are not encoded in the t-values themselves. The manuscript must provide an explicit construction, algorithm, or proof showing how the honest set is recovered or bounded without additional DGP information. If this step relies on assumptions that are not recoverable from the t-statistics, the distance statistic can misclassify honest curves, undermining both the sharpness property and the claim that the test detects every detectable form of selective reporting.
Authors: The honest set is defined as the closure of the union, over all possible degrees of freedom and all admissible benign distortions (rounding to a stated precision, use of the t rather than normal approximation, and similar), of the distributions of reported t-statistics that can arise under honest reporting. Because the test statistic is the distance from the observed smoothed curve to this union, no specific df or distortion parameter needs to be recovered from the data; the projection simply finds the closest element in the set. The theoretical section provides a characterization of this set via the properties of the t-family and the admissible distortion operators, which is sufficient to establish sharpness: any selective reporting that produces a curve outside the set is detectable by the test, and any curve inside the set is consistent with some honest DGP. We acknowledge that an explicit computational algorithm for approximating the projection was only sketched rather than fully detailed. We will add a self-contained algorithmic description and a short proof that the set is closed under the relevant operations in the revised manuscript.
Revision: yes
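The step in this reply that removes the need to recover any parameter can be written out as follows. The notation is assumed here for illustration and is not the authors' own: \theta indexes a degrees-of-freedom value together with a benign distortion, \Theta is the set of admissible indices, and \mathcal{H}_\theta is the corresponding family of honest t-curves.

```latex
\mathcal{H} \;=\; \overline{\bigcup_{\theta \in \Theta} \mathcal{H}_\theta},
\qquad
d\bigl(\hat f_n, \mathcal{H}\bigr)
\;=\; \inf_{\theta \in \Theta} \; \inf_{g \in \mathcal{H}_\theta} d\bigl(\hat f_n, g\bigr).
```

The distance to a union is the smallest distance to any member family, and taking the closure does not change it, so \theta is minimized over rather than estimated. The referee's concern then reduces to whether \Theta and each \mathcal{H}_\theta can be specified without information beyond the reported t-statistics.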
Circularity Check
No circularity in projection test or honest-set characterization
Full rationale
The paper defines its test statistic directly as the distance between the smoothed empirical t-curve and a theoretically characterized set of all possible honest t-distributions. This set is derived from standard properties of the t-distribution, degrees of freedom, and reporting rules rather than from the empirical data itself or any fitted parameters. The sharpness claim follows mathematically from the definition of projection onto that feasible set and does not reduce to a tautology or self-referential construction. No load-bearing self-citations, ansatzes, or renamings of known results appear in the core derivation; the approach remains independent of the specific dataset to which it is applied.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: The set of possible honest t-distributions can be computed or approximated without knowledge of the original data-generating process.