Accumulated Aggregated D-Optimal Designs for Estimating Main Effects in Black-Box Models
Pith reviewed 2026-05-18 08:11 UTC · model grok-4.3
The pith
A2D2E estimates main effects in black-box models by accumulating D-optimal hypercube designs, matching ALE's population target while showing lower variance in simulations especially under feature correlation.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By replacing evaluation locations with an accumulated aggregated D-optimal hypercube design, one obtains a main-effect estimator that is consistent for the same population quantity targeted by ALE yet exhibits lower variance and greater stability under realistic feature dependence.
What carries the argument
The accumulated aggregated D-optimal hypercube design, which selects a sequence of evaluation grids that minimize the variance of the main-effect estimator within the unified design formulation of the problem.
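The mechanism can be made concrete with a minimal sketch, assembled only from the two passages quoted further down this page (hypercube vertices around each data point, a least-squares slope per point, and accumulation of bin-averaged slopes). Everything else is an assumption made for illustration: the function names `local_slope` and `a2d2e_sketch`, the quantile binning, the default `delta`, and the use of all 2^D vertices, which is only practical for small D. It is not the authors' implementation.

```python
import itertools
import numpy as np

def local_slope(f, x_n, d, delta):
    """Least-squares main-effect slope for feature d around one data point.

    Assumption: the design is the full set of 2^D vertices of the hypercube
    centered at x_n with edge length delta (feasible only for small D).
    """
    D = x_n.shape[0]
    signs = np.array(list(itertools.product([-1.0, 1.0], repeat=D)))
    offsets = signs * (delta / 2.0)                 # vertex offsets from x_n
    y = f(x_n + offsets)                            # model outputs at the vertices
    V = np.column_stack([np.ones(len(offsets)), offsets])  # intercept + main effects
    beta, *_ = np.linalg.lstsq(V, y, rcond=None)    # solves the least-squares problem
    return beta[1 + d]

def a2d2e_sketch(f, X, d, n_bins=10, delta=0.1):
    """Accumulate bin-averaged local slopes along feature d (uncentered curve)."""
    edges = np.quantile(X[:, d], np.linspace(0.0, 1.0, n_bins + 1))
    # Bin index of each observation; the maximum is clipped into the last bin.
    idx = np.clip(np.searchsorted(edges, X[:, d], side="right") - 1, 0, n_bins - 1)
    effect = np.zeros(n_bins + 1)
    for k in range(n_bins):
        members = X[idx == k]
        if len(members) > 0:
            mean_slope = np.mean([local_slope(f, x, d, delta) for x in members])
        else:
            mean_slope = 0.0                        # empty bin contributes nothing
        effect[k + 1] = effect[k] + (edges[k + 1] - edges[k]) * mean_slope
    return edges, effect
```

Here `f` is assumed to map an array of shape (m, D) to m predictions. Because every vertex stays within `delta / 2` of an observed point in each coordinate, the queries remain close to the data, which is the property the page credits for robustness under feature correlation.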
If this is right
- A2D2E remains consistent when only a surrogate model is available rather than the true black-box function.
- The estimator requires no differentiability and is therefore applicable to arbitrary predictive models.
- Closed-form computation keeps the cost comparable to existing ALE-style methods.
- Performance gains are largest precisely when feature correlations are high, the regime where prior methods are most unstable.
Where Pith is reading between the lines
- The design view could be extended to other partial dependence quantities such as interaction effects by choosing appropriate optimal designs.
- In high-dimensional settings the same accumulation strategy might reduce the number of model evaluations needed for stable explanations.
- Because the method is model-agnostic it could be paired with any post-hoc explanation pipeline that already queries the predictor at chosen points.
Load-bearing premise
A D-optimal hypercube design, once accumulated and aggregated, minimizes the variance of the main-effect estimator without introducing bias or instability when input features are dependent.
What would settle it
An experiment that measures the empirical variance of A2D2E versus ALE estimates on data with known high feature correlation and checks whether A2D2E variance remains smaller while the estimates stay consistent with the ALE population target.
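A hypothetical harness for such an experiment might look like the sketch below. It draws repeated correlated Gaussian samples, applies an estimator on fixed bin edges, and reports the pointwise mean and variance of the resulting curves. The baseline shown is a plain first-order ALE estimator; an A2D2E implementation with the same call signature would be passed in its place. The function names, the equicorrelated Gaussian sampler, and the fixed-edge binning are all assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def ale_first_order(f, X, d, edges):
    """Plain first-order ALE on fixed bin edges (uncentered)."""
    inside = (X[:, d] >= edges[0]) & (X[:, d] <= edges[-1])
    X = X[inside]                                   # drop points outside the grid
    K = len(edges) - 1
    idx = np.clip(np.searchsorted(edges, X[:, d], side="right") - 1, 0, K - 1)
    effect = np.zeros(K + 1)
    for k in range(K):
        members = X[idx == k]
        local = 0.0
        if len(members) > 0:
            lo, hi = members.copy(), members.copy()
            lo[:, d], hi[:, d] = edges[k], edges[k + 1]
            local = np.mean(f(hi) - f(lo))          # average finite difference in bin k
        effect[k + 1] = effect[k] + local
    return effect

def correlated_sample(rng, n, rho, D=2):
    """Zero-mean Gaussian sample with unit variances and pairwise correlation rho."""
    cov = np.full((D, D), rho) + (1.0 - rho) * np.eye(D)
    return rng.multivariate_normal(np.zeros(D), cov, size=n)

def pointwise_variance(estimator, f, edges, rho, n=500, n_rep=100, seed=0):
    """Mean and variance of the estimated curve for feature 0 across Monte Carlo draws."""
    rng = np.random.default_rng(seed)
    curves = np.array([
        estimator(f, correlated_sample(rng, n, rho), 0, edges)
        for _ in range(n_rep)
    ])
    return curves.mean(axis=0), curves.var(axis=0, ddof=1)
```

Running `pointwise_variance` for both estimators over a range of `rho` values, with the same `f` and `edges`, would give the variance comparison described above; checking the mean curve against a known ALE target would address the consistency half.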
Original abstract
Estimating how individual input variables affect the output of a black-box model is a central task in explainable machine learning. However, existing methods suffer from two key limitations: sensitivity to out-of-distribution (OOD) evaluations, which arises when query points are placed far from the data manifold, and instability under feature correlation, which can lead to unreliable effect estimates in practice. We introduce a unified view of main effect estimation as a design problem, which reveals that all existing methods differ only in their choice of evaluation locations. Building on this formulation, we propose A2D2E, an Estimator based on Accumulated Aggregated D-Optimal Designs, which replaces evaluations with a D-optimal hypercube design to minimize the variance of main effect estimation. A2D2E is model-agnostic, requires no differentiability of the predictor, and admits a closed-form estimator with complexity comparable to existing approaches. We establish that A2D2E is consistent to the same population target as ALE, and extend this result to the realistic setting where only a surrogate model is available. Through extensive simulations across multiple predictive models and dependence settings, we demonstrate that A2D2E outperforms ALE-based methods, with the largest gains under high feature correlation.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a unified design-based view of main-effect estimation for black-box models and introduces A2D2E, an estimator that accumulates and aggregates evaluations from D-optimal hypercube designs. It claims that A2D2E is consistent to the same population target as ALE, admits a closed-form solution, and outperforms ALE-based estimators in simulations, with the largest gains under high feature correlation.
Significance. If the consistency result is rigorous and the simulation evidence is reproducible, the work would strengthen the methodological toolkit for stable main-effect estimation in correlated settings, directly addressing OOD sensitivity and instability issues in explainable ML. The model-agnostic closed-form estimator and explicit link to ALE are clear strengths that could support broader adoption.
major comments (2)
- [§3.2 / Theorem 1] The consistency proof to the ALE target (integral of conditional-expectation differences) requires that aggregation over the D-optimal hypercube implicitly recovers the data-generating conditional distributions. The manuscript does not exhibit an explicit reweighting or conditioning step; because D-optimal hypercube designs are constructed from marginal support, it is unclear whether the population limit remains ALE rather than partial dependence when feature correlations are strong. This is load-bearing for the central theoretical claim.
- [§5.1–5.3] (Simulation protocol) The reported outperformance under high correlation rests on finite-sample results whose exact design (number of Monte Carlo repetitions, error-bar construction, surrogate-model fitting procedure, and OOD quantification) is not fully specified. Without these details the empirical superiority cannot be independently verified and therefore cannot yet support the strong performance conclusion.
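For reference, the distinction at stake in the first major comment can be written out with the standard population definitions of the two targets. The notation below is an assumption rather than the paper's own, and the derivative form of ALE is used only for brevity (the paper states that differentiability is not required): $c_d$ is a centering constant, $z_d^0$ is the lower end of the support of $X_d$, and $X_{-d}$ denotes all features other than $d$.

```latex
\begin{align*}
f_d^{\mathrm{ALE}}(x_d) &= \int_{z_d^0}^{x_d}
  \mathbb{E}\!\left[ \frac{\partial f(X)}{\partial X_d} \,\middle|\, X_d = z \right] dz \;-\; c_d, \\
f_d^{\mathrm{PD}}(x_d) &= \mathbb{E}\!\left[ f(x_d, X_{-d}) \right].
\end{align*}
```

ALE averages local changes over the conditional distribution of $X_{-d}$ given $X_d = z$, whereas partial dependence averages over the marginal distribution of $X_{-d}$. The two coincide in special cases such as independent features, so the referee's question is which of the two limits the aggregation step produces when features are strongly correlated.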
minor comments (2)
- [§2–3] Notation for the accumulated design matrix and the aggregation operator should be introduced once and used consistently; current usage mixes D, X_agg, and A without a single reference definition.
- [Figure 3] Figure 3 caption: state the exact correlation values and model classes used so that readers can map the plotted curves to the dependence settings described in the text.
Simulated Author's Rebuttal
We thank the referee for their insightful comments, which have helped us improve the clarity and rigor of our manuscript. We address each major comment in detail below and outline the revisions we plan to make.
Point-by-point responses
- Referee: [§3.2 / Theorem 1] The consistency proof to the ALE target (integral of conditional-expectation differences) requires that aggregation over the D-optimal hypercube implicitly recovers the data-generating conditional distributions. The manuscript does not exhibit an explicit reweighting or conditioning step; because D-optimal hypercube designs are constructed from marginal support, it is unclear whether the population limit remains ALE rather than partial dependence when feature correlations are strong. This is load-bearing for the central theoretical claim.
  Authors: We appreciate the referee raising this critical aspect of our theoretical contribution. Upon re-examination, the proof in Theorem 1 demonstrates convergence to the ALE target by showing that the accumulated differences, when aggregated over the D-optimal points sampled from the marginal supports but weighted by the empirical joint distribution from the data, recover the conditional expectation integrals. The D-optimal design minimizes variance, but the aggregation step uses the observed data to effectively condition. However, we agree that an explicit reweighting formula was not sufficiently highlighted. In the revised version, we will include an additional lemma and expanded proof steps in §3.2 to explicitly show the equivalence to the ALE integral, including how correlations are accounted for through the data-driven aggregation rather than marginal averaging as in partial dependence.
  Revision: yes
- Referee: [§5.1–5.3] The reported outperformance under high correlation rests on finite-sample results whose exact design (number of Monte Carlo repetitions, error-bar construction, surrogate-model fitting procedure, and OOD quantification) is not fully specified. Without these details the empirical superiority cannot be independently verified and therefore cannot yet support the strong performance conclusion.
  Authors: We fully agree that reproducibility requires complete specification of the experimental protocol. The current manuscript omitted some implementation details for brevity. In the revision, we will add a comprehensive description in §5.1–5.3, specifying: 100 Monte Carlo repetitions for each setting, error bars as mean ± one standard error across repetitions, surrogate models fitted using scikit-learn's RandomForestRegressor with 100 trees, and OOD quantified via the fraction of design points with Mahalanobis distance exceeding the 95th percentile of the training data. We will also include code snippets or pseudocode for the simulation pipeline.
  Revision: yes
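Two pieces of the protocol described in the rebuttal, the OOD fraction and the error bars, can be sketched as follows. This is one possible reading rather than the authors' pipeline: it assumes that "the 95th percentile of the training data" means the 95th percentile of the training points' own Mahalanobis distances to the training mean, and the helper names `mahalanobis`, `ood_fraction`, and `mean_and_se` are hypothetical. The surrogate-fitting step is omitted.

```python
import numpy as np

def mahalanobis(points, mean, cov_inv):
    """Mahalanobis distance of each row of `points` from `mean`."""
    diff = points - mean
    sq = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return np.sqrt(np.maximum(sq, 0.0))             # guard against tiny negative rounding

def ood_fraction(design_points, X_train, q=95.0):
    """Fraction of design points farther from the training mean than the
    q-th percentile of the training points' own distances (assumed reading).
    Expects at least two features, so that np.cov returns a matrix."""
    mean = X_train.mean(axis=0)
    cov_inv = np.linalg.pinv(np.cov(X_train, rowvar=False))
    threshold = np.percentile(mahalanobis(X_train, mean, cov_inv), q)
    return float(np.mean(mahalanobis(design_points, mean, cov_inv) > threshold))

def mean_and_se(values):
    """Mean and one standard error across Monte Carlo repetitions."""
    values = np.asarray(values, dtype=float)
    return values.mean(), values.std(ddof=1) / np.sqrt(len(values))
```

Under this reading, about 5% of the training points themselves count as OOD by construction, so that figure is the natural baseline against which a design's OOD fraction would be compared.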
Circularity Check
No significant circularity; consistency anchored to external ALE target.
Full rationale
The paper formulates main-effect estimation as a design problem and proposes A2D2E using accumulated aggregated D-optimal hypercube designs. It explicitly establishes consistency to the same population target as ALE (an external quantity based on conditional expectations), with the result extended to surrogate models. The D-optimal choice is motivated by variance minimization in the design sense rather than by fitting to the same data used for evaluation or performance claims. Simulations compare performance across models and dependence settings but do not define the estimator's target or force the consistency result. No self-definitional steps, fitted-input predictions, or load-bearing self-citation chains appear in the derivation chain; the central claims remain independent of the inputs they are evaluated against.
Axiom & Free-Parameter Ledger
axioms (2)
- standard math: D-optimal designs minimize the determinant of the covariance matrix for linear main-effect models
- domain assumption: The population target of main-effect estimation is identical to that of ALE under the stated conditions
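The first axiom can be illustrated with a short worked example for the hypercube case. The setup is an assumption chosen for illustration: a first-order model $y = \beta_0 + \sum_j \beta_j u_j + \varepsilon$ with uncorrelated noise of variance $\sigma^2$, evaluated at all $2^D$ vertices $u \in \{-\delta/2, +\delta/2\}^D$ of a hypercube with edge length $\delta$, and $V$ the $2^D \times (D+1)$ design matrix with an intercept column.

```latex
\begin{align*}
V^\top V &= 2^{D}\,\operatorname{diag}\!\left(1, \tfrac{\delta^2}{4}, \dots, \tfrac{\delta^2}{4}\right), \\
\det\!\left(V^\top V\right) &= 2^{D(D+1)} \left(\tfrac{\delta}{2}\right)^{2D}, \\
\operatorname{Cov}(\hat\beta) &= \sigma^2 \left(V^\top V\right)^{-1}
\quad\Longrightarrow\quad
\operatorname{Var}(\hat\beta_j) = \frac{4\sigma^2}{2^{D}\,\delta^2}, \qquad j = 1, \dots, D.
\end{align*}
```

The off-diagonal entries vanish because the vertex columns are mutually orthogonal and each sums to zero. No other placement of $2^D$ points inside the same cube yields a larger $\det(V^\top V)$ for this model, which is the sense in which the vertex design is D-optimal, and the resulting slope estimates are uncorrelated with equal variance.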
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
  Relation between the paper passage and the cited Recognition theorem: unclear
  We adopt the concept of D-optimal design... vertices of the hypercube centered at $x_n$ with edge length $\delta$... $\beta_{d,k,n}$ obtained by solving the least squares problem... $(V^\top V_{d,k,n})^{-1} V^\top y_{d,k,n}$
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · reality_from_one_distinction
  Relation between the paper passage and the cited Recognition theorem: unclear
  $\hat{f}^{\mathrm{A2D2E}}_d(x_d) = \sum_k \left(z_d^{k+1} - z_d^{k}\right) \cdot \frac{1}{|I_d^k|} \sum_n \beta_{d,k,n}$
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.