Computation of Least Trimmed Squares: A Branch-and-Bound framework with Hyperplane Arrangement Enhancements

Andr\'es G\'omez; Rahul Mazumder; Xiang Meng

arxiv: 2604.11584 · v2 · submitted 2026-04-13 · 🧮 math.OC · cs.LG· math.ST· stat.TH

Computation of Least Trimmed Squares: A Branch-and-Bound framework with Hyperplane Arrangement Enhancements

Xiang Meng , Andr\'es G\'omez , Rahul Mazumder This is my paper

Pith reviewed 2026-05-10 15:14 UTC · model grok-4.3

classification 🧮 math.OC cs.LGmath.STstat.TH

keywords least trimmed squaresrobust regressionmixed-integer optimizationbranch-and-boundhyperplane arrangementsperspective reformulationoutlier detectionfirst-order methods

0 comments

The pith

A new mixed-integer formulation for penalized least trimmed squares embeds hyperplane arrangement logic to produce a branch-and-bound tree of polynomial size in the sample count when the number of features is fixed.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a mixed-integer optimization model for the penalized least trimmed squares problem that incorporates hyperplane arrangement structure directly into a perspective reformulation. This enforcement of solution properties yields a branch-and-bound search tree whose size grows only polynomially with the number of observations when the feature dimension stays constant. A specialized branch-and-bound solver then applies first-order methods together with dual bounds to resolve the continuous relaxations at each node rapidly. Experiments show the approach closes a 1 percent optimality gap on 5000-sample, 20-feature instances in roughly one minute, while earlier formulations remain unsolved after an hour.

Core claim

Embedding hyperplane arrangement logic into the perspective reformulation of the penalized LTS problem produces a mixed-integer program whose branch-and-bound tree is polynomial in the sample size for any fixed number of features; the same reformulation also supplies tight dual bounds that, when paired with first-order methods, allow a tailored solver to solve node relaxations efficiently enough to reach practical optimality gaps on large instances.

What carries the argument

The embedding of hyperplane arrangement logic into the perspective reformulation, which explicitly enforces structural properties of optimal LTS solutions and thereby controls the size of the resulting branch-and-bound tree.

If this is right

Exact penalized LTS solutions become feasible for sample sizes of several thousand observations in low-dimensional regimes.
The tailored first-order dual-bound solver reaches 1 percent gaps on 5000 by 20 instances in minutes rather than hours.
The same reformulation and bounding technique can be reused for other combinatorial robust estimation tasks that admit similar hyperplane structure.
Larger exact robust regression problems become tractable without resorting to heuristics or approximations.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The polynomial-tree guarantee suggests that similar hyperplane-based reformulations could tame other NP-hard regression problems whose combinatorial structure is governed by linear separators.
Because the bound depends on fixed dimension, extensions that adaptively reduce effective dimension or exploit sparsity may be needed before the method scales to hundreds of features.
The dual bounds produced by the perspective reformulation might transfer to other mixed-integer solvers for trimmed estimators, reducing reliance on generic cutting planes.

Load-bearing premise

The hyperplane arrangement embedding correctly encodes the structural properties of optimal solutions and thereby delivers the claimed polynomial bound on tree size.

What would settle it

A single instance with fixed number of features on which the branch-and-bound tree explored by the new formulation grows exponentially with sample size would falsify the polynomial-size claim.

Figures

Figures reproduced from arXiv: 2604.11584 by Andr\'es G\'omez, Rahul Mazumder, Xiang Meng.

**Figure 1.** Figure 1: Various robust loss functions. The objective has a capped quadratic loss ϕcap(r) with an additional ridge regularization with penalty parameter λ/2 ≥ 0. The loss r 7→ ϕcap(r) is quadratic for small (absolute) residuals but saturates at the constant level µ for large (absolute) residuals, discouraging grossly contaminated observations from dominating the fit. The nonconvex capped quadratic belongs to a b… view at source ↗

read the original abstract

We study computational aspects of a key problem in robust statistics -- the penalized least trimmed squares (LTS) regression problem, a robust estimator that mitigates the influence of outliers in data by capping residuals with large magnitudes. Although statistically attractive, penalized LTS is NP-hard, and existing mixed-integer optimization (MIO) formulations scale poorly due to weak relaxations and exponential worst-case complexity in the number of observations. We propose a new MIO formulation that embeds hyperplane arrangement logic into a perspective reformulation, explicitly enforcing structural properties of optimal solutions. We show that, if the number of features is fixed, the resulting branch-and-bound tree is of polynomial size in the sample size. Moreover, we develop a tailored branch-and-bound algorithm that uses first-order methods with dual bounds to solve node relaxations efficiently. Computational experiments on synthetic and real datasets demonstrate substantial improvements over existing MIO approaches: on synthetic instances with 5000 samples and 20 features, our tailored solver reaches a 1% gap in 1 minute while competing approaches fail to do so within one hour. These gains enable exact robust regression at significantly larger sample sizes in low-dimensional settings.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper gives a perspective reformulation of penalized LTS that folds in hyperplane arrangement structure to get a polynomial-size branch-and-bound tree for fixed dimension, plus a custom solver that shows clear speedups on n=5000 instances.

read the letter

The main contribution is a mixed-integer formulation for penalized least trimmed squares that embeds hyperplane arrangement logic inside a perspective reformulation. This enforces key structural properties of the optimal solutions and yields a branch-and-bound tree whose size is polynomial in the sample size when the number of features stays fixed. They also build a tailored solver that solves the node relaxations with first-order methods and dual bounds, which appears to be the practical engine behind the reported gains.

Referee Report

2 major / 2 minor

Summary. The paper introduces a new mixed-integer optimization formulation for the penalized least trimmed squares (LTS) problem by embedding hyperplane arrangement logic into a perspective reformulation. It establishes that the branch-and-bound tree size is polynomial in the sample size for fixed feature dimension, develops a tailored branch-and-bound algorithm using first-order methods and dual bounds for efficient node solving, and reports significant computational improvements over existing methods on synthetic and real datasets.

Significance. If the polynomial bound on the branch-and-bound tree holds, this contribution is significant as it provides a theoretically grounded and practically efficient exact method for a key robust regression problem, overcoming the scalability limitations of prior MIO formulations in low-dimensional settings. The tailored solver's performance gains, as demonstrated on instances with 5000 samples, highlight its potential to enable larger-scale exact computations in robust statistics.

major comments (2)

[§3] §3 (formulation and bound derivation): The central claim that the B&B tree is of polynomial size in n for fixed p rests on the hyperplane arrangement embedding correctly limiting feasible integer solutions within the perspective reformulation; the manuscript must explicitly show how the arrangement integrates to produce the O(n^{O(p)}) node bound without hidden exponential factors in the branching decisions.
[§4.2] §4.2 (node relaxation solver): The use of first-order methods with dual bounds is load-bearing for the reported efficiency gains (e.g., 1% gap in 1 min on n=5000, p=20 instances); the tightness of these bounds relative to the reformulated problem and their convergence behavior under the hyperplane constraints need concrete verification to support the superiority over competing MIO solvers.

minor comments (2)

[Abstract] Abstract and §1: The distinction between 'penalized LTS' and standard LTS could be clarified with a short equation reference to the objective.
[§5] Figure captions and experimental section: Ensure all synthetic data generation parameters (e.g., outlier fractions, noise levels) are fully specified for reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the positive assessment and recommendation for minor revision. We address each major comment below and will revise the manuscript accordingly to incorporate the requested clarifications.

read point-by-point responses

Referee: [§3] §3 (formulation and bound derivation): The central claim that the B&B tree is of polynomial size in n for fixed p rests on the hyperplane arrangement embedding correctly limiting feasible integer solutions within the perspective reformulation; the manuscript must explicitly show how the arrangement integrates to produce the O(n^{O(p)}) node bound without hidden exponential factors in the branching decisions.

Authors: We appreciate this observation. The current proof in §3 establishes the polynomial bound by showing that the hyperplane arrangement induces at most O(n^{O(p)}) distinct cells, each corresponding to a unique combinatorial structure of the optimal LTS solution, and that the perspective reformulation's integer variables are constant within each cell. Branching occurs only on the arrangement's defining hyperplanes, which are linear in the residuals and thus do not introduce additional exponential factors beyond the arrangement complexity. In the revised manuscript we will expand this argument with an explicit inductive counting of nodes per level of the tree, confirming that the total size remains O(n^{O(p)}) for fixed p with no hidden exponential terms from the branching rule. revision: yes
Referee: [§4.2] §4.2 (node relaxation solver): The use of first-order methods with dual bounds is load-bearing for the reported efficiency gains (e.g., 1% gap in 1 min on n=5000, p=20 instances); the tightness of these bounds relative to the reformulated problem and their convergence behavior under the hyperplane constraints need concrete verification to support the superiority over competing MIO solvers.

Authors: We agree that explicit verification strengthens the claims. In the revision we will add a new subsection (or appendix) reporting the duality gap between the first-order method's dual bounds and the true node relaxation optima on a representative sample of instances (including the n=5000, p=20 cases). We will also include convergence plots showing iteration counts to reach 1% gap under the hyperplane constraints, demonstrating that the strengthened formulation yields tighter bounds than standard perspective relaxations and explains the observed speed-up relative to off-the-shelf MIO solvers. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected in derivation chain

full rationale

The paper's central result—that a perspective reformulation embedding hyperplane arrangement logic yields a branch-and-bound tree of polynomial size in n for fixed feature dimension p—rests on standard combinatorial geometry (the number of regions defined by hyperplanes in fixed dimension is polynomial) applied to the feasible set of the new MIO formulation. This is an external structural property, not a self-definition, fitted parameter renamed as prediction, or load-bearing self-citation. No equations or claims in the abstract reduce the polynomial bound to the inputs by construction; the tailored first-order dual bounds for node relaxations are algorithmic enhancements independent of the size claim. The approach is self-contained against external benchmarks in combinatorial optimization.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claims rest on standard results from mixed-integer optimization and combinatorial geometry without introducing new free parameters or invented entities.

axioms (1)

domain assumption Hyperplane arrangement properties and perspective reformulations preserve the optimal solution structure for the LTS problem.
Invoked to justify the polynomial bound on the branch-and-bound tree when dimension is fixed.

pith-pipeline@v0.9.0 · 5518 in / 1223 out tokens · 21322 ms · 2026-05-10T15:14:30.810939+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

2 extracted references · 2 canonical work pages

[1]

Agull´o J (2001b) New algorithms for computing the least trimmed squares regression estimator.Computational statis- tics & data analysis36(4):425–439

Agull´o J (2001a) New algorithms for computing the least trimmed squares regression estimator.Computational Statis- tics & Data Analysis36(4):425–439. Agull´o J (2001b) New algorithms for computing the least trimmed squares regression estimator.Computational statis- tics & data analysis36(4):425–439. Akt¨urk MS, Atamt ¨urk A, G ¨urel S (2009) A strong con...

work page 2009
[2]

Bertsimas D, King A, Mazumder R (2016) Best subset selection via a modern optimization lens.The annals of statistics 44(2):813–852

Bertsekas DP (1982)Constrained Optimization and Lagrange Multiplier Methods(Academic Press). Bertsimas D, King A, Mazumder R (2016) Best subset selection via a modern optimization lens.The annals of statistics 44(2):813–852. Meng, G´omez and Mazumder:Computation of LTS: BnB and Hyperplane Arrangements arXiv preprint 35 Bertsimas D, Mazumder R (2014) Least...

work page doi:10.1002/mrm.23268 1982

[1] [1]

Agull´o J (2001b) New algorithms for computing the least trimmed squares regression estimator.Computational statis- tics & data analysis36(4):425–439

Agull´o J (2001a) New algorithms for computing the least trimmed squares regression estimator.Computational Statis- tics & Data Analysis36(4):425–439. Agull´o J (2001b) New algorithms for computing the least trimmed squares regression estimator.Computational statis- tics & data analysis36(4):425–439. Akt¨urk MS, Atamt ¨urk A, G ¨urel S (2009) A strong con...

work page 2009

[2] [2]

Bertsimas D, King A, Mazumder R (2016) Best subset selection via a modern optimization lens.The annals of statistics 44(2):813–852

Bertsekas DP (1982)Constrained Optimization and Lagrange Multiplier Methods(Academic Press). Bertsimas D, King A, Mazumder R (2016) Best subset selection via a modern optimization lens.The annals of statistics 44(2):813–852. Meng, G´omez and Mazumder:Computation of LTS: BnB and Hyperplane Arrangements arXiv preprint 35 Bertsimas D, Mazumder R (2014) Least...

work page doi:10.1002/mrm.23268 1982