Sparse Unit-Sum Regression

Nick Koning; Paul Bekker

arxiv: 1907.04620 · v1 · pith:NXBOJZDFnew · submitted 2019-07-10 · 📊 stat.ME · stat.CO

Sparse Unit-Sum Regression

Nick Koning , Paul Bekker This is my paper

Pith reviewed 2026-05-24 23:42 UTC · model grok-4.3

classification 📊 stat.ME stat.CO

keywords sparse regressionunit-sum constraintl0 regularizationl1 regularizationindex trackingportfolio optimizationmixed integer programminglinear regression

0 comments

The pith

A mix of l0 and l1 penalties produces sparser solutions than l1 alone for linear regression with unit-sum weights.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a regularization method for linear regression where coefficients must sum exactly to one. It combines an l0 penalty that counts non-zeros with an l1 penalty on absolute sizes. The resulting problem is solved by adapting an existing mixed-integer optimization technique. Simulations show competitive prediction accuracy and higher sparsity than pure l0 or l1 approaches, and an index-tracking example yields portfolios with substantially fewer assets while matching l1 tracking performance.

Core claim

We consider sparsity in linear regression under the restriction that the regression weights sum to one. We propose an approach that combines ℓ0- and ℓ1-regularization. We compute its solution by adapting a recent methodological innovation made by Bertsimas et al. (2016) for ℓ0-regularization in standard linear regression. In a simulation experiment we compare our approach to ℓ0-regularization and ℓ1-regularization and find that it performs favorably in terms of predictive performance and sparsity. In an application to index tracking we show that our approach can obtain substantially sparser portfolios compared to ℓ1-regularization while maintaining a similar tracking performance.

What carries the argument

Combined ℓ0-ℓ1 regularization under the added linear constraint that weights sum to one, solved by adapting the mixed-integer quadratic programming method of Bertsimas et al. (2016).

If this is right

The method yields regression models with fewer active coefficients while retaining predictive accuracy under the unit-sum requirement.
In portfolio construction it selects fewer assets than l1 regularization for comparable index tracking.
It provides a practical compromise between the exact sparsity of pure l0 and the easier optimization of pure l1 when the sum constraint is present.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same adaptation could be tested on other linear equality constraints that appear in compositional or probability-simplex regression.
Advances in mixed-integer solvers would directly widen the range of dimensions where unit-sum sparse regression becomes routine.
If the performance edge holds on new data sets, practitioners facing sum-to-one constraints may prefer the mixed penalty over l1 alone.

Load-bearing premise

The adaptation of the Bertsimas et al. mixed-integer solver remains effective and computationally tractable after the sum-to-one constraint is imposed.

What would settle it

Re-running the index-tracking application and finding that the combined method either selects as many assets as l1 regularization or produces materially worse tracking error.

read the original abstract

This paper considers sparsity in linear regression under the restriction that the regression weights sum to one. We propose an approach that combines $\ell_0$- and $\ell_1$-regularization. We compute its solution by adapting a recent methodological innovation made by Bertsimas et al. (2016) for $\ell_0$-regularization in standard linear regression. In a simulation experiment we compare our approach to $\ell_0$-regularization and $\ell_1$-regularization and find that it performs favorably in terms of predictive performance and sparsity. In an application to index tracking we show that our approach can obtain substantially sparser portfolios compared to $\ell_1$-regularization while maintaining a similar tracking performance.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper adapts Bertsimas MIP to unit-sum regression and reports sparser index-tracking portfolios than plain l1 while keeping similar performance.

read the letter

The main takeaway is that this paper gives a direct way to handle sparse regression when the coefficients are forced to sum to one. They blend l0 and l1 penalties and solve the resulting problem by modifying the mixed-integer program from Bertsimas et al. 2016 to include the linear equality constraint. The simulation shows the combined approach beats the separate regularizers on both prediction and sparsity, and the index-tracking application produces substantially sparser portfolios than l1 regularization alone with comparable tracking error. That application result is the clearest practical signal in the work. The adaptation itself is straightforward because the sum-to-one constraint is linear and fits inside the existing MIP without changing the core structure. The authors obtain usable solutions for their portfolio example, which indicates the added constraint does not destroy tractability at the problem sizes they consider. The stress-test worry about branch-and-bound degradation therefore does not appear to materialize in the reported experiments. The main limitations sit in the level of detail. The abstract states that the method performs favorably in simulation but supplies no numbers on design, dimensions, or exact metrics, so it is hard to judge how consistent the gains are across settings. The application claim is stronger because it is tied to a real task, yet even there the paper would be easier to evaluate with the actual sparsity counts and tracking errors rather than the qualitative summary. This is a specialized tool aimed at statisticians or quantitative researchers who work with sum-to-one constraints, such as in finance or compositional data. A reader facing exactly that setting could take the method and apply it to similar problems. It is not a foundational shift, but it supplies a concrete, implementable extension with evidence from both controlled checks and an external data example. The work deserves peer review. The computational grounding in prior MIP results is solid, the application demonstrates usable value, and the central claim holds up on the evidence given. A referee could reasonably ask for fuller experimental reporting and scaling checks on the MIP, but those are normal revision points rather than reasons to stop the process.

Referee Report

3 major / 2 minor

Summary. The paper proposes combining ℓ0- and ℓ1-regularization for linear regression subject to the unit-sum constraint on coefficients, solved via an adaptation of the Bertsimas et al. (2016) mixed-integer program. It reports that the method performs favorably versus pure ℓ0- and ℓ1-regularization in simulations on predictive performance and sparsity, and that in an index-tracking application it yields substantially sparser portfolios than ℓ1-regularization while preserving similar tracking error.

Significance. If the computational claims hold, the approach would supply a usable tool for sparse regression under linear equality constraints, with direct relevance to portfolio construction. The explicit adaptation of an existing MIP solver is a methodological strength when accompanied by evidence that the added equality does not materially degrade tractability.

major comments (3)

[Abstract / Application] Abstract and application section: the central claim that the method obtains substantially sparser portfolios while maintaining similar tracking performance is presented without quantitative values for sparsity levels, tracking-error differences, number of assets, or MIP solve statistics (optimality gaps, run times, or node counts).
[Method / Application] Method and application sections: the adaptation of Bertsimas et al. (2016) inserts the linear equality 1^T β = 1 directly into the MIP, yet no analysis or numerical evidence is supplied on how this coupling affects the tightness of the continuous relaxation or the number of branch-and-bound nodes explored at the problem sizes used in the index-tracking example.
[Simulation experiment] Simulation experiment: the claim of favorable performance versus ℓ0- and ℓ1-regularization lacks any description of experimental design, number of replications, error bars, data-generation process, or exact quantitative comparisons, so the support for the performance claim cannot be evaluated.

minor comments (2)

[Method] Notation for the combined penalty and the precise form of the MIP objective after adding the sum-to-one constraint should be written explicitly rather than described only in prose.
[Abstract] The abstract would benefit from one or two concrete numerical results (e.g., average sparsity or tracking-error values) to substantiate the qualitative claims.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments. We agree that the manuscript requires additional quantitative details and descriptions to support its claims, and we will revise accordingly.

read point-by-point responses

Referee: [Abstract / Application] Abstract and application section: the central claim that the method obtains substantially sparser portfolios while maintaining similar tracking performance is presented without quantitative values for sparsity levels, tracking-error differences, number of assets, or MIP solve statistics (optimality gaps, run times, or node counts).

Authors: We agree that specific quantitative values are needed to substantiate the claims. In the revised manuscript we will report the exact sparsity levels achieved, tracking-error values and differences, number of assets, and MIP solve statistics (run times, optimality gaps, node counts) for the index-tracking example. revision: yes
Referee: [Method / Application] Method and application sections: the adaptation of Bertsimas et al. (2016) inserts the linear equality 1^T β = 1 directly into the MIP, yet no analysis or numerical evidence is supplied on how this coupling affects the tightness of the continuous relaxation or the number of branch-and-bound nodes explored at the problem sizes used in the index-tracking example.

Authors: We acknowledge the absence of analysis on the effect of the unit-sum constraint on the MIP relaxation and branching. In revision we will add numerical evidence from the index-tracking instances, including reported solve times, gaps, and (where feasible) comparisons of relaxation bounds or node counts with and without the equality constraint. revision: yes
Referee: [Simulation experiment] Simulation experiment: the claim of favorable performance versus ℓ0- and ℓ1-regularization lacks any description of experimental design, number of replications, error bars, data-generation process, or exact quantitative comparisons, so the support for the performance claim cannot be evaluated.

Authors: We agree that the simulation section is insufficiently documented. The revised manuscript will include a complete description of the data-generation process, number of replications, performance metrics with error bars or standard errors, and tables of exact quantitative comparisons against the ℓ0 and ℓ1 baselines. revision: yes

Circularity Check

0 steps flagged

No circularity: adaptation of external Bertsimas MIP with empirical validation

full rationale

The paper explicitly frames its core contribution as an adaptation of the Bertsimas et al. (2016) mixed-integer programming approach for l0-regularization, extended by the linear equality constraint 1^T beta = 1. Performance claims rest on simulation experiments comparing predictive performance and sparsity, plus an index-tracking application showing sparser portfolios with comparable tracking error. These are external empirical benchmarks, not quantities defined by the model's own fitted parameters or self-citations. No self-definitional steps, fitted-input predictions, or load-bearing self-citation chains appear. The derivation is therefore self-contained against external references and data.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no free parameters, axioms, or invented entities are identifiable from the provided text. The work relies on standard regularization penalties and an external optimization technique.

pith-pipeline@v0.9.0 · 5636 in / 1032 out tokens · 22484 ms · 2026-05-24T23:42:05.075646+00:00 · methodology

Sparse Unit-Sum Regression

Core claim

What carries the argument

If this is right

Where Pith is reading between the lines

Load-bearing premise

What would settle it

discussion (0)