Constant-Factor Approximation for the Uniform Decision Tree

Micha{\l} Szyfelbein

arxiv: 2604.12036 · v2 · submitted 2026-04-13 · 💻 cs.DS · cs.IR· cs.LG

Constant-Factor Approximation for the Uniform Decision Tree

Micha{\l} Szyfelbein This is my paper

Pith reviewed 2026-05-10 15:13 UTC · model grok-4.3

classification 💻 cs.DS cs.IRcs.LG

keywords decision treeapproximation algorithmuniform distributionseparating subfamilymaximum coverageconstant factorpolynomial time

0 comments

The pith

A polynomial-time algorithm approximates the optimal uniform decision tree within a factor less than 11.57.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that the average-case decision tree problem admits a constant-factor approximation when all hypotheses are equally likely. It decomposes any optimal tree into a sequence of separating subfamilies and reduces the task of finding each subfamily to an instance of maximum coverage by cutting cliques into small pieces that each separate one pair of hypotheses. The resulting algorithm runs in polynomial time and achieves an approximation ratio bounded by 2 divided by 1 minus the square root of (e plus 1) over (2e), plus any positive epsilon, which is strictly less than 11.57 and improves on the previous O(log n over log log n) bound.

Core claim

We resolve the open question by giving a simple polynomial-time algorithm whose approximation ratio for the uniform decision tree problem is 2 over (1 minus the square root of (e plus 1) over (2e)) plus epsilon, a quantity less than 11.57. The algorithm works by first decomposing the optimal decision tree into separating subfamilies and then approximating each subfamily via a reduction to maximum coverage obtained by analyzing the effect of cutting cliques into pieces that represent pairs of hypotheses to be separated.

What carries the argument

The separating subfamily, obtained by decomposing the optimal decision tree and reducing the subproblem to maximum coverage through the analysis of cutting cliques into small pieces that separate hypothesis pairs.

Load-bearing premise

The decomposition of the optimal decision tree into separating subfamilies preserves the approximation guarantee when each subfamily is replaced by an approximate solution from the maximum coverage reduction.

What would settle it

An explicit family of hypotheses together with a decision tree whose expected cost is more than 11.57 times the optimal cost would show that the claimed ratio does not hold.

read the original abstract

We resolve a long-standing open question, about the existence of a constant-factor approximation algorithm for the average-case \textsc{Decision Tree} problem with uniform probability distribution over the hypotheses. We answer the question in the affirmative by providing a simple polynomial-time algorithm with approximation ratio of $\frac{2}{1-\sqrt{(e+1)/(2e)}}+\epsilon <11.57$. This improves upon the currently best-known, greedy algorithm which achieves $O(\log n/{\log\log n})$-approximation. The first key ingredient in our analysis is the usage of a decomposition technique known from problems related to \textsc{Hierarchical Clustering} [SODA '17, WALCOM '26], which allows us to decompose the optimal decision tree into a series of objects called separating subfamilies. The second crucial idea is to reduce the subproblem of finding a \textsc{Separating Subfamily} to an instance of the \textsc{Maximum Coverage} problem. To do so, we analyze the properties of cutting cliques into small pieces, which represent pairs of hypotheses to be separated. This allows us to obtain a good approximation for the \textsc{Separating Subfamily} problem, which then enables the design of the approximation algorithm for the original problem.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Claims first constant-factor approx for uniform decision trees at under 11.57 via decomposition and max-coverage reduction, but abstract leaves the key analysis steps uncheckable.

read the letter

The paper claims to resolve a long-standing open question by giving a constant-factor approximation algorithm for the uniform average-case decision tree problem. The ratio comes in at less than 11.57, which beats the prior O(log n over log log n) greedy result. It does this with two main ideas. First, it decomposes the optimal decision tree into separating subfamilies using a technique from hierarchical clustering work. Second, it reduces the task of finding a good separating subfamily to a maximum coverage instance by cutting cliques into pieces that correspond to pairs of hypotheses that need separating. This reduction lets them get an approximation for the subfamily subproblem and then lift it to the full tree. The approach looks promising on paper because it turns the problem into something more tractable with known approximation tools. If the analysis of the decomposition and the clique cutting holds up, it gives a clean polynomial-time algorithm that finally achieves a constant. The soft spots are in the details we can't see. The abstract does not include the proof that the decomposition preserves the approximation guarantee or the precise analysis of how cutting the cliques affects the coverage ratio. Those are the parts that could introduce errors or worse constants. Without the full text, it's impossible to tell if the claimed bound is correct or if there's a gap in the argument. The ratio itself is also fairly large for a constant, though still an improvement. This paper is for researchers in approximation algorithms for decision trees and related areas like machine learning and information theory. Someone following the literature on hierarchical clustering would find the connection useful. It deserves a serious referee because it addresses an open question with a new technique. Even with the large constant, confirming the result would move the field forward. I recommend sending it to peer review so the community can check the proof.

Referee Report

1 major / 0 minor

Summary. The manuscript claims to resolve a long-standing open question on the existence of a constant-factor approximation for the average-case Decision Tree problem under uniform probability distribution over hypotheses. It presents a simple polynomial-time algorithm achieving approximation ratio 2/(1-sqrt((e+1)/(2e))) + ε < 11.57, improving on the prior greedy O(log n / log log n) bound. The approach relies on a decomposition technique from hierarchical clustering to break the optimal decision tree into separating subfamilies, followed by a reduction of the Separating Subfamily subproblem to Maximum Coverage by analyzing the properties of cutting cliques into small pieces representing hypothesis pairs to be separated.

Significance. If the claims and analysis hold, this would be a significant contribution to approximation algorithms, as it provides the first constant-factor approximation for uniform Decision Tree where only super-logarithmic ratios were previously known. The reuse of decomposition methods from hierarchical clustering and the reduction to Maximum Coverage demonstrates a potentially elegant combination of existing tools, though the specific preservation of approximation guarantees in the decomposition step is central to the result.

major comments (1)

The central claim depends on the decomposition into separating subfamilies preserving the approximation guarantee and on the correctness of the clique-cutting analysis for the Maximum Coverage reduction (both described only at high level in the abstract); these steps are load-bearing but cannot be verified without the full proof and analysis.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for reviewing our manuscript and for their comments. We address the major comment below.

read point-by-point responses

Referee: The central claim depends on the decomposition into separating subfamilies preserving the approximation guarantee and on the correctness of the clique-cutting analysis for the Maximum Coverage reduction (both described only at high level in the abstract); these steps are load-bearing but cannot be verified without the full proof and analysis.

Authors: The full manuscript contains the complete, formal proofs for both steps. Section 3 details the decomposition technique adapted from hierarchical clustering results, including the argument that it preserves the approximation guarantee when breaking the optimal decision tree into separating subfamilies. Section 4 then presents the reduction of the Separating Subfamily problem to Maximum Coverage, with the clique-cutting analysis that bounds the contribution of each small piece representing a hypothesis pair. These sections provide the rigorous analysis and lemmas supporting the overall constant-factor bound. The abstract summarizes these ideas at a high level, which is standard, but the body supplies the verifiable details. If the referee identifies a specific gap in the provided proofs, we are happy to clarify or expand the relevant section. revision: no

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The abstract describes a derivation that invokes a known decomposition technique from prior Hierarchical Clustering work (cited as SODA '17 and WALCOM '26) to break the optimal decision tree into separating subfamilies, followed by a reduction of the Separating Subfamily subproblem to Maximum Coverage via new analysis of clique-cutting properties. These steps are presented as building on external techniques with independent analysis for the approximation guarantee, without any visible self-definitional reductions, fitted inputs renamed as predictions, or load-bearing self-citations. No equations or detailed derivations are available in the abstract to inspect for constructional equivalence, and the central constant-factor result is framed as arising from the correctness of the cited decomposition and the new clique analysis, rendering the chain self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the correctness of the decomposition technique from hierarchical clustering and the reduction to Maximum Coverage via clique cutting; these are presented as standard tools but their application here requires verification.

axioms (1)

standard math Standard assumptions in approximation algorithms and graph theory hold for the decision tree and clique models.
The paper invokes properties of separating subfamilies and clique cutting without introducing new axioms.

pith-pipeline@v0.9.0 · 5496 in / 1139 out tokens · 58964 ms · 2026-05-10T15:13:01.993431+00:00 · methodology

Constant-Factor Approximation for the Uniform Decision Tree

Core claim

What carries the argument

Load-bearing premise

What would settle it

discussion (0)