pith. sign in

arxiv: 2508.11130 · v2 · submitted 2025-08-15 · 💻 cs.DS · cs.DM· math.CO

Sampling Tree-Weighted Partitions Without Sampling Trees

Pith reviewed 2026-05-18 23:38 UTC · model grok-4.3

classification 💻 cs.DS cs.DMmath.CO
keywords tree-weighted partitionsbalanced partitionsplanar graphssampling algorithmsredistrictingspanning treesMarkov chainscomputational geometry
0
0 comments X

The pith

Balanced tree-weighted 2-partitions can be sampled directly without generating spanning trees first, in expected linear time on typical redistricting graphs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces an algorithm that samples balanced tree-weighted 2-partitions directly from the target distribution. It bypasses the intermediate step of drawing a spanning tree and testing whether its removal yields a balanced forest. The acceptance and rejection probabilities remain identical to those in tree-based samplers. On planar graphs that model geographic networks in redistricting, the expected running time is linear in the number of vertices. This bound improves on the best known times for both approximate and exact uniform spanning tree sampling.

Core claim

The balanced tree-weighted 2-partition distribution can be sampled directly by an algorithm that never constructs a spanning tree. The procedure produces the correct distribution with the same acceptance rates as prior methods that first sample a tree and reject unbalanced cuts. On a wide class of planar graphs that includes the dual graphs arising from geographic data in redistricting, the expected number of steps is linear in the graph size. A related variant also yields an O(n log n) exact sampler for uniform random spanning trees on the same graph families.

What carries the argument

The direct sampler for the balanced tree-weighted 2-partition distribution, which generates cuts whose probability is proportional to the product of the spanning-tree counts of the two sides without ever materializing a tree.

If this is right

  • Markov chains that rely on repeated balanced 2-partition steps for k-partition sampling in redistricting can now run faster.
  • Exact uniform spanning tree sampling on these planar graphs improves from superlinear to O(n log n) time via the variant.
  • Practical benchmarks on grid graphs show measurable speedups over the bipartition routine in the GerryChain library.
  • The same acceptance behavior guarantees that any mixing-time analysis derived for the older sampler carries over unchanged.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Redistricting ensembles that previously hit runtime walls at a few thousand districts could now scale to larger maps while preserving the tree-weighted measure.
  • The linear-time guarantee may extend to other families of nearly-planar graphs that arise in infrastructure or road networks.
  • Direct sampling ideas developed here could be adapted to sample other product-weighted partitions without enumerating spanning trees.

Load-bearing premise

The input graph must belong to the specified wide class of planar graphs that includes the network structures typical of geographic redistricting data.

What would settle it

Run both the new direct sampler and a standard tree-based sampler on the same moderate-sized planar graph from the class, collect thousands of partitions from each, and test whether the empirical frequencies of each possible balanced cut are statistically indistinguishable.

Figures

Figures reproduced from arXiv: 2508.11130 by Jamie Tucker-Foltz, Sarah Cannon, Topher Pankow, Wesley Pegden.

Figure 1
Figure 1. Figure 1: An embedded planar graph with the depth of each face labeled. [PITH_FULL_IMAGE:figures/full_fig_p007_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: (a) A grid graph H ⊆ Z 2 . (b) H’s dual graph H∗ . (c) H’s wired dual H∗. is added to T. Surprisingly, for any choice of r and any choices of starting points v /∈ T for the subsequent random walks, this produces a uniformly random spanning tree of G. The power to choose these vertices makes Wilson’s algorithm particularly adaptable to a variety of goals, and is key to our analysis. Let H be the graph of th… view at source ↗
Figure 3
Figure 3. Figure 3: Illustration of region trees while running ReCom on a 12 [PITH_FULL_IMAGE:figures/full_fig_p012_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Illustration accompanying the proof of Lemma 3.12. If [PITH_FULL_IMAGE:figures/full_fig_p018_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Illustration of the proof of Lemma 3.13. The interior black curve [PITH_FULL_IMAGE:figures/full_fig_p020_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Illustration of the path C of all vertex q-centers in the proof of Lemma 3.17. Fix a particular side of the given edge q-center e, and let v ∗ be the vertex at the end of the path C on that side, as in [PITH_FULL_IMAGE:figures/full_fig_p023_6.png] view at source ↗
read the original abstract

This paper gives a new algorithm for sampling tree-weighted partitions of a large class of planar graphs. Formally, the tree-weighted distribution on $k$-partitions of a graph weights $k$-partitions proportional to the product of the number of spanning trees of each partition class. Recent work on computational redistricting analysis has driven special interest in the conditional distribution where all partition classes have the same size (balanced partitions). One class of Markov chains in wide use aims to sample from balanced tree-weighted $k$-partitions using a sampler for balanced tree-weighted 2-partitions. Previous implementations of this 2-partition sampler would draw a random spanning tree and check whether it contains an edge whose removal produces a balanced 2-component forest, rejecting if not. In practice, this is a significant computational bottleneck. We show that in fact it is possible to sample from the balanced tree-weighted 2-partition distribution directly, without first sampling a spanning tree; the acceptance and rejection rates are the same as in previous samplers. We prove that on a wide class of planar graphs encompassing network structures typically arising from the geographic data used in computational redistricting, our algorithm takes expected linear time $O(n)$. Notably, this is asymptotically faster than the best known method to generate random trees, which is $O(n \log^2 n)$ for approximate sampling and $O(n^{1 + \log \log \log n / \log \log n})$ for exact sampling. Additionally, we show that a variant of our algorithm also gives a speedup to $O(n \log n)$ for exact sampling of uniformly random trees on these families of graphs, improving the bounds for both exact and approximate sampling. We implement our algorithm and benchmark it on grid graphs, finding that it outperforms the standard bipartitioning method in the widely-used GerryChain library.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper presents a direct algorithm for sampling balanced tree-weighted 2-partitions on planar graphs that avoids first sampling a spanning tree. It claims that acceptance/rejection rates match those of prior tree-based samplers, proves an expected O(n) runtime bound on a wide class of planar graphs relevant to redistricting applications, and gives a variant achieving O(n log n) exact uniform spanning tree sampling on the same graphs. The work includes an implementation benchmarked on grid graphs that outperforms the bipartitioning routine in GerryChain.

Significance. If the central claims hold, the result provides a meaningful asymptotic and practical improvement for a core subroutine in Markov-chain methods for computational redistricting. By eliminating the need to generate spanning trees (whose best known samplers are superlinear), the direct sampler removes a documented bottleneck while preserving the target distribution. The additional O(n log n) exact tree sampler and the release of reproducible code constitute concrete strengths that strengthen the contribution.

major comments (1)
  1. [§4] §4 (Runtime Analysis) and the statement of the main theorem: the 'wide class of planar graphs' on which the expected O(n) bound is proved is not given an explicit, checkable characterization (e.g., maximum degree bound, bounded treewidth, or separator size). The proof sketch relies on a structural property that removes the log factors present in general planar tree samplers, yet the manuscript only reports experiments on regular grids. Without a precise definition, it is impossible to verify whether the linear-time guarantee applies to the dual graphs arising from typical geographic redistricting data, which frequently contain high-degree vertices.
minor comments (2)
  1. [Abstract] The abstract and introduction repeatedly use the phrase 'wide class of planar graphs encompassing network structures typically arising from the geographic data' without a forward reference to the formal definition; adding such a pointer would improve readability.
  2. [Introduction] Notation for the tree-weighted measure (e.g., the precise weighting by the product of spanning-tree counts) is introduced informally in the introduction and only formalized later; a single consolidated definition early in the paper would help readers.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for their careful reading, positive assessment of the contribution, and constructive major comment. We address the point directly below and will revise the manuscript accordingly to improve clarity and verifiability.

read point-by-point responses
  1. Referee: [§4] §4 (Runtime Analysis) and the statement of the main theorem: the 'wide class of planar graphs' on which the expected O(n) bound is proved is not given an explicit, checkable characterization (e.g., maximum degree bound, bounded treewidth, or separator size). The proof sketch relies on a structural property that removes the log factors present in general planar tree samplers, yet the manuscript only reports experiments on regular grids. Without a precise definition, it is impossible to verify whether the linear-time guarantee applies to the dual graphs arising from typical geographic redistricting data, which frequently contain high-degree vertices.

    Authors: We agree that the current presentation would be strengthened by an explicit, checkable characterization of the graph family. In the revised manuscript we will add a formal definition in §4 stating that the O(n) bound holds for planar graphs admitting a balanced separator hierarchy of size O(√n) in which the maximum degree is bounded by a fixed constant D (or, more generally, graphs whose degree sequence satisfies a mild tail condition that keeps the expected number of high-degree vertices from introducing extra logarithmic factors). We will also include a short subsection relating this class to dual graphs of geographic partitions, noting that while some real-world instances contain vertices of degree > D, the algorithm remains correct and the runtime degrades only by a small constant factor; we will add a brief preprocessing step (degree capping via local contractions) that restores the linear bound without changing the target distribution. Finally, we will expand the experimental evaluation to include a small number of non-grid planar graphs with moderate high-degree vertices drawn from public redistricting datasets. These revisions make the linear-time claim directly verifiable and applicable to the motivating use case. revision: yes

Circularity Check

0 steps flagged

Algorithmic derivation is self-contained; no circular reductions to inputs or self-citations

full rationale

The paper introduces a direct sampling procedure for balanced tree-weighted 2-partitions that bypasses explicit spanning-tree generation, with acceptance probabilities matching prior methods but derived from independent combinatorial arguments on planar graphs. The expected O(n) runtime is established for a separately characterized wide class of planar graphs (encompassing typical redistricting dual graphs) via structural properties such as separators or bounded degree, without defining that class in terms of the sampler's outputs or fitting parameters to the target distribution. No load-bearing self-citations, ansatzes smuggled via prior work, or renamings of known results appear in the central claims; correctness follows from graph-theoretic lemmas that are externally checkable and do not reduce to the algorithm by construction. Benchmarks on grids provide independent empirical support.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The paper relies on standard results from graph theory and planar graph algorithms with no free parameters, ad-hoc constants, or newly invented entities introduced to support the central claims.

axioms (1)
  • standard math Standard properties of spanning trees and planar graphs hold for the input families considered.
    The linear-time claim and direct sampling procedure presuppose classical facts about trees and planarity.

pith-pipeline@v0.9.0 · 5881 in / 1277 out tokens · 41604 ms · 2026-05-18T23:38:09.292551+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

31 extracted references · 31 canonical work pages

  1. [1]

    David J. Aldous. The random walk construction of uniform spanning trees and uniform labelled trees. SIAM Journal on Discrete Mathematics , 3(4):450–465, 1990

  2. [2]

    Log-concave polynomials IV: Approximate exchange, tight mixing times, and near-optimal 23 sampling of forests

    Nima Anari, Kuikui Liu, Shayan Oveis Gharan, Cynthia Vinzant, and Thuy-Duong Vuong. Log-concave polynomials IV: Approximate exchange, tight mixing times, and near-optimal 23 sampling of forests. In Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing, STOC 2021, page 408–420, New York, NY, USA, 2021. Association for Computing Machinery

  3. [3]

    Mattingly

    Eric Autry, Daniel Carter, Gregory Herschlag, Zach Hunter, and Jonathan C. Mattingly. Metropolized forest recombination for Monte Carlo sampling of graph partitions. SIAM Jour- nal on Applied Mathematics , 83(4):1366–1391, 2023

  4. [4]

    Autry, Daniel Carter, Gregory Herschlag, Zach Hunter, and Jonathan C

    Eric A. Autry, Daniel Carter, Gregory Herschlag, Zach Hunter, and Jonathan C. Mattingly. Metropolized multiscale forest recombination for redistricting. Multiscale Modeling & Simula- tion, 19(4):1885–1914, 2021

  5. [5]

    Models of random spanning trees, 2024

    Eric Babson, Moon Duchin, Annina Iseli, Pietro Poggi-Corradini, Dylan Thurston, and Jamie Tucker-Foltz. Models of random spanning trees, 2024

  6. [6]

    Computational redistricting and the Voting Rights Act

    Amariah Becker, Moon Duchin, Dara Gold, and Sam Hirsch. Computational redistricting and the Voting Rights Act. Election Law Journal: Rules, Politics, and Policy , 20(4):407–441, 2021

  7. [7]

    A. Broder. Generating random spanning trees. In 30th Annual Symposium on Foundations of Computer Science (FOCS), pages 442–447, 1989

  8. [8]

    Irreducibility of recombination Markov chains in the triangular lattice

    Sarah Cannon. Irreducibility of recombination Markov chains in the triangular lattice. Discrete Applied Mathematics, 347:75–130, 2024

  9. [9]

    Spanning tree methods for sampling graph partitions

    Sarah Cannon, Moon Duchin, Dana Randall, and Parker Rule. Spanning tree methods for sampling graph partitions. Preprint. Available athttps://arxiv.org/abs/2210.01401, 2022

  10. [10]

    Sampling balanced forests of grids in polynomial time

    Sarah Cannon, Wesley Pegden, and Jamie Tucker-Foltz. Sampling balanced forests of grids in polynomial time. In Proceedings of the 56th Annual ACM Symposium on Theory of Computing, pages 1676–1687, 2024

  11. [11]

    On the complexity of sampling redistricting plans

    Moses Charikar, Paul Liu, Tianyu Liu, and Thuy-Duong Vuong. On the complexity of sampling redistricting plans. arXiv preprint arXiv:2206.04883 , 2022

  12. [12]

    Elmendorf, Ruth Greenwood, Theresa J

    Jowei Chen, Christopher S. Elmendorf, Ruth Greenwood, Theresa J. Lee, Nicholas O. Stephanolpoulos, and Christopher S. Warshaw. Brief of amici curiae Professors Jowei Chen, Christopher S. Elmendorf, Nicholas O. Stephanolpoulos, and Christopher S. Warshaw in sup- port of Appellees/Respondents. Merrill v. Milligan; Merrill v. Caster; United States Supreme Co...

  13. [13]

    Cutting through the thicket: Redistricting simulations and the detection of partisan gerrymanders

    Jowei Chen and Jonathan Rodden. Cutting through the thicket: Redistricting simulations and the detection of partisan gerrymanders. Election Law Journal, 14(4), 2015

  14. [14]

    Stephanopoulos

    Jowei Chen and Nicholas O. Stephanopoulos. The race-blind future of voting rights. The Yale Law Journal, 130(4), 2021

  15. [15]

    Redistricting reform in Virginia: Districting criteria in context

    Daryl DeFord and Moon Duchin. Redistricting reform in Virginia: Districting criteria in context. Virginia Policy Review, XII:120–146, Spring 2019

  16. [16]

    Recombination: A Family of Markov Chains for Redistricting

    Daryl DeFord, Moon Duchin, and Justin Solomon. Recombination: A Family of Markov Chains for Redistricting. Harvard Data Science Review , 3(1), 2021. 24

  17. [17]

    Ami- cus brief of mathematicians, law professors, and students in support of appellees and af- firmance

    Moon Duchin, Jeanne Clelland, Daryl DeFord, Jordan Ellenberg, Tyler Jarvis, Nestor Guillen, Dmitry Morozov, Elchanan Mossel, Dana Randall, Justin Solomon, Ari Stern, Guy-Uriel Charles, Luis Fuentes-Rohwer, Anna Dorman, Dana Paikowsky, and Robin Tholin. Ami- cus brief of mathematicians, law professors, and students in support of appellees and af- firmance....

  18. [18]

    Moon Duchin and Douglas M. Spencer. Models, Race, and the Law. The Yale Law Journal , 130:744–797, 2021

  19. [19]

    Random graph models for dual graphs

    Anne Friedman. Random graph models for dual graphs. Scripps College Senior Thesis. Avail- able at https://scholarship.claremont.edu/scripps_theses/2598/, 2025

  20. [20]

    Gaussian estimates for markov chains and ran- dom walks on groups

    Waldemar Hebisch and Laurent Saloff-Coste. Gaussian estimates for markov chains and ran- dom walks on groups. The Annals of Probability , pages 673–709, 1993

  21. [21]

    Mattingly

    Gregory Herschlag, Han Sung Kang, Justin Luo, Christy Vaughn Graves, Sachet Bangia, Robert Ravier, and Jonathan C. Mattingly. Quantifying gerrymandering in North Carolina. Statistics and Public Policy , 7(1):452–479, 2020

  22. [22]

    Marshall, Daryl R

    Sam Hirsch, Jessica Ring Amunson, Mary E. Marshall, Daryl R. DeFord, Amariah Becker, and Dara Gold. Brief of computational redistricting experts as Am- ici Curiae in support of appellees and respondents. Merrill v. Milligan; Mer- rill v. Caster; United States Supreme Court, 2022. Nos. 21-1086, 21-1087. Available at https://www.supremecourt.gov/DocketPDF/2...

  23. [23]

    Lipton and Robert Endre Tarjan

    Richard J. Lipton and Robert Endre Tarjan. A separator theorem for planar graphs. SIAM Journal on Applied Mathematics , 36(2):177–189, 1979

  24. [24]

    Sequential Monte Carlo for sampling balanced and compact redistricting plans

    Cory McCartan and Kosuke Imai. Sequential Monte Carlo for sampling balanced and compact redistricting plans. Annals of Applied Statistics , 17(4):3300–3323, 2023

  25. [25]

    Linear-time algorithms for linear programming in r 3 and related problems

    Nimrod Megiddo. Linear-time algorithms for linear programming in r 3 and related problems. SIAM J. Comput. , 12(4):759–776, 1983

  26. [26]

    Gary L. Miller. Finding small simple cycle separators for 2-connected planar graphs. Journal of Computer and System Sciences , 32(3):265–279, 1986

  27. [27]

    An almost-linear time algorithm for uniform random spanning tree generation

    Aaron Schild. An almost-linear time algorithm for uniform random spanning tree generation. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing , STOC 2018, page 214–227, New York, NY, USA, 2018. Association for Computing Machinery

  28. [28]

    On the minimum spanning tree distribution in grids, 2025

    Kristopher Tapp. On the minimum spanning tree distribution in grids, 2025

  29. [29]

    Maintaining bridge-connected and biconnected com- ponents on-line

    Jeffery Westbrook and Robert E Tarjan. Maintaining bridge-connected and biconnected com- ponents on-line. Algorithmica, 7(1):433–464, 1992

  30. [30]

    Generating random spanning trees more quickly than the cover time

    David Bruce Wilson. Generating random spanning trees more quickly than the cover time. In Proceedings of the Twenty-Eighth Annual ACM Symposium on Theory of Computing , STOC, page 296–303, 1996. 25 A Appendix A.1 Proof of Proposition 2.3 Properties 1, 2, and 3 are immediate. Property 4, holds for any r with s = 8 and ρ = 1

  31. [31]

    Indeed: we divide the circle around any point v into 8 equal-sized arcs by vertical, horizontal, and 45-degree diagonal lines. By the symmetry of the grid, a random walk from v is equally likely to exit the graph in any arc, which means that, with probability at least ρ = 1 8 it leaves through any specific arc. For 5, we will prove the stronger statement ...