
arxiv: 1907.07320 · v1 · pith:DA5F2FXDnew · submitted 2019-07-16 · 🧮 math.ST · math.AC · stat.ME · stat.TH

What is... a Markov basis?

Pith reviewed 2026-05-24 20:55 UTC · model grok-4.3

classification 🧮 math.ST · math.AC · stat.ME · stat.TH
keywords Markov basis · algebraic statistics · contingency tables · toric ideal · fiber connectedness · lattice kernel · MCMC moves

The pith

A Markov basis is a finite set of integer vectors that connects every pair of non-negative integer solutions to Ax = b for any fixed b.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper supplies a self-contained definition of a Markov basis written for readers whose main background is in pure mathematics. It presents the object as a generating set for the integer kernel of a matrix whose columns index the cells of a contingency table. A sympathetic reader would care because the same object turns a statistical sampling problem into an algebraic one: the moves allow one to travel between all tables that share the same margins without leaving the non-negative lattice. The definition is given directly in terms of fibers and connectedness of graphs on those fibers. This framing makes the statistical use of toric ideals immediately legible to algebraists.

Core claim

A Markov basis for an integer matrix A is any finite subset B of the integer kernel of A such that, for every right-hand side vector b, the graph whose vertices are the non-negative integer solutions to Ax = b and whose edges correspond to adding or subtracting an element of B is connected.
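
The same definition can be written out symbolically. The following is a sketch in standard notation; the dimensions $d$ and $n$, the fiber symbol $\mathcal{F}(b)$, and the signs $\varepsilon_i$ are notational assumptions introduced here, and $\mathbb{N}$ is taken to include $0$.

```latex
Let $A \in \mathbb{Z}^{d \times n}$ and, for $b \in \mathbb{Z}^{d}$, define the fiber
\[
  \mathcal{F}(b) = \{\, x \in \mathbb{N}^{n} : Ax = b \,\}.
\]
A finite set $B \subset \ker_{\mathbb{Z}}(A) = \{\, u \in \mathbb{Z}^{n} : Au = 0 \,\}$
is a Markov basis for $A$ if, for every $b$ and every $x, y \in \mathcal{F}(b)$,
there exist $u_1, \dots, u_k \in B$ and signs
$\varepsilon_1, \dots, \varepsilon_k \in \{+1, -1\}$ with
\[
  y = x + \sum_{i=1}^{k} \varepsilon_i u_i
  \quad\text{and}\quad
  x + \sum_{i=1}^{j} \varepsilon_i u_i \in \mathbb{N}^{n}
  \quad\text{for all } 1 \le j \le k .
\]
```

The second condition is the substantive one: every intermediate point of the path must stay non-negative, which is exactly the statement that the path lives in the fiber graph.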

What carries the argument

Markov basis: the finite set of moves that makes the fiber graph connected for every margin vector b.

If this is right

  • Any two tables with the same margins can be reached from each other by a sequence of additions and subtractions of basis elements.
  • The moves generate the lattice kernel and, by the Diaconis-Sturmfels theorem, correspond to a generating set of the associated toric ideal; generating the lattice alone would not be enough.
  • A Markov chain Monte Carlo algorithm that proposes only these moves is irreducible on every fiber and, with a suitable acceptance step, can sample from the conditional distribution given the margins.
  • The same construction applies to any toric model whose design matrix is A.
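
The sampling consequence above can be illustrated with a minimal Python sketch. It is not taken from the paper: the names `MOVES`, `margins`, and `walk`, the starting table, and the seed are all hypothetical, and the walk uses the simplest possible rule (a symmetric proposal that is rejected when an entry would go negative), for which the uniform distribution on the fiber is stationary. Targeting a non-uniform conditional distribution would need an additional Metropolis-Hastings acceptance ratio.

```python
import random

# Hypothetical example: 2x2 tables flattened as (x11, x12, x21, x22).
# The single move below adds 1 on the diagonal and subtracts 1 off it.
MOVES = [(1, -1, -1, 1)]


def margins(x):
    """Return (row sums, column sums) of a flattened 2x2 table."""
    x11, x12, x21, x22 = x
    return (x11 + x12, x21 + x22, x11 + x21, x12 + x22)


def walk(start, moves, steps, rng):
    """Symmetric random walk on the fiber containing `start`.

    Each step proposes x + m or x - m for a uniformly chosen move m and
    stays put if the proposal has a negative entry. Because every move
    lies in the kernel of the margin map, the margins never change.
    """
    x = tuple(start)
    path = [x]
    for _ in range(steps):
        m = rng.choice(moves)
        sign = rng.choice((1, -1))
        y = tuple(xi + sign * mi for xi, mi in zip(x, m))
        if min(y) >= 0:  # accept only proposals that stay in the fiber
            x = y
        path.append(x)
    return path


path = walk((2, 1, 0, 3), MOVES, steps=1000, rng=random.Random(0))
print(sorted(set(path)))  # the distinct tables visited
```

Every table in `path` has the same margins as the starting table, so the walk never leaves the fiber; connectedness of the fiber graph is what guarantees that it can eventually reach every table in it.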

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same connectedness property could be checked algorithmically for small tables by enumerating fibers.
  • Textbooks on commutative algebra could add this definition as a concrete application of toric ideals to discrete statistics.
  • The minimal size of a Markov basis for a given model remains an open computational question that algebraists are now equipped to attack.
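
The first extension above, checking connectedness by enumerating fibers, can be sketched in a few lines of Python. This is an illustrative brute-force check, not a method from the paper: the function names `fiber` and `is_connected` and the example matrix are assumptions, and the enumeration is only practical for very small tables.

```python
from collections import deque
from itertools import product

# Hypothetical example data: design matrix of the 2x2 independence model,
# cells ordered (x11, x12, x21, x22); rows give row sums, then column sums.
A = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
]


def fiber(A, b):
    """Enumerate all non-negative integer x with A x = b by brute force.

    Assumes A has non-negative integer entries and no zero column, so
    every coordinate of a solution is at most max(b).
    """
    bound = max(b)
    n = len(A[0])
    return [
        x
        for x in product(range(bound + 1), repeat=n)
        if all(sum(a * xi for a, xi in zip(row, x)) == bi
               for row, bi in zip(A, b))
    ]


def is_connected(points, moves):
    """Breadth-first search on the fiber graph with edges x -> x +/- m."""
    if not points:
        return True
    vertices = set(points)
    seen = {points[0]}
    queue = deque([points[0]])
    while queue:
        x = queue.popleft()
        for m in moves:
            for sign in (1, -1):
                y = tuple(xi + sign * mi for xi, mi in zip(x, m))
                # Membership in `vertices` enforces both A y = b and y >= 0.
                if y in vertices and y not in seen:
                    seen.add(y)
                    queue.append(y)
    return seen == vertices


moves = [(1, -1, -1, 1)]
for b in ([1, 1, 1, 1], [2, 2, 2, 2]):
    points = fiber(A, b)
    print(b, len(points), is_connected(points, moves))
```

A check of this kind can only refute the Markov basis property or confirm it for the finitely many margin vectors tried; the definition quantifies over every b, so a proof still needs an algebraic argument.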

Load-bearing premise

That the algebraic definition of a Markov basis can be stated and motivated without assuming the reader already knows contingency-table models or conditional inference.

What would settle it

A pure mathematician reads the definition but then cannot exhibit even one element of a Markov basis for the independence model on a 2-by-2 table, or cannot verify that it connects the two tables whose row sums and column sums are both (1,1).
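
For reference, the check asked for here can be worked out by hand. The sketch below assumes the cells are ordered $(x_{11}, x_{12}, x_{21}, x_{22})$ and that the rows of $A$ record the two row sums followed by the two column sums; this ordering is a convention chosen for illustration.

```latex
\[
A = \begin{pmatrix}
  1 & 1 & 0 & 0 \\
  0 & 0 & 1 & 1 \\
  1 & 0 & 1 & 0 \\
  0 & 1 & 0 & 1
\end{pmatrix},
\qquad
u = (1, -1, -1, 1)^{\top},
\qquad
Au = 0 .
\]
The only two non-negative integer tables with all row and column sums equal to $1$
differ by exactly this move:
\[
\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}
+
\begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}
=
\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.
\]
```

Since the fiber has only these two points and one step of $u$ joins them, the fiber graph for these margins is connected.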

Original abstract

This short piece defines a Markov basis. The aim is to introduce the statistical concept to mathematicians.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 1 minor

Summary. The manuscript is a short expository note whose central claim is that the standard definition of a Markov basis from algebraic statistics can be stated in a self-contained manner that is accessible and meaningful to readers whose primary background is in pure mathematics rather than statistics.

Significance. If the presentation succeeds, the note provides a concise bridge between algebraic statistics and pure mathematics by making the definition of Markov bases available without requiring statistical prerequisites. The paper's strength is its explicit focus on a definitional exposition with no derivations, predictions, or fitted quantities, which aligns with the expository goal and avoids any internal inconsistency or circularity.

minor comments (1)
  1. The abstract could more explicitly indicate the target audience (pure mathematicians) and note that the exposition is limited to the definition itself.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive assessment of the manuscript and for recommending acceptance. The report contains no major comments requiring a point-by-point response.

Circularity Check

0 steps flagged

No significant circularity; purely expository definition

full rationale

The paper is an expository note whose sole purpose is to state the standard definition of a Markov basis from algebraic statistics in language accessible to pure mathematicians. No derivations, predictions, fitted quantities, or deductive chains exist in the manuscript. The central content is definitional rather than deductive, with no self-citations serving as load-bearing premises that reduce any claim to its own inputs by construction. The presentation is self-contained against external benchmarks and does not invoke any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The paper introduces no free parameters, axioms, or invented entities because it is purely expository and definitional.

pith-pipeline@v0.9.0 · 5517 in / 909 out tokens · 23069 ms · 2026-05-24T20:55:45.640331+00:00 · methodology

