A Bayesian Approach to Segmentation with Noisy Labels via Spatially Correlated Distributions

Ryu Tadokoro; Shin-ichi Maeda; Tsukasa Takagi

arXiv: 2504.14795 · v3 · submitted 2025-04-21 · eess.IV · cs.CV · cs.LG · stat.ML

Pith reviewed 2026-05-22 19:18 UTC · model grok-4.3

classification eess.IV · cs.CV · cs.LG · stat.ML

keywords semantic segmentation · noisy labels · Bayesian inference · spatial correlation · variational inference · latent Gaussian field · medical imaging · remote sensing

The pith

Modeling spatial correlations in label errors lets Bayesian segmentation approach clean-label performance on tasks such as lung segmentation under moderate noise.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a Bayesian method for semantic segmentation that explicitly accounts for label errors clustering in connected regions rather than occurring independently. It introduces the ELBO-Computable Correlated Discrete Distribution to represent these discrete errors through a continuous latent Gaussian field equipped with Kac-Murdock-Szegő structured covariance. This structure makes variational inference tractable for problems that were previously prohibitive. Experiments across segmentation tasks show that exploiting the spatial structure yields measurable gains, and in lung segmentation the results reach levels comparable to training on clean labels when noise is moderate.

Core claim

The central claim is that approximate Bayesian estimation incorporating a probabilistic model of spatially correlated label errors becomes feasible through the ELBO-Computable Correlated Discrete Distribution, which represents discrete dependencies via a continuous latent Gaussian field with Kac-Murdock-Szegő structured covariance, enabling scalable variational inference and improved segmentation accuracy on noisy training data.

What carries the argument

The ELBO-Computable Correlated Discrete Distribution (ECCD), which models discrete spatially correlated label errors by a continuous latent Gaussian field with Kac-Murdock-Szegő covariance to support tractable variational inference.
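
To make the latent-field idea concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's ECCD definition: it draws a 1D Gaussian chain whose covariance is the KMS matrix (entries ρ^|i−j|) and thresholds it to obtain binary error indicators. The thresholding link, the function name `sample_correlated_flips`, and the parameters `rho` and `flip_rate` are hypothetical choices made for illustration only.

```python
import numpy as np
from statistics import NormalDist


def sample_correlated_flips(n, rho, flip_rate, rng):
    """Illustrative only: binary error indicators from a thresholded AR(1) Gaussian chain."""
    # The AR(1) recursion produces z ~ N(0, R) with R[i, j] = rho**|i - j| (the KMS matrix).
    noise = rng.standard_normal(n)
    z = np.empty(n)
    z[0] = noise[0]
    for t in range(1, n):
        z[t] = rho * z[t - 1] + np.sqrt(1.0 - rho**2) * noise[t]
    # Each z[t] is marginally N(0, 1), so this threshold gives P(flip) = flip_rate.
    threshold = NormalDist().inv_cdf(1.0 - flip_rate)
    return z > threshold


rng = np.random.default_rng(0)
independent = sample_correlated_flips(20_000, rho=0.0, flip_rate=0.2, rng=rng)
clustered = sample_correlated_flips(20_000, rho=0.95, flip_rate=0.2, rng=rng)

# Count switches between "correct" and "error" along the chain.
switches_independent = int(np.sum(independent[1:] != independent[:-1]))
switches_clustered = int(np.sum(clustered[1:] != clustered[:-1]))
print(switches_independent, switches_clustered)
```

Both chains share the same marginal error rate, but the chain with larger `rho` switches state far less often, so its errors arrive in long runs rather than as isolated pixels.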

If this is right

Accounting for spatial correlations in label noise produces significant accuracy gains over methods that treat errors as independent.
Under moderate noise the method reaches performance comparable to clean-label training in lung segmentation.
The ECCD construction renders variational inference practical for discrete variables with spatial dependencies that were previously intractable.
The approach applies directly to medical imaging and remote sensing where annotation errors commonly form connected regions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The latent-field construction could be adapted to other vision tasks that involve spatially structured noise such as boundary detection or semantic labeling of video.
Explicit covariance structure may allow derivation of closed-form noise statistics that standard independent-noise models cannot provide.
In annotation pipelines the method could lower verification costs by tolerating clustered errors without manual correction.

Load-bearing premise

Label errors occur with spatial correlations between adjacent pixels that a continuous latent Gaussian field with Kac-Murdock-Szegő structured covariance can adequately represent.

What would settle it

Running the method on synthetic segmentation data where label flips are generated independently at each pixel with no spatial clustering; if the performance advantage over standard noisy-label baselines vanishes, the spatial-correlation modeling is not delivering the claimed benefit.
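
The control experiment described above needs two noise generators with matched flip rates, one independent and one spatially clustered. The sketch below shows one way to build them for a toy binary mask. The tile-based clustering, the function names, and the neighbor-correlation diagnostic are assumptions made for illustration; they are not taken from the paper or its code release.

```python
import numpy as np


def independent_flips(shape, rate, rng):
    """Flip each pixel independently with probability `rate`."""
    return rng.random(shape) < rate


def clustered_flips(shape, rate, block, rng):
    """Flip whole block-by-block tiles, so errors form connected regions."""
    h, w = shape
    coarse = rng.random((-(-h // block), -(-w // block))) < rate  # ceil division
    fine = np.repeat(np.repeat(coarse, block, axis=0), block, axis=1)
    return fine[:h, :w]


def neighbor_correlation(flips):
    """Pearson correlation between horizontally adjacent flip indicators."""
    left = flips[:, :-1].ravel().astype(float)
    right = flips[:, 1:].ravel().astype(float)
    return float(np.corrcoef(left, right)[0, 1])


rng = np.random.default_rng(0)
clean = np.zeros((256, 256), dtype=int)
clean[64:192, 64:192] = 1  # toy binary mask

flips_iid = independent_flips(clean.shape, 0.2, rng)
flips_blk = clustered_flips(clean.shape, 0.2, 8, rng)
noisy_iid = np.where(flips_iid, 1 - clean, clean)
noisy_blk = np.where(flips_blk, 1 - clean, clean)

corr_iid = neighbor_correlation(flips_iid)
corr_blk = neighbor_correlation(flips_blk)
print(corr_iid, corr_blk)
```

Training the same model on `noisy_iid` and `noisy_blk` would then separate the effect of the noise rate from the effect of its spatial structure.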

Original abstract

In semantic segmentation, the accuracy of models heavily depends on the high-quality annotations. However, in many practical scenarios, such as medical imaging and remote sensing, obtaining true annotations is not straightforward and usually requires significant human labor. Relying on human labor often introduces annotation errors, including mislabeling, omissions, and inconsistency between annotators. In the case of remote sensing, differences in procurement time can lead to misaligned ground-truth annotations. These label errors are not independently distributed, and instead usually appear in spatially connected regions where adjacent pixels are more likely to share the same errors. To address these issues, we propose an approximate Bayesian estimation based on a probabilistic model that assumes training data include label errors, incorporating the tendency for these errors to occur with spatial correlations between adjacent pixels. However, Bayesian inference for such spatially correlated discrete variables is notoriously intractable. To overcome this fundamental challenge, we introduce a novel class of probabilistic models, which we term the ELBO-Computable Correlated Discrete Distribution (ECCD). By representing the discrete dependencies through a continuous latent Gaussian field with a Kac-Murdock-Szegő (KMS) structured covariance, our framework enables scalable and efficient variational inference for problems previously considered computationally prohibitive. Through experiments on multiple segmentation tasks, we confirm that leveraging the spatial correlation of label errors significantly improves performance. Notably, in specific tasks such as lung segmentation, the proposed method achieves performance comparable to training with clean labels under moderate noise levels. Code is available at https://github.com/pfnet-research/Bayesian_SpatialCorr.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

private letter to a colleague

The paper makes spatially correlated noisy-label segmentation tractable with a new ECCD distribution built on a KMS Gaussian field, and the experiments show real gains, but the covariance choice looks too restrictive for typical 2D annotation noise.

The main point is that they have found a workable way to do approximate Bayesian inference when label errors cluster spatially instead of hitting pixels independently. By introducing the ECCD class and routing the discrete dependencies through a latent Gaussian field with Kac-Murdock-Szegő covariance, they turn an otherwise intractable problem into something that supports scalable variational inference. That modeling step is the actual novelty here, and it goes beyond the usual treatment of label noise as independent across pixels.

Referee Report

1 major / 2 minor

Summary. The manuscript proposes a Bayesian method for semantic segmentation under noisy labels that exhibit spatial correlations. It introduces the ELBO-Computable Correlated Discrete Distribution (ECCD), which models label errors using a latent Gaussian field equipped with Kac-Murdock-Szegő (KMS) covariance to enable tractable variational inference. Experiments across several segmentation tasks indicate that accounting for spatial label noise correlations yields performance gains, and in lung segmentation, achieves results comparable to training on clean labels at moderate noise levels.

Significance. Should the modeling assumptions prove valid and the gains robust to variations in noise structure, the work offers a valuable contribution to handling realistic label noise in medical and remote sensing imagery. The ECCD construction provides a new tool for approximate Bayesian inference on correlated discrete variables, and the public code release supports reproducibility.

major comments (1)

The central modeling choice restricts the latent Gaussian field's covariance to the Kac-Murdock-Szegő form, which corresponds to a 1D stationary AR(1) process. For 2D image grids, this imposes a separable correlation pattern that may fail to capture isotropic, diagonal, or blob-like error structures common in annotation noise. If this restriction leads to misspecification, the performance improvements over independent-noise baselines could be artifacts rather than evidence of effective spatial correlation exploitation. A concrete test would be to compare against a more flexible covariance (e.g., Matérn or learned) or to characterize the empirical correlation structure of the label errors in the datasets.
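
The separability concern can be illustrated numerically. The sketch below assumes the 2D covariance is the Kronecker product of a row KMS matrix and a column KMS matrix; that construction is an assumption made here to illustrate the referee's point and is not confirmed by the source. It compares two pixel offsets with the same Euclidean distance (5 pixels): one along an axis and one oblique.

```python
import numpy as np


def kms_matrix(n, rho):
    """KMS matrix with entries rho**|i - j|."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])


h, w, rho = 6, 6, 0.8
# Assumed separable 2D covariance over a row-major flattened h x w grid.
cov = np.kron(kms_matrix(h, rho), kms_matrix(w, rho))


def pixel(r, c):
    """Row-major index of pixel (r, c)."""
    return r * w + c


corr_axis = cov[pixel(0, 0), pixel(0, 5)]     # offset (0, 5): rho**5
corr_oblique = cov[pixel(0, 0), pixel(3, 4)]  # offset (3, 4): rho**(3 + 4)
corr_isotropic = rho ** 5.0                   # what a distance-only kernel would give for both
print(corr_axis, corr_oblique, corr_isotropic)
```

Under this assumed construction the correlation decays with the Manhattan distance rather than the Euclidean one, so obliquely offset pixels are less correlated than axis-aligned pixels at the same distance. An isotropic kernel would assign both pairs the same value.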

minor comments (2)

The abstract claims significant performance gains but does not report specific metrics, baselines, or ablation results; including key quantitative findings would strengthen the summary.
Clarify the exact parameterization of the KMS covariance matrix and how it is applied to 2D grids (e.g., row-wise or separable).

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive and detailed review. We address the major comment below and indicate the revisions we intend to incorporate.

Referee: The central modeling choice restricts the latent Gaussian field's covariance to the Kac-Murdock-Szegő form, which corresponds to a 1D stationary AR(1) process. For 2D image grids, this imposes a separable correlation pattern that may fail to capture isotropic, diagonal, or blob-like error structures common in annotation noise. If this restriction leads to misspecification, the performance improvements over independent-noise baselines could be artifacts rather than evidence of effective spatial correlation exploitation. A concrete test would be to compare against a more flexible covariance (e.g., Matérn or learned) or to characterize the empirical correlation structure of the label errors in the datasets.

Authors: The KMS covariance was deliberately chosen to ensure that the variational lower bound remains analytically tractable and computationally scalable for image-sized grids; more flexible kernels such as Matérn would generally destroy the closed-form or low-cost matrix operations required for the ECCD construction. We acknowledge that the resulting separable structure is a modeling restriction and may not capture every possible spatial pattern of annotation noise. Nevertheless, the consistent gains over independent-noise baselines across multiple tasks indicate that the captured correlations are not merely artifacts. In the revised manuscript we will add an explicit discussion of this limitation together with an analysis of the empirical label-error correlation structures observed in the datasets, and we will explore whether a limited comparison to a Matérn-based variant is feasible without sacrificing scalability.

Revision: partial
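
The "low-cost matrix operations" argument can be made concrete with a standard property of the KMS matrix: because it is the covariance of a stationary AR(1) chain, a zero-mean Gaussian log-density under it can be evaluated in O(n) time without forming or factorizing the matrix. The sketch below checks this against a dense reference. It illustrates the general property only; it is not the paper's ELBO, and the function names are hypothetical.

```python
import numpy as np


def kms_logpdf_fast(z, rho):
    """log N(z; 0, R) with R[i, j] = rho**|i - j|, evaluated in O(n) time."""
    n = z.shape[0]
    # Innovations of the equivalent AR(1) chain: z[t] - rho * z[t - 1].
    resid = z[1:] - rho * z[:-1]
    quad = z[0] ** 2 + np.sum(resid**2) / (1.0 - rho**2)
    logdet = (n - 1) * np.log(1.0 - rho**2)  # closed-form log|R|
    return -0.5 * (n * np.log(2.0 * np.pi) + logdet + quad)


def kms_logpdf_dense(z, rho):
    """Reference O(n^3) evaluation with an explicit covariance matrix."""
    n = z.shape[0]
    idx = np.arange(n)
    R = rho ** np.abs(idx[:, None] - idx[None, :])
    _, logdet = np.linalg.slogdet(R)
    quad = z @ np.linalg.solve(R, z)
    return -0.5 * (n * np.log(2.0 * np.pi) + logdet + quad)


rng = np.random.default_rng(0)
z = rng.standard_normal(50)
fast = kms_logpdf_fast(z, 0.7)
dense = kms_logpdf_dense(z, 0.7)
print(fast, dense)
```

A general kernel such as Matérn has no comparable chain factorization, which is consistent with the authors' stated reason for the restriction.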

Circularity Check

0 steps flagged

No circularity: novel model construction and experimental validation are independent

The paper defines a new model family (ECCD) by positing a latent Gaussian field with KMS covariance to represent spatially correlated discrete label errors, then derives tractable variational inference from that modeling choice. This is an explicit ansatz and construction rather than a derivation that reduces to its own inputs by construction. Performance gains are reported from external experiments on segmentation tasks with comparisons to clean-label training, which do not loop back to the model equations or any fitted parameter. No self-definitional steps, fitted-input predictions, or load-bearing self-citations appear in the provided derivation chain.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 1 invented entity

The approach rests on the newly introduced ECCD model that converts discrete spatial correlations into a continuous Gaussian field; standard variational inference assumptions are invoked to obtain the ELBO.

free parameters (1)

KMS covariance hyperparameters
Parameters controlling the structured covariance of the latent Gaussian field; chosen or optimized to match observed spatial error patterns.

axioms (1)

domain assumption: Variational inference yields a sufficiently accurate approximation to the intractable posterior over the correlated discrete labels.
Invoked to justify use of the ELBO for scalable training.
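
What "sufficiently accurate" means here can be stated with the generic ELBO decomposition. The identity below is the standard one for any latent-variable model; the symbols (image $x$, observed noisy labels $y$, latent variables $z$, variational distribution $q$) are notation assumed for this note and may not match the paper's.

```latex
\log p(y \mid x)
  \;=\; \underbrace{\mathbb{E}_{q(z)}\bigl[\log p(y, z \mid x) - \log q(z)\bigr]}_{\text{ELBO}}
  \;+\; \mathrm{KL}\bigl(q(z) \,\|\, p(z \mid x, y)\bigr)
```

Since the KL term is non-negative, the ELBO lower-bounds the log evidence, and the axiom amounts to assuming that this KL gap is small enough for the chosen variational family.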

invented entities (1)

ECCD (ELBO-Computable Correlated Discrete Distribution) · no independent evidence
purpose: To represent spatially correlated discrete label errors in a form that permits efficient variational inference.
Newly defined in the paper to overcome computational intractability of direct modeling.

pith-pipeline@v0.9.0 · 5830 in / 1395 out tokens · 56013 ms · 2026-05-22T19:18:59.069029+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean reality_from_one_distinction unclear

By representing the discrete dependencies through a continuous latent Gaussian field with a Kac-Murdock-Szegő (KMS) structured covariance, our framework enables scalable and efficient variational inference...
IndisputableMonolith/Foundation/AlexanderDuality.lean alexander_duality_circle_linking unclear

we leverage the Kac-Murdock-Szegő (KMS) matrix... $R_\rho$ and $R_\rho^{-1}$ are given by... determinant $|R_\rho| = (1-\rho^2)^{n-1}$
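
For reference, the standard textbook forms for an $n \times n$ KMS matrix with $|\rho| < 1$ are written out below. These are well-known results stated in notation assumed here; the elided expressions in the quoted passage may be presented differently in the paper.

```latex
R_\rho = \bigl[\rho^{|i-j|}\bigr]_{i,j=1}^{n}, \qquad
R_\rho^{-1} = \frac{1}{1-\rho^2}
\begin{pmatrix}
1 & -\rho & & & \\
-\rho & 1+\rho^2 & -\rho & & \\
 & \ddots & \ddots & \ddots & \\
 & & -\rho & 1+\rho^2 & -\rho \\
 & & & -\rho & 1
\end{pmatrix}, \qquad
|R_\rho| = (1-\rho^2)^{n-1}.
```

The tridiagonal inverse and the closed-form determinant are what keep Gaussian densities under this covariance cheap to evaluate.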

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.