A Bayesian Approach to Segmentation with Noisy Labels via Spatially Correlated Distributions
Pith reviewed 2026-05-22 19:18 UTC · model grok-4.3
The pith
Modeling spatial correlations in label errors lets Bayesian segmentation match clean-label performance in tasks like lung scans.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that approximate Bayesian estimation incorporating a probabilistic model of spatially correlated label errors becomes feasible through the ELBO-Computable Correlated Discrete Distribution, which represents discrete dependencies via a continuous latent Gaussian field with Kac-Murdock-Szegő structured covariance, enabling scalable variational inference and improved segmentation accuracy on noisy training data.
What carries the argument
The ELBO-Computable Correlated Discrete Distribution (ECCD), which models discrete spatially correlated label errors by a continuous latent Gaussian field with Kac-Murdock-Szegő covariance to support tractable variational inference.
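The tractability argument rests on standard closed-form identities for the KMS matrix, whose entries are ρ^|i−j|. The following is a minimal NumPy sketch of those identities (tridiagonal inverse and determinant (1−ρ²)^(n−1)), not the paper's implementation; the function names `kms_matrix`, `kms_inverse`, and `kms_logdet` are hypothetical and chosen only for illustration.

```python
import numpy as np

def kms_matrix(n: int, rho: float) -> np.ndarray:
    """Dense n x n KMS matrix with entries rho^|i-j| (AR(1) correlation)."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def kms_inverse(n: int, rho: float) -> np.ndarray:
    """Closed-form tridiagonal inverse of the KMS matrix (n >= 2, |rho| < 1)."""
    inv = np.zeros((n, n))
    diag = np.full(n, 1.0 + rho**2)
    diag[0] = diag[-1] = 1.0          # boundary entries differ from the interior
    inv[np.arange(n), np.arange(n)] = diag
    off = np.arange(n - 1)
    inv[off, off + 1] = -rho          # superdiagonal
    inv[off + 1, off] = -rho          # subdiagonal
    return inv / (1.0 - rho**2)

def kms_logdet(n: int, rho: float) -> float:
    """log|R_rho| = (n - 1) * log(1 - rho^2), without any factorization."""
    return (n - 1) * np.log1p(-rho**2)

if __name__ == "__main__":
    R = kms_matrix(6, 0.7)
    print(np.allclose(R @ kms_inverse(6, 0.7), np.eye(6)))
    print(np.isclose(np.linalg.slogdet(R)[1], kms_logdet(6, 0.7)))
```

Because the inverse is tridiagonal and the determinant is closed-form, Gaussian log-density terms can be evaluated in time linear in n rather than cubic, which is presumably what makes image-sized grids feasible.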
If this is right
- Accounting for spatial correlations in label noise produces significant accuracy gains over methods that treat errors as independent.
- Under moderate noise the method reaches performance comparable to clean-label training in lung segmentation.
- The ECCD construction renders variational inference practical for discrete variables with spatial dependencies that were previously intractable.
- The approach applies directly to medical imaging and remote sensing where annotation errors commonly form connected regions.
Where Pith is reading between the lines
- The latent-field construction could be adapted to other vision tasks that involve spatially structured noise such as boundary detection or semantic labeling of video.
- Explicit covariance structure may allow derivation of closed-form noise statistics that standard independent-noise models cannot provide.
- In annotation pipelines the method could lower verification costs by tolerating clustered errors without manual correction.
Load-bearing premise
Label errors occur with spatial correlations between adjacent pixels that a continuous latent Gaussian field with Kac-Murdock-Szegő structured covariance can adequately represent.
What would settle it
Running the method on synthetic segmentation data where label flips are generated independently at each pixel with no spatial clustering; if the performance advantage over standard noisy-label baselines vanishes, the spatial-correlation modeling is not delivering the claimed benefit.
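A sketch of how the two synthetic noise conditions for such a test might be generated is given here. It assumes binary masks and, for the correlated case, thresholds a latent Gaussian field built from AR(1) recursions along both axes; that thresholding construction is an illustrative assumption and not the paper's ECCD definition. All function names are hypothetical.

```python
import numpy as np

def independent_flips(shape, flip_rate, rng):
    """Flip each pixel independently with probability flip_rate."""
    return rng.random(shape) < flip_rate

def ar1_filter(noise, rho, axis):
    """Stationary unit-variance AR(1) recursion along one axis (KMS covariance)."""
    x = np.moveaxis(noise, axis, 0).copy()
    scale = np.sqrt(1.0 - rho**2)
    for t in range(1, x.shape[0]):
        x[t] = rho * x[t - 1] + scale * x[t]
    return np.moveaxis(x, 0, axis)

def correlated_flips(shape, flip_rate, rho, rng):
    """Flip pixels where a separable-KMS Gaussian field exceeds a quantile."""
    z = rng.standard_normal(shape)
    z = ar1_filter(ar1_filter(z, rho, axis=0), rho, axis=1)
    threshold = np.quantile(z, 1.0 - flip_rate)  # match the marginal flip rate
    return z > threshold

def apply_flips(clean_mask, flips):
    """Invert a binary {0, 1} mask wherever flips is True."""
    return np.where(flips, 1 - clean_mask, clean_mask)

def neighbor_agreement(mask):
    """Fraction of horizontally adjacent pixel pairs sharing the same state."""
    return float(np.mean(mask[:, 1:] == mask[:, :-1]))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ind = independent_flips((128, 128), 0.2, rng)
    cor = correlated_flips((128, 128), 0.2, 0.9, rng)
    print(ind.mean(), cor.mean())
    print(neighbor_agreement(ind), neighbor_agreement(cor))
```

Both conditions share the same marginal flip rate, so any difference in downstream segmentation accuracy can be attributed to the spatial structure of the noise rather than to its amount.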
Original abstract
In semantic segmentation, the accuracy of models heavily depends on the high-quality annotations. However, in many practical scenarios, such as medical imaging and remote sensing, obtaining true annotations is not straightforward and usually requires significant human labor. Relying on human labor often introduces annotation errors, including mislabeling, omissions, and inconsistency between annotators. In the case of remote sensing, differences in procurement time can lead to misaligned ground-truth annotations. These label errors are not independently distributed, and instead usually appear in spatially connected regions where adjacent pixels are more likely to share the same errors. To address these issues, we propose an approximate Bayesian estimation based on a probabilistic model that assumes training data include label errors, incorporating the tendency for these errors to occur with spatial correlations between adjacent pixels. However, Bayesian inference for such spatially correlated discrete variables is notoriously intractable. To overcome this fundamental challenge, we introduce a novel class of probabilistic models, which we term the ELBO-Computable Correlated Discrete Distribution (ECCD). By representing the discrete dependencies through a continuous latent Gaussian field with a Kac-Murdock-Szegő (KMS) structured covariance, our framework enables scalable and efficient variational inference for problems previously considered computationally prohibitive. Through experiments on multiple segmentation tasks, we confirm that leveraging the spatial correlation of label errors significantly improves performance. Notably, in specific tasks such as lung segmentation, the proposed method achieves performance comparable to training with clean labels under moderate noise levels. Code is available at https://github.com/pfnet-research/Bayesian_SpatialCorr.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a Bayesian method for semantic segmentation under noisy labels that exhibit spatial correlations. It introduces the ELBO-Computable Correlated Discrete Distribution (ECCD), which models label errors using a latent Gaussian field equipped with Kac-Murdock-Szegő (KMS) covariance to enable tractable variational inference. Experiments across several segmentation tasks indicate that accounting for spatial label noise correlations yields performance gains, and in lung segmentation, achieves results comparable to training on clean labels at moderate noise levels.
Significance. Should the modeling assumptions prove valid and the gains robust to variations in noise structure, the work offers a valuable contribution to handling realistic label noise in medical and remote sensing imagery. The ECCD construction provides a new tool for approximate Bayesian inference on correlated discrete variables, and the public code release supports reproducibility.
major comments (1)
- The central modeling choice restricts the latent Gaussian field's covariance to the Kac-Murdock-Szegő form, which corresponds to a 1D stationary AR(1) process. For 2D image grids, this imposes a separable correlation pattern that may fail to capture isotropic, diagonal, or blob-like error structures common in annotation noise. If this restriction leads to misspecification, the performance improvements over independent-noise baselines could be artifacts rather than evidence of effective spatial correlation exploitation. A concrete test would be to compare against a more flexible covariance (e.g., Matérn or learned) or to characterize the empirical correlation structure of the label errors in the datasets.
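One way to characterize the empirical correlation structure the referee asks about is to measure how strongly the error indicator correlates with shifted copies of itself in the horizontal, vertical, and diagonal directions. The sketch below assumes a binary error mask (noisy label differs from clean label) is available; `offset_correlation` is a hypothetical helper, and the result is undefined (NaN) for a constant mask.

```python
import numpy as np

def offset_correlation(error_mask, dy, dx):
    """Pearson correlation between an error mask and its copy shifted by (dy, dx)."""
    e = np.asarray(error_mask, dtype=float)
    h, w = e.shape
    # Crop two equally sized windows so that b is a displaced by (dy, dx).
    a = e[max(0, -dy): h - max(0, dy), max(0, -dx): w - max(0, dx)]
    b = e[max(0, dy): h - max(0, -dy), max(0, dx): w - max(0, -dx)]
    return float(np.corrcoef(a.ravel(), b.ravel())[0, 1])

if __name__ == "__main__":
    # Toy error mask: a single rectangular blob of mislabeled pixels.
    blob = np.zeros((32, 32), dtype=bool)
    blob[8:20, 10:24] = True
    for name, (dy, dx) in {"horizontal": (0, 1), "vertical": (1, 0),
                           "diagonal": (1, 1), "anti-diagonal": (1, -1)}.items():
        print(name, round(offset_correlation(blob, dy, dx), 3))
```

If the diagonal and anti-diagonal correlations measured on real annotation errors differ markedly from each other, or from what the axis-aligned correlations suggest, that would indicate structure a separable covariance may not capture.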
minor comments (2)
- The abstract states consistent performance gains but does not report specific metrics, baselines, or ablation results; including key quantitative findings would strengthen the summary.
- Clarify the exact parameterization of the KMS covariance matrix and how it is applied to 2D grids (e.g., row-wise or separable).
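For reference, one possible reading of "separable" is a Kronecker product of two 1D KMS matrices, one per image axis. The sketch below illustrates that reading only; the source does not state the paper's actual parameterization, and the names `separable_kms_cov`, `rho_y`, and `rho_x` are assumptions introduced here.

```python
import numpy as np

def kms(n: int, rho: float) -> np.ndarray:
    """1D KMS matrix with entries rho^|i-j|."""
    i = np.arange(n)
    return rho ** np.abs(i[:, None] - i[None, :])

def separable_kms_cov(h: int, w: int, rho_y: float, rho_x: float) -> np.ndarray:
    """Covariance of a row-major-flattened h x w field.

    Cov[(i, j), (k, l)] = rho_y^|i-k| * rho_x^|j-l|.
    """
    return np.kron(kms(h, rho_y), kms(w, rho_x))

def separable_kms_logdet(h: int, w: int, rho_y: float, rho_x: float) -> float:
    """log|A kron B| = w * log|A| + h * log|B|, using the KMS determinant."""
    return (w * (h - 1) * np.log1p(-rho_y**2)
            + h * (w - 1) * np.log1p(-rho_x**2))

if __name__ == "__main__":
    h, w = 3, 4
    cov = separable_kms_cov(h, w, 0.5, 0.8)
    # Correlation between pixel (0, 0) and its diagonal neighbor (1, 1).
    print(cov[0 * w + 0, 1 * w + 1])
    print(np.isclose(np.linalg.slogdet(cov)[1],
                     separable_kms_logdet(h, w, 0.5, 0.8)))
```

Under this reading, the correlation between diagonal neighbors is the product rho_y · rho_x, which is the separable pattern the major comment refers to; the dense Kronecker matrix is built here only for checking and would not be formed at image scale.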
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed review. We address the major comment below and indicate the revisions we intend to incorporate.
- Referee: The central modeling choice restricts the latent Gaussian field's covariance to the Kac-Murdock-Szegő form, which corresponds to a 1D stationary AR(1) process. For 2D image grids, this imposes a separable correlation pattern that may fail to capture isotropic, diagonal, or blob-like error structures common in annotation noise. If this restriction leads to misspecification, the performance improvements over independent-noise baselines could be artifacts rather than evidence of effective spatial correlation exploitation. A concrete test would be to compare against a more flexible covariance (e.g., Matérn or learned) or to characterize the empirical correlation structure of the label errors in the datasets.
  Authors: The KMS covariance was deliberately chosen to ensure that the variational lower bound remains analytically tractable and computationally scalable for image-sized grids; more flexible kernels such as Matérn would generally destroy the closed-form or low-cost matrix operations required for the ECCD construction. We acknowledge that the resulting separable structure is a modeling restriction and may not capture every possible spatial pattern of annotation noise. Nevertheless, the consistent gains over independent-noise baselines across multiple tasks indicate that the captured correlations are not merely artifacts. In the revised manuscript we will add an explicit discussion of this limitation together with an analysis of the empirical label-error correlation structures observed in the datasets, and we will explore whether a limited comparison to a Matérn-based variant is feasible without sacrificing scalability.
  Revision: partial
Circularity Check
No circularity: novel model construction and experimental validation are independent
The paper defines a new model family (ECCD) by positing a latent Gaussian field with KMS covariance to represent spatially correlated discrete label errors, then derives tractable variational inference from that modeling choice. This is an explicit ansatz and construction rather than a derivation that reduces to its own inputs by construction. Performance gains are reported from external experiments on segmentation tasks with comparisons to clean-label training, which do not loop back to the model equations or any fitted parameter. No self-definitional steps, fitted-input predictions, or load-bearing self-citations appear in the provided derivation chain.
Axiom & Free-Parameter Ledger
free parameters (1)
- KMS covariance hyperparameters
axioms (1)
- Domain assumption: Variational inference yields a sufficiently accurate approximation to the intractable posterior over the correlated discrete labels.
invented entities (1)
- ECCD (ELBO-Computable Correlated Discrete Distribution): no independent evidence
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean, reality_from_one_distinction (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "By representing the discrete dependencies through a continuous latent Gaussian field with a Kac-Murdock-Szegő (KMS) structured covariance, our framework enables scalable and efficient variational inference..."
- IndisputableMonolith/Foundation/AlexanderDuality.lean, alexander_duality_circle_linking (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "we leverage the Kac-Murdock-Szegő (KMS) matrix... R_ρ and R_ρ^{-1} are given by... determinant |R_ρ| = (1−ρ²)^{n−1}"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.