When Brains Disagree: Biological Ambiguity Underlies the Challenge of Amyloid PET Synthesis from Structural MRI

David M. Cash; Hojjat Azadbakht; Hui Zhang; Louise E. G. Baron; Philip S. J. Weston; Ross Callaghan

arxiv: 2605.11867 · v2 · pith:J2IKCVXEnew · submitted 2026-05-12 · 💻 cs.CV

Louise E. G. Baron, Ross Callaghan, David M. Cash, Philip S. J. Weston, Hojjat Azadbakht, Hui Zhang

Pith reviewed 2026-05-13 07:17 UTC · model grok-4.3

classification 💻 cs.CV

keywords MRI-to-PET synthesis · Amyloid PET · Alzheimer's disease · Biological ambiguity · Multimodal biomarkers · Neurodegeneration · Ill-posed mapping · Plasma biomarkers

The pith

Biological ambiguity from decoupled brain processes makes MRI-to-amyloid PET synthesis inherently inconsistent.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests whether widely varying results in generating amyloid PET scans from structural MRI arise because the two modalities capture different biological processes that do not always align. MRI primarily shows neurodegeneration such as brain shrinkage, while PET measures amyloid plaque buildup, and these changes can occur at different times in Alzheimer's disease. As a result, near-identical MRI patterns can map to several possible PET outcomes. Experiments show that models learn the mapping reliably when training data are restricted to unambiguous cases, selected by stratifying on amyloid and neurodegeneration status, but accuracy drops sharply once ambiguous cases are mixed in. Adding plasma biomarkers supplies the missing context and restores stable performance.

Core claim

MRI-to-amyloid PET synthesis is intrinsically ill-posed because similar MRI patterns can correspond to different amyloid states due to the temporal decoupling of neurodegeneration and amyloid pathology. When paired data are stratified by amyloid and neurodegeneration status, standard synthesis models learn unambiguous mappings with high performance; introducing ambiguous data causes performance to collapse regardless of model architecture. Incorporating orthogonal information from plasma biomarkers resolves the ambiguity, yielding improved and more stable synthesis results.

What carries the argument

Stratification of paired MRI-PET data by amyloid and neurodegeneration status to isolate unambiguous one-to-one mappings, with multimodal plasma biomarkers serving as disambiguating inputs.
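
The stratification step can be sketched as follows. This is a minimal illustration, not the authors' code: the SUVR cutoff of 1.11 is the one quoted in the Figure 1 caption, while `ATROPHY_THRESHOLD`, the `Subject` fields, the use of `>=` at both cutoffs, and the reading of "concordant" as "amyloid and neurodegeneration status agree" are assumptions made for the example.

```python
from dataclasses import dataclass

SUVR_THRESHOLD = 1.11     # amyloid cutoff quoted in the Figure 1 caption
ATROPHY_THRESHOLD = 2.0   # hypothetical cutoff; the paper's value is not given here


@dataclass(frozen=True)
class Subject:
    """Hypothetical record for one paired MRI-PET subject."""
    subject_id: str
    cortical_suvr: float   # PET-derived amyloid summary
    atrophy_score: float   # MRI-derived neurodegeneration summary


def an_profile(subject: Subject) -> str:
    """Return the A/N profile, e.g. 'A+N-' (amyloid positive, no atrophy)."""
    a = "A+" if subject.cortical_suvr >= SUVR_THRESHOLD else "A-"
    n = "N+" if subject.atrophy_score >= ATROPHY_THRESHOLD else "N-"
    return a + n


def regime(subject: Subject) -> str:
    """Assumed definition: concordant when A and N status agree."""
    return "concordant" if an_profile(subject) in ("A+N+", "A-N-") else "discordant"


def stratify(subjects):
    """Split subjects into concordant and discordant training pools."""
    groups = {"concordant": [], "discordant": []}
    for subject in subjects:
        groups[regime(subject)].append(subject)
    return groups


subjects = [
    Subject("s1", 1.30, 3.0),  # A+N+
    Subject("s2", 0.95, 0.5),  # A-N-
    Subject("s3", 1.30, 0.5),  # A+N-
    Subject("s4", 0.95, 3.0),  # A-N+
]
groups = stratify(subjects)
```

Under these assumptions, `s2` and `s3` share the same atrophy score but fall on opposite sides of the amyloid cutoff, which is the kind of pair that makes the MRI-only mapping one-to-many.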

If this is right

Unambiguous MRI-PET mappings become learnable once data ambiguity is removed through stratification.
Performance collapse occurs specifically when biologically ambiguous cases enter the training distribution.
Multimodal inputs such as plasma biomarkers restore accuracy and consistency by resolving one-to-many mappings.
Architectural complexity alone cannot overcome the limits imposed by data ambiguity.
Meaningful advances require combining imaging with additional biological signals rather than MRI synthesis in isolation.
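
The last three points can be motivated by a standard decomposition for squared-error regression. This is a sketch under assumptions, not a derivation taken from the paper: it treats the target $y$ as a scalar amyloid summary (for example a cortical SUVR), writes $x$ for the MRI input, $z$ for plasma biomarkers, and $f$ for any synthesis model, and assumes an L2 training objective, whereas the models in the paper (such as pix2pix) use other loss terms as well.

```latex
% Expected squared error splits into a model-dependent term and an
% irreducible term that no choice of f can reduce:
\mathbb{E}\big[(y - f(x))^2\big]
  = \mathbb{E}\big[(f(x) - \mathbb{E}[y \mid x])^2\big]
  + \mathbb{E}\big[\operatorname{Var}(y \mid x)\big]

% Conditioning on an additional input z can only shrink the irreducible term
% (law of total variance, applied conditionally on x):
\mathbb{E}\big[\operatorname{Var}(y \mid x)\big]
  = \mathbb{E}\big[\operatorname{Var}(y \mid x, z)\big]
  + \mathbb{E}\big[\operatorname{Var}\big(\mathbb{E}[y \mid x, z] \mid x\big)\big]
  \;\ge\; \mathbb{E}\big[\operatorname{Var}(y \mid x, z)\big]
```

When similar MRI patterns correspond to different amyloid states, $\operatorname{Var}(y \mid x)$ is large and sets a floor that a more expressive $f$ cannot lower. An input $z$ that separates those states reduces the floor, which is consistent with the reported effect of plasma biomarkers.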

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Purely image-based synthesis methods may reach a performance ceiling in Alzheimer's research unless fluid biomarkers or other modalities are routinely included.
Similar ambiguity problems are likely in other cross-modal medical imaging tasks where the underlying biological processes evolve on different timescales.
Future models could benefit from mechanisms that detect or flag ambiguous inputs instead of always producing a single output.
Clinical use of synthesized PET images would need careful validation across populations with varying degrees of process decoupling.

Load-bearing premise

That grouping the data by amyloid and neurodegeneration status cleanly separates unambiguous mappings without introducing selection bias or hidden confounders.

What would settle it

Demonstrating high and stable synthesis accuracy on mixed ambiguous datasets using only MRI inputs without any multimodal additions.

Figures

Figures reproduced from arXiv: 2605.11867 by David M. Cash, Hojjat Azadbakht, Hui Zhang, Louise E. G. Baron, Philip S. J. Weston, Ross Callaghan.

**Figure 1.** Distribution of amyloid (A) and neurodegeneration (N) profiles in ADNI subjects. Comparable patterns of structural neurodegeneration occur across different amyloid states, indicating that MRI-derived signals may not uniquely determine amyloid burden. Amyloid status was defined using a cortical SUVR threshold of 1.11 [10], and neurodegeneration was quantified using AVRA-derived atrophy scores [11]. In our…

**Figure 2.** Amyloid load prediction across the three training regimes from Experiment I (Baseline, Concordant, and Discordant) and the plasma-conditioned model from Experiment II, using the pix2pix architecture. Columns correspond to model variants, and rows show evaluation on the full test set (top), concordant subset (middle), and discordant subset (bottom). The same ambiguity-dependent pattern is observed with lat…

Original abstract

Structural MRI-to-amyloid PET synthesis has been proposed as a non-invasive alternative for amyloid assessment in Alzheimer's disease (AD). However, reported performance of identical models varies widely across studies, and increasingly complex architectures have not led to consistent gains. This inconsistency is thought to be caused by a fundamental biological ambiguity: MRI captures neurodegeneration, while PET measures amyloid pathology - two processes that are often temporally decoupled in AD. As a result, similar MRI patterns may correspond to different amyloid states, creating ambiguous one-to-many mappings. MRI-to-amyloid PET synthesis may therefore be intrinsically ill-posed; however, this idea has yet to be tested scientifically. The aim of this work is to test this hypothesis through two controlled experiments. We first control the training distribution by stratifying paired MRI-PET data by amyloid and neurodegeneration status. Using two standard synthesis models under a controlled design, we show that biologically unambiguous mappings are learnable in isolation, but performance collapses when data ambiguity is introduced. This demonstrates that ambiguity in the data distribution, rather than architectural capacity, constrains performance. Second, we show that introducing orthogonal biological information in the form of plasma biomarkers resolves this ambiguity. When multimodal inputs are incorporated, performance improves and stability is restored. Together, these findings suggest that limited and inconsistent performance in MRI-to-amyloid PET synthesis is explained by intrinsic biological ambiguity, and that stable, meaningful progress requires multimodal integration rather than architectural complexity.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper argues that biological ambiguity from decoupled MRI and PET signals drives inconsistent synthesis results. Models trained on stratified, unambiguous subsets perform well, performance collapses once the strata are mixed, and plasma biomarkers restore it.

The main point is that this work tests whether inconsistent MRI-to-amyloid PET synthesis comes from biological ambiguity rather than weak models. The authors stratify paired data by amyloid and neurodegeneration status, train standard models on the unambiguous subsets, and get solid performance. Mixing the strata causes performance to collapse, and adding plasma biomarkers restores it. This is a direct empirical check on the idea that one-to-many mappings are the core limit, which prior studies mostly noted without isolating it this way.

Referee Report

2 major / 2 minor

Summary. The paper claims that inconsistent performance in structural MRI-to-amyloid PET synthesis arises from intrinsic biological ambiguity caused by the temporal decoupling of neurodegeneration (MRI) and amyloid pathology (PET) in Alzheimer's disease, creating one-to-many mappings. It tests this via two controlled experiments: stratifying paired MRI-PET data by amyloid and neurodegeneration status to show learnable mappings in unambiguous strata but collapse when mixed, and demonstrating that multimodal plasma biomarkers resolve the ambiguity and restore performance. The conclusion is that progress requires multimodal integration rather than architectural complexity.

Significance. If the central claim holds, the work would be significant for shifting focus in neuroimaging synthesis from model complexity to addressing data ambiguity through multimodal inputs, potentially explaining cross-study inconsistencies and guiding more stable non-invasive amyloid assessment. The use of controlled stratification with standard models and the explicit test of the ambiguity hypothesis are strengths that provide a falsifiable framework.

major comments (2)

[First experiment] §3 (first experiment): Stratification by amyloid status relies on PET-derived labels that are the synthesis target, risking selection bias since subgroups almost certainly differ in size, demographics, disease stage, and comorbidities; without explicit matching, covariate adjustment, or ablation on stratification criteria, the performance collapse cannot be cleanly attributed to mapping ambiguity rather than these confounders.
[Results] Results (performance contrasts): The described experiments lack reported quantitative metrics (e.g., exact correlation coefficients, Dice scores, or p-values for strata comparisons) and statistical controls for confounders, weakening the load-bearing claim that ambiguity—not architecture or selection effects—explains the collapse.

minor comments (2)

[Abstract] Abstract: Does not name the two standard synthesis models or provide any numerical performance values, reducing clarity for readers assessing the magnitude of effects.
[Methods] Methods: Neurodegeneration status definition and any procedures for balancing subgroup sizes or demographics are not detailed, affecting reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive comments, which have helped us strengthen the manuscript. We address each major point below and have revised the paper to incorporate additional controls, quantitative reporting, and supporting analyses where feasible.

Referee: [First experiment] §3 (first experiment): Stratification by amyloid status relies on PET-derived labels that are the synthesis target, risking selection bias since subgroups almost certainly differ in size, demographics, disease stage, and comorbidities; without explicit matching, covariate adjustment, or ablation on stratification criteria, the performance collapse cannot be cleanly attributed to mapping ambiguity rather than these confounders.

Authors: We acknowledge that defining strata using the PET target introduces potential confounders, as the resulting subgroups differ in size, demographics, disease stage, and comorbidities. Our core design isolates the effect of ambiguity by holding the model architecture and data source fixed while varying only the presence of mixed vs. unambiguous mappings. Nevertheless, to rule out selection effects more rigorously, the revised manuscript now includes: (1) full demographic and clinical tables for each stratum, (2) covariate-adjusted regression analyses, and (3) a matched ablation in which strata are balanced on age, sex, and clinical stage (where sample size permits). These additions support the conclusion that the performance drop is attributable to the introduction of biologically ambiguous pairs rather than to unbalanced confounders alone.
Revision: partial

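
A matched ablation of the kind described in this response could look roughly like the sketch below. It is an illustration only, not the authors' procedure: exact matching within cells, the 10-year age bands, the record keys (`sex`, `stage`, `age`, `id`), and the function names are all assumptions.

```python
import random
from collections import defaultdict


def cell_key(record, age_band=10):
    """Matching cell: sex, clinical stage, and a coarse age band (assumed 10 years)."""
    return (record["sex"], record["stage"], record["age"] // age_band)


def matched_subsample(group_a, group_b, seed=0):
    """Keep equal numbers from both groups inside every shared matching cell."""
    rng = random.Random(seed)  # seeded so the subsample is reproducible
    cells_a, cells_b = defaultdict(list), defaultdict(list)
    for record in group_a:
        cells_a[cell_key(record)].append(record)
    for record in group_b:
        cells_b[cell_key(record)].append(record)

    matched_a, matched_b = [], []
    # Cells present in only one group are dropped ("where sample size permits").
    for key in sorted(cells_a.keys() & cells_b.keys()):
        k = min(len(cells_a[key]), len(cells_b[key]))
        matched_a.extend(rng.sample(cells_a[key], k))
        matched_b.extend(rng.sample(cells_b[key], k))
    return matched_a, matched_b


concordant = [
    {"id": "c1", "sex": "F", "stage": "MCI", "age": 72},
    {"id": "c2", "sex": "F", "stage": "MCI", "age": 75},
    {"id": "c3", "sex": "M", "stage": "CN", "age": 68},
]
discordant = [
    {"id": "d1", "sex": "F", "stage": "MCI", "age": 78},
    {"id": "d2", "sex": "M", "stage": "AD", "age": 80},
]
matched_c, matched_d = matched_subsample(concordant, discordant)
```

Exact matching discards subjects quickly when cells are sparse, so a real analysis might prefer coarser cells or covariate adjustment; the trade-off is between balance and retained sample size.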
Referee: [Results] Results (performance contrasts): The described experiments lack reported quantitative metrics (e.g., exact correlation coefficients, Dice scores, or p-values for strata comparisons) and statistical controls for confounders, weakening the load-bearing claim that ambiguity—not architecture or selection effects—explains the collapse.

Authors: We have added the requested quantitative detail. The revised results section now reports exact Pearson correlation coefficients, MAE, and SSIM values for every stratum and condition, together with p-values from paired statistical tests comparing unambiguous vs. mixed strata. As noted in the response to the first comment, we also include covariate-adjusted models and the matched ablation results. These metrics and controls directly quantify the performance collapse and its attribution to data ambiguity.
Revision: yes
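
Per-stratum reporting of two of the metrics named in this response (Pearson correlation and MAE) can be sketched in a few lines. This is a minimal example under assumptions: it scores one scalar amyloid-load value per subject rather than whole images, SSIM is omitted because it needs an image library, and the function names and toy numbers are illustrative rather than results from the paper.

```python
import math
from collections import defaultdict


def pearson_r(pred, target):
    """Pearson correlation between two equal-length sequences."""
    n = len(pred)
    mean_p, mean_t = sum(pred) / n, sum(target) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, target))
    sd_p = math.sqrt(sum((p - mean_p) ** 2 for p in pred))
    sd_t = math.sqrt(sum((t - mean_t) ** 2 for t in target))
    if sd_p == 0 or sd_t == 0:
        return float("nan")  # correlation is undefined for constant input
    return cov / (sd_p * sd_t)


def mae(pred, target):
    """Mean absolute error."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def metrics_by_stratum(pred, target, strata):
    """Group predictions by stratum label and score each group separately."""
    grouped = defaultdict(lambda: ([], []))
    for p, t, s in zip(pred, target, strata):
        grouped[s][0].append(p)
        grouped[s][1].append(t)
    return {
        s: {"n": len(ps), "pearson_r": pearson_r(ps, ts), "mae": mae(ps, ts)}
        for s, (ps, ts) in grouped.items()
    }


# Toy values only, chosen to make the two strata behave differently.
pred = [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0]
target = [1.0, 2.0, 3.0, 4.0, 4.0, 3.0, 2.0, 1.0]
strata = ["concordant"] * 4 + ["discordant"] * 4
report = metrics_by_stratum(pred, target, strata)
```

Scoring each stratum separately matters because a pooled correlation can hide a model that tracks the target in one stratum and fails in the other.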

Circularity Check

0 steps flagged

No circularity: empirical experiments test hypothesis without self-referential derivations

The paper advances its central claim through two controlled empirical experiments on paired MRI-PET data: stratifying by amyloid/neurodegeneration status to compare learnable mappings in unambiguous vs. mixed strata, and testing multimodal plasma biomarkers for performance recovery. No equations, fitted parameters renamed as predictions, or self-definitional constructs appear. The performance contrasts are measured outcomes on held-out data rather than quantities forced by construction from the stratification criteria or model inputs. No load-bearing self-citations, uniqueness theorems, or ansatzes are invoked; the hypothesis test is self-contained against external benchmarks of synthesis accuracy.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Central claim depends on the domain premise that neurodegeneration and amyloid pathology are temporally decoupled, with no free parameters, new entities, or ad-hoc constants introduced.

axioms (1)

Domain assumption: MRI captures neurodegeneration while PET measures amyloid pathology, and these processes are often temporally decoupled in AD.
Stated directly in the abstract as the source of one-to-many mappings.

pith-pipeline@v0.9.0 · 5581 in / 1152 out tokens · 42971 ms · 2026-05-13T07:17:28.735835+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "performance collapses when data ambiguity is introduced... introducing orthogonal biological information in the form of plasma biomarkers resolves this ambiguity"

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "MRI does not uniquely determine amyloid PET... one-to-many mapping"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.