
arxiv: 2606.21368 · v1 · pith:5JEN3SLDnew · submitted 2026-06-19 · 💻 cs.CV · cs.AI · cs.LG

Graph-of-Differences: Anatomy-Structured Difference Alignment for Medical Image Re-Identification

Pith reviewed 2026-06-26 14:34 UTC · model grok-4.3

classification 💻 cs.CV · cs.AI · cs.LG
keywords medical image re-identification · anatomy graphs · difference alignment · interpretability · fundus images · chest X-ray · generalization

The pith

Representing medical images as anatomy graphs and aligning differences over matched nodes improves re-identification accuracy and generalization.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Graph-of-Differences (GoD) to ground comparisons in explicit anatomical structures for medical image re-identification. Each image becomes an anatomy graph with nodes for named regions, soft correspondences are found between the nodes of an image pair, and differences are computed on the matched parts. An alignment objective connects these local differences to the overall image difference. This setup is meant to avoid shortcut learning and to enable explanations based on specific anatomy instead of pixels. The practical payoff, if the results hold, is more accurate and more auditable linking of patient records across scans.

Core claim

GoD represents each image as an anatomy graph, establishes soft node correspondence for image pairs, computes differences over matched anatomy, and uses a graph-level difference alignment objective to tie these to the global backbone difference, ensuring the retrieval signal is anchored in homologous structures.
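
The pipeline in this claim (soft correspondence, matched differences, alignment to the global difference) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the paper's equations are not reproduced on this page, so the cosine-similarity softmax, the temperature `tau`, the mean pooling, and the cosine form of the alignment term are all assumptions, as is the choice to give node features and the global descriptor the same dimension.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    # Small epsilon guards against zero-norm vectors.
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)) + 1e-12)

def soft_correspondence(nodes_a, nodes_b, tau=0.1):
    """Row-stochastic matrix P; P[i][j] softly matches node i of A to node j of B.
    Assumed form: softmax over cosine similarities with temperature tau."""
    P = []
    for ha in nodes_a:
        logits = [cosine(ha, hb) / tau for hb in nodes_b]
        top = max(logits)
        exps = [math.exp(v - top) for v in logits]  # numerically stable softmax
        total = sum(exps)
        P.append([e / total for e in exps])
    return P

def matched_differences(nodes_a, nodes_b, P):
    """Difference between each node of A and its soft-matched counterpart in B."""
    dim = len(nodes_a[0])
    diffs = []
    for ha, row in zip(nodes_a, P):
        matched = [sum(p * hb[d] for p, hb in zip(row, nodes_b)) for d in range(dim)]
        diffs.append([ha[d] - matched[d] for d in range(dim)])
    return diffs

def alignment_loss(node_diffs, global_diff):
    """Assumed objective: 1 - cosine(mean-pooled node differences, global difference)."""
    dim = len(global_diff)
    pooled = [sum(d[k] for d in node_diffs) / len(node_diffs) for k in range(dim)]
    return 1.0 - cosine(pooled, global_diff)
```

With two graphs that contain the same node features in a different order, the soft matching recovers the permutation and the matched differences shrink toward zero, which is the behaviour the "homologous structures" argument relies on.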

What carries the argument

The anatomy graph with soft node correspondence and graph-level difference alignment objective that anchors differences to named structures.

If this is right

  • On internal benchmarks, Rank-1 accuracy increases by 7.1 percentage points on fundus images and 3.1 on CXR over a frozen-backbone baseline.
  • Gains extend to zero-shot external transfers, indicating better generalization.
  • Explanations become verifiable through node insertion and deletion tests on named graph nodes.
  • The method reduces vulnerability to shortcut learning from non-anatomical features.
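
For readers less familiar with the metric, Rank-1 is the fraction of queries whose single nearest gallery item belongs to the same patient, so a "+7.1 pp" gain is a difference between two such fractions. The sketch below assumes cosine similarity over embeddings and uses hypothetical toy data; the paper's actual evaluation protocol (gallery construction, any same-session exclusions) is not described on this page.

```python
import math

def cosine(a, b):
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / (den + 1e-12)

def rank1_accuracy(query_emb, query_ids, gallery_emb, gallery_ids):
    """Fraction of queries whose most similar gallery item has the same identity."""
    hits = 0
    for q, qid in zip(query_emb, query_ids):
        best = max(range(len(gallery_emb)), key=lambda j: cosine(q, gallery_emb[j]))
        hits += int(gallery_ids[best] == qid)
    return hits / len(query_emb)

# Hypothetical toy embeddings: the third query is retrieved incorrectly.
queries = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
gallery = [[0.9, 0.1], [0.1, 0.9], [1.0, -1.0]]
ids = ["p1", "p2", "p3"]
r1 = rank1_accuracy(queries, ids, gallery, ids)
```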

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The approach may apply to other tasks requiring anatomical consistency, such as segmentation or registration.
  • Structured explanations could support regulatory requirements for AI in healthcare.
  • Performance might further improve if node correspondences incorporate domain-specific priors.

Load-bearing premise

Soft node correspondence between anatomy graphs from different images can be established reliably enough that the resulting differences are meaningful and not dominated by correspondence errors.

What would settle it

The central claim would be falsified if the full model with the anatomy graph and alignment components performed no better than the frozen-backbone baseline, or if correspondence errors were shown to dominate the computed differences so that they no longer reflect matched anatomy.

Figures

Figures reproduced from arXiv: 2606.21368 by Abhijit Das, Dwarikanath Mahapatra, Imran Razzak, Nichula Wasalathilaka.

Figure 1: CXR anatomy graph construction. (a) Input radiograph. (b) Nodes sampled from CheXmask masks: left lung boundary, left interior, right lung boundary, right interior, heart boundary, heart interior. (c) Graph G(x) = (V, E, H) with k-NN (k = 6) and lung-symmetry edges; node features pooled from the frozen map F(x) (Eq. 1). The gap accuracy cannot close. State-of-the-art MedReID (e.g., MaMI [16]) aligns inter-…
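
The edge construction described in panel (c) can be illustrated as follows. This is a sketch under stated assumptions: nodes carry 2-D coordinates and a region label, k-NN uses Euclidean distance, and a symmetry edge links each left-lung node to the right-lung node nearest to its mirror image about a vertical midline. The function names, the label strings, and the mirroring rule are hypothetical and are not taken from the paper.

```python
import math

def knn_edges(coords, k=6):
    """Undirected k-nearest-neighbour edges over node coordinates."""
    edges = set()
    for i, p in enumerate(coords):
        others = sorted((j for j in range(len(coords)) if j != i),
                        key=lambda j: math.dist(p, coords[j]))
        for j in others[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges

def symmetry_edges(coords, labels, midline_x):
    """Link each left-lung node to the right-lung node closest to its mirror image.
    The mirroring rule is an assumption made for illustration."""
    right = [j for j, lab in enumerate(labels) if lab.startswith("right_lung")]
    edges = set()
    if not right:
        return edges
    for i, lab in enumerate(labels):
        if lab.startswith("left_lung"):
            mirrored = (2 * midline_x - coords[i][0], coords[i][1])
            j = min(right, key=lambda r: math.dist(mirrored, coords[r]))
            edges.add((min(i, j), max(i, j)))
    return edges

# Hypothetical five-node layout: two nodes per lung plus one heart node.
coords = [(2.0, 1.0), (2.0, 3.0), (8.0, 1.0), (8.0, 3.0), (5.0, 4.0)]
labels = ["left_lung_boundary", "left_lung_interior",
          "right_lung_boundary", "right_lung_interior", "heart_interior"]
edges = knn_edges(coords, k=6) | symmetry_edges(coords, labels, midline_x=5.0)
```

With only five nodes and k = 6, every node is connected to every other node, so the symmetry edges add nothing new; on denser node sets they would connect homologous left and right lung regions that k-NN alone leaves unlinked.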
Figure 2: Overview of Graph-of-Differences (GoD). A frozen MaMI encoder (ViT-B + ComPA) yields a global descriptor g(x) and feature map F(x). The Anatomy Graph Constructor (AGC) pools node features from F(x) (Eq. 1) and builds graphs with k-NN and symmetry edges. A shared Graph Encoder yields gG(x), fused with g(x) into the embedding z(x). Per pair, soft correspondence computes anatomy-matched differences ∆G, aligne…
Figure 3: Anatomy-aware retrieval explanations. Each row: one query and top-4 retrievals; green / red borders = correct / incorrect. Successful retrievals concentrate attribution on stable structures; failures are diffuse. failure cases show diffuse or peripheral attributions, consistent with confounders such as exposure shifts or device artefacts. Because attributions are over named nodes, a clinician can directly …
Figure 4: Node faithfulness audit. Removing top-m attributed nodes degrades R1 more than random deletion (CXR: ΔAUC_R1 = 0.0195, CI [0.0046, 0.0342]; Fundus: 0.0223, CI [0.0107, 0.0348]); insertion restores R1 faster at small m. This confirms attributed nodes are causally relied upon, not spurious gradients.
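
The deletion half of such an audit can be sketched as below. It assumes a scoring function that maps a set of kept node indices to a retrieval score, and it takes ΔAUC to be the mean area under random-deletion curves minus the area under the attribution-ordered deletion curve, computed with the trapezoidal rule. The normalization, the sign convention, and the toy `score_fn` are hypothetical; the paper's protocol may differ.

```python
import random

def deletion_curve(order, n_nodes, score_fn):
    """Scores after removing the first m nodes of `order`, for m = 0..len(order)."""
    kept = set(range(n_nodes))
    curve = [score_fn(kept)]
    for node in order:
        kept = kept - {node}
        curve.append(score_fn(kept))
    return curve

def auc(curve):
    """Trapezoidal area under a unit-step curve, divided by the number of steps."""
    steps = len(curve) - 1
    return sum((curve[i] + curve[i + 1]) / 2 for i in range(steps)) / steps

def delta_auc(attributions, score_fn, n_random=100, seed=0):
    """Mean random-deletion AUC minus attributed-deletion AUC.
    Positive values mean highly attributed nodes matter more than random ones."""
    n = len(attributions)
    ranked = sorted(range(n), key=lambda i: -attributions[i])
    attributed = auc(deletion_curve(ranked, n, score_fn))
    rng = random.Random(seed)
    random_aucs = []
    for _ in range(n_random):
        order = list(range(n))
        rng.shuffle(order)
        random_aucs.append(auc(deletion_curve(order, n, score_fn)))
    return sum(random_aucs) / n_random - attributed

# Hypothetical stand-in for a retrieval score: each node contributes a fixed weight.
weights = [0.5, 0.3, 0.1, 0.1]
score_fn = lambda kept: sum(weights[i] for i in kept)
gap = delta_auc(weights, score_fn)
```

An insertion test is the mirror image: start from an empty node set, add nodes in attribution order, and compare the resulting curve against random insertion.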
Original abstract

Medical image re-identification (MedReID) enables longitudinal patient linkage but remains vulnerable to shortcut learning and often produces decisions that clinicians cannot audit against named anatomy. We propose Graph-of-Differences (GoD), which grounds identity comparisons in explicit anatomical structure. Each image is represented as an anatomy graph whose nodes correspond to named anatomical regions; given an image pair, soft node correspondence is established, and differences are computed over matched anatomy. A graph-level difference alignment objective ties these anatomy-matched differences to the global backbone difference, ensuring the retrieval signal is anchored in homologous structures rather than arbitrary spatial tokens. Explanations are defined over named graph nodes and quantitatively audited via node insertion/deletion tests, replacing unstable pixel heatmaps with verifiable structure-level evidence. On internal benchmarks, GoD improves Rank-1 by +7.1 pp on fundus and +3.1 pp on CXR over a strong frozen-backbone baseline, with further gains on zero-shot external transfers confirming that anatomy grounding improves both accuracy and generalization. Code is available at https://github.com/GenMI-Lab/GoD.git.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes Graph-of-Differences (GoD) for medical image re-identification (MedReID). Images are represented as anatomy graphs with nodes for named anatomical regions. For pairs, soft node correspondence is established and differences computed over matched nodes; a graph-level alignment objective ties these anatomy-specific differences to the global backbone difference. This is claimed to reduce shortcut learning, improve Rank-1 by +7.1 pp (fundus) and +3.1 pp (CXR) over a frozen-backbone baseline, yield further zero-shot external gains, and enable auditable explanations via named nodes with insertion/deletion tests. Code is released.

Significance. If the results hold after addressing validation gaps, the contribution would be significant for MedReID by replacing opaque spatial-token comparisons with explicit, named anatomical structure. The reported accuracy and generalization improvements, combined with structure-level explanations and public code, would advance both performance and clinical auditability in longitudinal patient linkage tasks.

major comments (2)
  1. [Method] Method section (description of soft node correspondence and graph-level alignment): the central claim attributes performance gains to anatomy-grounded differences, yet no quantitative validation of correspondence quality (e.g., precision/recall on held-out annotated node pairs) or ablation replacing learned correspondence with random/uniform matching is provided; without these, it is impossible to confirm that reported improvements arise from meaningful anatomical variation rather than the auxiliary alignment loss or backbone features.
  2. [Experiments] Experiments section (internal benchmarks and zero-shot transfers): the +7.1 pp and +3.1 pp Rank-1 gains and external-transfer results are presented without error bars, statistical significance tests, or explicit baseline definitions; this weakens the ability to assess whether the anatomy-grounding component is the load-bearing driver of the claimed generalization benefit.
minor comments (2)
  1. [Abstract] Abstract and Experiments: the description of the frozen-backbone baseline and the temperature parameter in soft correspondence lack implementation specifics that would aid reproducibility.
  2. [Explanations] Explanations section: while node insertion/deletion tests are mentioned, the quantitative auditing protocol (e.g., how many nodes, how deletion is performed) should be detailed with pseudocode or equations for clarity.
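
The control matchings requested in major comment 1 are simple to construct if correspondence is represented as a row-stochastic matrix, which is an assumption here since the paper's representation is not shown on this page. A minimal sketch with hypothetical helper names: a uniform matching spreads each node's mass over all nodes of the other graph, and a random matching assigns each node to one node chosen by a random permutation.

```python
import random

def uniform_matching(n_a, n_b):
    """Every node of A is matched equally to all nodes of B."""
    return [[1.0 / n_b] * n_b for _ in range(n_a)]

def random_matching(n, seed=0):
    """Hard one-to-one matching given by a random permutation of n nodes."""
    perm = list(range(n))
    random.Random(seed).shuffle(perm)
    return [[1.0 if j == perm[i] else 0.0 for j in range(n)] for i in range(n)]

U = uniform_matching(6, 6)
R = random_matching(6, seed=1)
```

Swapping either matrix in place of the learned correspondence, with everything else held fixed, would show how much of the reported gain depends on matching homologous anatomy.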

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their thoughtful comments, which highlight important aspects for validating the method and strengthening the experimental presentation. We provide point-by-point responses below and will make revisions to the manuscript as indicated.

Point-by-point responses
  1. Referee: [Method] Method section (description of soft node correspondence and graph-level alignment): the central claim attributes performance gains to anatomy-grounded differences, yet no quantitative validation of correspondence quality (e.g., precision/recall on held-out annotated node pairs) or ablation replacing learned correspondence with random/uniform matching is provided; without these, it is impossible to confirm that reported improvements arise from meaningful anatomical variation rather than the auxiliary alignment loss or backbone features.

    Authors: We agree that additional validation of the soft node correspondence would be beneficial. While the manuscript uses insertion/deletion tests on named nodes to audit explanations, we did not include direct metrics like precision/recall on annotated correspondences or an ablation with random matching. In the revised version, we will include an ablation study comparing the learned correspondence against random and uniform matching baselines, reporting the impact on Rank-1 accuracy. This will help confirm that the gains stem from the anatomy-structured differences. Note that creating held-out annotated node pairs would require new annotations not present in the original datasets, so we focus on the ablation instead. revision: yes

  2. Referee: [Experiments] Experiments section (internal benchmarks and zero-shot transfers): the +7.1 pp and +3.1 pp Rank-1 gains and external-transfer results are presented without error bars, statistical significance tests, or explicit baseline definitions; this weakens the ability to assess whether the anatomy-grounding component is the load-bearing driver of the claimed generalization benefit.

    Authors: The current manuscript presents the gains without error bars or significance tests, which is a valid observation. We will revise the experiments section to include results averaged over multiple random seeds with standard deviations (error bars), perform statistical significance tests (e.g., t-tests) on the improvements, and provide explicit definitions of the baselines in the text, tables, and captions. This will better demonstrate the contribution of the anatomy-grounding component. revision: yes
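
The multi-seed reporting described in this response could look like the following. It is a minimal sketch with hypothetical per-seed Rank-1 values (placeholders, not results from the paper) that computes the mean gain, its sample standard deviation, and a paired t statistic; turning the statistic into a p-value would require a t-distribution table or a statistics library.

```python
import math
import statistics

def paired_summary(baseline, method):
    """Mean gain, sample standard deviation of the gains, and paired t statistic."""
    diffs = [m - b for b, m in zip(baseline, method)]
    mean_gain = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    t_stat = mean_gain / (sd / math.sqrt(len(diffs)))
    return mean_gain, sd, t_stat

# Hypothetical Rank-1 (%) over five seeds; illustration only.
baseline = [70.0, 71.0, 69.0, 70.0, 70.0]
method = [72.0, 72.0, 72.0, 72.0, 72.0]
mean_gain, sd, t_stat = paired_summary(baseline, method)
print(f"gain = {mean_gain:.2f} +/- {sd:.2f} pp, paired t = {t_stat:.2f}, df = {len(baseline) - 1}")
```

Pairing by seed matters because both systems share the same data splits; an unpaired test would ignore that shared variation.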

Circularity Check

0 steps flagged

No significant circularity in derivation chain

Full rationale

The paper defines GoD via anatomy graphs, soft node correspondence, per-node differences, and a graph-level alignment loss that anchors to backbone features; these are design choices and training objectives, not self-definitions or fitted inputs renamed as predictions. Reported Rank-1 gains (+7.1 pp fundus, +3.1 pp CXR) and zero-shot transfers are empirical measurements against a frozen-backbone baseline, not reductions by construction. No self-citation load-bearing steps, uniqueness theorems, or ansatz smuggling appear in the abstract or described method. The derivation remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review prevents identification of specific fitted parameters or axioms; no explicit free parameters, axioms, or invented entities are stated.

pith-pipeline@v0.9.1-grok · 5742 in / 1050 out tokens · 20584 ms · 2026-06-26T14:34:43.615366+00:00 · methodology


Reference graph

Works this paper leans on

20 extracted references · 9 canonical work pages

  1. [1]

    Peking University International Competition on Ocular Disease Intelligent Recognition (ODIR-2019). https://odir2019.grand-challenge.org/ (2019), Grand Challenge dataset and competition on ocular disease classification

  2. [2]

    Decencière, E., Zhang, X., Cazuguel, G., Lay, B., Cochener, B., Trone, C., Gain, P., Ordonez, R., Massin, P., Erginay, A., Charton, B., Klein, J.C.: Feedback on a publicly distributed image database: The MESSIDOR database. Image Analysis & Stereology 33(3), 231–234 (August 2014). https://doi.org/10.5566/ias.1155

  3. [3]

    Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., Uszkoreit, J., Houlsby, N.: An image is worth 16x16 words: Transformers for image recognition at scale. In: International Conference on Learning Representations (ICLR) (2021)

  4. [4]

    Gaggion, N., Mosquera, C., Aineseder, M., Mansilla, L., Milone, D., Ferrante, E.: CheXmask database: A large-scale dataset of anatomical segmentation masks for chest x-ray images. PhysioNet (January 2025). https://doi.org/10.13026/3705-zg36, version 1.0.0

  5. [5]

    Ganz, J., Ammeling, J., Jabari, S., Breininger, K., Aubreville, M.: Re-identification from histopathology images. Medical Image Analysis 99, 103335 (2025). https://doi.org/10.1016/j.media.2024.103335

  6. [6]

    Geirhos, R., Jacobsen, J.H., Michaelis, C., Zemel, R., Brendel, W., Bethge, M., Wichmann, F.A.: Shortcut learning in deep neural networks. Nature Machine Intelligence 2(11), 665–673 (2020). https://doi.org/10.1038/s42256-020-00257-z

  7. [7]

    Heinrich, A.: Automatic personal identification using a single CT image. European Radiology 35(5), 2422–2433 (2024). https://doi.org/10.1007/s00330-024-11013-x

  8. [8]

    Hermans, A., Beyer, L., Leibe, B.: In defense of the triplet loss for person re-identification. arXiv preprint arXiv:1703.07737 (2017). https://doi.org/10.48550/arXiv.1703.07737, https://arxiv.org/abs/1703.07737

  9. [9]

    Ilse, M., Tomczak, J.M., Welling, M.: Attention-based deep multiple instance learning. CoRR abs/1802.04712 (2018), http://arxiv.org/abs/1802.04712

  10. [10]

    OpenReview

    Irvin, J., Rajpurkar, P., Ko, M., Yu, Y., Ciurea-Ilcus, S., Chute, C., Marklund, H., Haghgoo, B., Ball, R., Shpanskaya, K., Seekins, J., Mong, D.A., Halabi, S.S., Sandberg, J.K., Jones, R., Larson, D.B., Langlotz, C.P., Patel, B.N., Lungren, M.P., Ng, A.Y.: CheXpert: A large chest radiograph dataset with uncertainty labels and ex...

  11. [11]

    Macpherson, M.S., Hutchinson, C.E., Horst, C., Goh, V., Montana, G.: Patient reidentification from chest radiographs: An interpretable deep metric learning approach and its applications. Radiology: Artificial Intelligence 5(6) (2023). https://doi.org/10.1148/ryai.230019

  12. [12]

    Manesco, J.R.R., Jodas, D., Zanella, M.J.G., Santos, M.K., Papa, J.P.: Graph feature embeddings for patient re-identification from chest x-ray images. In: 2024 37th SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI). pp. 1–6. IEEE (2024)

  13. [13]

    Nauta, M., Trienes, J., Pathak, S., Nguyen, E., Peters, M., Schmitt, Y., Schlötterer, J., van Keulen, M., Seifert, C.: From anecdotal evidence to quantitative evaluation methods: A systematic review on evaluating explainable AI. ACM Computing Surveys 55(13s), 1–42 (2023). https://doi.org/10.1145/3583558

  14. [14]

    Packhäuser, K., Gündel, S., Münster, N., Syben, C., Christlein, V., Maier, A.: Deep learning-based patient re-identification is able to exploit the biometric nature of medical chest x-ray data. Scientific Reports 12(1) (2022). https://doi.org/10.1038/s41598-022-19045-3

  15. [15]

    Qi, C.R., Yi, L., Su, H., Guibas, L.J.: PointNet++: Deep hierarchical feature learning on point sets in a metric space. In: Advances in Neural Information Processing Systems (NeurIPS). vol. 30 (2017)

  16. [16]

    In: IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2025, Nashville, TN, USA, June 11-15, 2025

    Tian, Y., Ji, K., Zhang, R., Jiang, Y., Li, C., Wang, X., Zhai, G.: Towards all-in-one medical image re-identification. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 30774–30786 (2025). https://doi.org/10.1109/CVPR52734.2025.02866

  17. [17]

    Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Advances in Neural Information Processing Systems (NeurIPS). vol. 30 (2017)

  18. [18]

    Veličković, P., Cucurull, G., Casanova, A., Romero, A., Liò, P., Bengio, Y.: Graph attention networks. In: International Conference on Learning Representations (ICLR) (2018)

  19. [19]

    Wang, X., Peng, Y., Lu, L., Lu, Z., Bagheri, M., Summers, R.M.: ChestX-ray8: Hospital-scale chest x-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). pp. 2097–2106 (2017). https://doi.org/10.1109/CVPR.2017.369

  20. [20]

    Wang, Y., Sun, Y., Liu, Z., Sarma, S., Bronstein, M., Solomon, J.: Dynamic graph CNN for learning on point clouds. ACM Transactions on Graphics 38(5), 146:1–146:12 (January 2018). https://doi.org/10.1145/3326362