PhotIQA: A photoacoustic image data set with image quality ratings

Anna Breger; Carola-Bibiane Schönlieb; Clemens Karner; Ian Selby; Janek Gröhl; Jonathan Weir-McCall; Lara-Sophie Witt; Merle Duchêne; Thomas R Else; Tom Rix

arxiv: 2507.03478 · v2 · submitted 2025-07-04 · 📡 eess.IV · cs.CV

Pith reviewed 2026-05-19 06:23 UTC · model grok-4.3

classification 📡 eess.IV cs.CV

keywords: photoacoustic imaging, image quality assessment, dataset, expert ratings, full-reference IQA, medical imaging, PAI

The pith

The PhotIQA dataset supplies 1134 photoacoustic images with expert ratings on five quality properties to benchmark image quality assessment methods for medical imaging.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Photoacoustic imaging (PAI) requires solving two inverse problems, one acoustic and one optical, so its reconstructions carry both acoustic and optical artifacts that differ from the distortions seen in natural images. Standard full-reference IQA measures developed on natural scenes consequently perform inconsistently on PAI reconstructions. The authors assembled and released PhotIQA, a set of 1134 images rated by five experts on five distinct quality properties in a full-reference protocol. The dataset matters because it supplies the missing benchmark that lets researchers develop and validate IQA measures tailored to this multi-physics modality.

Core claim

The authors assembled and publicly released PhotIQA, a dataset consisting of 1134 photoacoustic images rated by five experts across five quality properties in a full-reference setting. The images and ratings are available on Zenodo to support development and testing of IQA measures for photoacoustic imaging and other applications where medical images require quality assessment.

What carries the argument

The PhotIQA dataset of photoacoustic images together with the five-expert ratings on five quality properties collected in a full-reference protocol.

If this is right

New IQA measures can be trained and tested directly against expert judgments on PAI data that contain both acoustic and optical artifacts.
The five-property rating scheme allows fine-grained evaluation rather than a single overall score.
Because the protocol is full-reference, the dataset can serve as ground truth for comparing reconstructed images to high-quality references.
Public release enables direct replication and extension by other groups working on medical image quality.
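A minimal sketch of how such a benchmark run might look, assuming the common practice of scoring an IQA measure by its Spearman rank correlation coefficient (SRCC) with the expert ratings, the statistic also used in the baseline comparison quoted further down this page. The function names and the toy numbers are hypothetical and are not taken from PhotIQA; tied values receive average ranks, which is one conventional choice rather than something the paper specifies.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def srcc(measure_scores, expert_ratings):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(measure_scores), average_ranks(expert_ratings))


# Hypothetical values for illustration only (not PhotIQA data).
iqa_scores = [0.91, 0.42, 0.77, 0.30, 0.65, 0.58]
ratings = [5, 2, 4, 1, 4, 3]
print(f"SRCC = {srcc(iqa_scores, ratings):.3f}")
```

Because SRCC depends only on rank order, it does not matter whether the IQA measure and the rating scale share units or range, which is why it is a natural first check for a new measure.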

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

If the ratings correlate with downstream task performance, they could guide optimization of reconstruction algorithms beyond current visual inspection.
The same expert-rating approach could be applied to other hybrid modalities that combine optical and acoustic information.
Future studies might test whether IQA measures calibrated on PhotIQA transfer to images from different PAI hardware or reconstruction methods.

Load-bearing premise

Expert ratings collected in the full-reference setting accurately capture the relevant quality properties of photoacoustic images and remain reproducible enough to serve as a reliable benchmark.

What would settle it

A second independent round of expert ratings on the same 1134 images that produces markedly different scores on the five quality properties would show the ratings are not stable enough to benchmark algorithms.

Figures

Figures reproduced from arXiv: 2507.03478 by Anna Breger, Carola-Bibiane Schönlieb, Clemens Karner, Ian Selby, Janek Gröhl, Jonathan Weir-McCall, Lara-Sophie Witt, Merle Duchêne, Thomas R Else, Tom Rix.

**Figure 1.** Two examples of images in PhotIQA: references (a) and the reconstructions from the described algorithms (b-d). Algorithm 1 (b) corrects a reconstructed PA image by using the light fluence obtained from simulations. Algorithms 2 and 3 (c-d) are deep-learning models trained to estimate the absorption coefficient.

**Figure 2.** The speedyIQA annotation app allows setting a task and rating cat...

**Figure 3.** Four examples of quality ratings from both annotators for the different detailed quality properties. (Top) reference image and (bottom) reconstructed, assessed image.

• "filename": File name of the distorted image file
• TASK+" 1": Ratings from the first expert
• TASK+" 2": Ratings from the second expert

**Figure 4.** Box plot of the absolute differences (top) and the absolute differences of z-scores (bottom) of both raters with the median (green line) and mean (striped green line).

**Figure 5.** The distribution of the ratings given by both experts for all quality properties.

... regarding the intensity values, as it directly computes the pixel-wise difference. In line with previous experiments, besides HaarPSI [17], MS-SSIM [27], IWSSIM [24], LPIPS [26], and GMSD [25] show promising behaviors.

**Figure 6.** Four examples of rating disagreements corresponding to box plot outliers in ...
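The two quantities plotted in Figure 4 can be sketched as follows, assuming two raters who scored the same images in the same order. Whether the authors standardize with the population or the sample standard deviation is not stated on this page; the population form is used here as an assumption, and all ratings below are made-up values, not PhotIQA data.

```python
from statistics import mean, median, pstdev


def z_scores(ratings):
    """Standardize one rater's scores (assumes the scores are not all equal)."""
    mu = mean(ratings)
    sigma = pstdev(ratings)  # population standard deviation (an assumption)
    return [(r - mu) / sigma for r in ratings]


def disagreement(rater_1, rater_2):
    """Per-image absolute differences, raw and after per-rater z-scoring."""
    raw = [abs(a - b) for a, b in zip(rater_1, rater_2)]
    z1, z2 = z_scores(rater_1), z_scores(rater_2)
    standardized = [abs(a - b) for a, b in zip(z1, z2)]
    return raw, standardized


# Hypothetical ratings for six images (illustration only).
first = [1, 2, 3, 4, 5, 3]
second = [2, 3, 4, 5, 5, 2]
raw, standardized = disagreement(first, second)
print("raw:          median", median(raw), "mean", round(mean(raw), 3))
print("standardized: median", round(median(standardized), 3),
      "mean", round(mean(standardized), 3))
```

Standardizing each rater separately removes a constant offset or a difference in scale use, so the z-score panel highlights images on which the raters disagree in relative terms rather than because one rater is systematically stricter.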

Original abstract

Image quality assessment (IQA) is crucial in the evaluation stage of novel algorithms operating on images, including traditional and machine learning based methods. Due to the lack of available quality-rated medical images, most commonly used full-reference IQA measures have been developed and tested for natural images. Reported pitfalls and inconsistencies arising when applying such measures for medical images are not surprising, as they rely on different properties than natural images. In photoacoustic imaging (PAI), especially, standard benchmarking approaches for assessing the quality of image reconstructions are lacking. PAI is a multi-physics imaging modality, in which two inverse problems have to be solved, which makes the application of IQA measures uniquely challenging due to both, acoustic and optical, artifacts. To support the development and testing of IQA measures we assembled PhotIQA, a data set consisting of 1134 photoacoustic images. The images were rated by five experts across five quality properties in a full-reference setting, where the detailed rating enables usage beyond PAI. The data set with the images and corresponding ratings is publicly available on Zenodo.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

PhotIQA releases a new set of 1134 rated photoacoustic images that fills a practical gap for IQA work in PAI, but the ratings lack any reported consistency checks.

The main takeaway is that the authors collected and released PhotIQA, a public dataset of 1134 photoacoustic images rated by five experts on five quality properties in a full-reference setup. This directly targets the known mismatch between natural-image IQA metrics and the acoustic plus optical artifacts that show up in PAI reconstructions. The release on Zenodo makes the images and scores available for others to use when testing or training new measures, which is a concrete step forward for this subfield.

The multi-property ratings also allow separate analysis of things like contrast or artifact visibility rather than a single overall score. That design choice is sensible and broadens potential use beyond just PAI. The paper does a straightforward job of stating the motivation and describing the collection process at a high level.

What stands out as a soft spot is the absence of any inter-rater agreement numbers. With only five experts, even modest variance in how they score the same image could make the ratings less stable as a benchmark. The description covers the rating task and properties but does not include ICC values, raw score distributions, or notes on training or tie resolution. That leaves open the question of whether the scores capture reproducible signal or expert-specific noise. Image selection criteria and acquisition details are also light in the available text, which matters for judging how representative the set is.

This work is aimed at researchers who build or evaluate IQA algorithms for medical or photoacoustic images and need a starting point with human ratings. Someone already working in PAI reconstruction would find the data immediately usable for comparison experiments.

I would send it to peer review. The dataset release itself is worth referee time, and reviewers can ask for the missing agreement statistics and curation details without changing the core contribution.

Referee Report

1 major / 2 minor

Summary. The manuscript presents PhotIQA, a publicly released dataset of 1134 photoacoustic images with quality ratings from five experts on five properties in a full-reference setting. The goal is to provide a benchmark for developing and testing IQA measures specifically for photoacoustic imaging, which faces unique challenges from acoustic and optical artifacts not addressed by natural image IQA methods.

Significance. If the ratings are shown to be reliable, this dataset would be a valuable contribution to medical imaging research by filling the gap in quality-rated PAI data. It supports the evaluation of reconstruction algorithms and could improve the applicability of IQA in clinical and research settings for multi-physics modalities.

major comments (1)

[Methods (Rating Protocol)] No inter-rater reliability statistics (e.g., ICC, Fleiss' kappa, or percentage agreement) are reported for the five experts across the five quality properties. This is load-bearing for the central claim that the dataset forms a reproducible benchmark for IQA development, as the absence of such metrics leaves open whether the ratings reflect consistent quality signal or idiosyncratic variance.
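For orientation, Fleiss' kappa, one of the statistics named in this comment, could be computed along these lines. This is a sketch under the assumption that every image is rated by the same number of raters on a categorical scale; the function name and the count table are hypothetical and say nothing about the actual PhotIQA ratings.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of raters
    who assigned image i to category j. Every row must sum to the same
    number of raters (at least two)."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    n_cats = len(counts[0])
    total = n_items * n_raters

    # Share of all assignments that fell into each category.
    p_cat = [sum(row[j] for row in counts) / total for j in range(n_cats)]

    # Observed agreement per image: fraction of rater pairs that agree.
    p_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]

    p_observed = sum(p_item) / n_items
    p_expected = sum(p * p for p in p_cat)  # agreement expected by chance
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical table: 4 images, 5 raters, 3 rating categories.
table = [
    [5, 0, 0],
    [1, 4, 0],
    [0, 2, 3],
    [0, 0, 5],
]
print(f"Fleiss' kappa = {fleiss_kappa(table):.3f}")
```

Fleiss' kappa treats the categories as unordered, so for an ordinal rating scale it ignores how far apart two disagreeing ratings are; that is one reason an ICC is often reported alongside it.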

minor comments (2)

[Abstract] The five quality properties are referenced but not named; explicitly listing them would immediately clarify the dataset's scope for potential users.
[Dataset Curation] Additional detail on image selection criteria and any diversity or representativeness checks would strengthen reproducibility claims without altering the core contribution.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback. We agree that inter-rater reliability metrics are essential to support the dataset as a reproducible benchmark and will incorporate them in the revision.

Referee: Methods (Rating Protocol): No inter-rater reliability statistics (e.g., ICC, Fleiss' kappa, or percentage agreement) are reported for the five experts across the five quality properties. This is load-bearing for the central claim that the dataset forms a reproducible benchmark for IQA development, as the absence of such metrics leaves open whether the ratings reflect consistent quality signal or idiosyncratic variance.

Authors: We agree that the absence of inter-rater reliability statistics weakens the claim of a reproducible benchmark. We have computed the intraclass correlation coefficient (ICC) using a two-way random-effects model for absolute agreement, along with Fleiss' kappa and percentage agreement, for each of the five quality properties. The ICC values range between 0.68 and 0.82, indicating moderate to good reliability. A new paragraph and table will be added to the Methods section reporting these statistics and the computation details. This directly addresses the concern and strengthens the manuscript.

revision: yes
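The two-way random-effects, absolute-agreement ICC mentioned in the response is often written ICC(2,1) when it refers to single ratings. A minimal sketch of that textbook formula follows, assuming a complete images-by-raters table with no missing scores. The function name and the toy scores are hypothetical, and the rebuttal does not say whether the single-rating or the average-rating form was used.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single rating.
    scores[i][j] is the rating of image i by rater j (complete n x k table)."""
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares of the two-way ANOVA decomposition.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # images
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # raters
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical scores: 4 images, 2 raters; rater 2 is always one point higher.
toy = [[1, 2], [2, 3], [3, 4], [4, 5]]
print(f"ICC(2,1) = {icc_2_1(toy):.3f}")
```

In the toy table the second rater scores every image exactly one point higher, so the raters rank the images identically, yet the absolute-agreement form still falls below 1 because it penalizes the systematic offset between raters.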

Circularity Check

0 steps flagged

No circularity: dataset release with no derivations or fitted claims

The paper's contribution is the assembly and public release of the PhotIQA dataset of 1134 images with expert ratings on five quality properties. No equations, derivations, predictions, or fitted parameters are present in the abstract or described content. The central claim is an empirical resource release that does not reduce to any self-referential construction, fitted input renamed as prediction, or self-citation load-bearing argument. Expert ratings are collected and described without quantitative claims that could be circular by construction. This is a standard non-circular data paper.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The paper is a data-release contribution with no mathematical derivations, physical models, or fitted parameters; the only implicit premises are standard assumptions about expert rating reliability and image representativeness for the PAI domain.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "To support the development and testing of IQA measures we assembled PhotIQA, a data set consisting of 1134 photoacoustic images. The images were rated by five experts across five quality properties in a full-reference setting"

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "baseline experiments show that HaarPSI_med significantly outperforms SSIM in correlating with the quality ratings (SRCC: 0.83 vs. 0.62)"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

26 extracted references · 26 canonical work pages

[1] Assi, H., Cao, R., et al.: A review of a strategic roadmapping exercise to advance clinical translation of photoacoustic imaging: From current barriers to future adoption. Photoacoustics 32, 100539 (2023). https://doi.org/10.1016/j.pacs.2023.100539

[2] Athar, S., Wang, Z.: A comprehensive performance evaluation of image quality assessment algorithms. IEEE Access 7, 140030–140070 (09 2019). https://doi.org/10.1109/ACCESS.2019.2943319

[3] Breger, A., Biguri, A., Landman, M.S., Selby, I., Amberg, N., Brunner, E., Gröhl, J., Hatamikia, S., Karner, C., Ning, L., Dittmer, S., Roberts, M., Schönlieb, C.B., Collaboration, A.C.: A study of why we need to reassess full reference image quality assessment with medical images. Journal of Imaging Informatics in Medicine (2025). https://doi.org/10.1007/s10278-025-01462-1

[4] Breger, A., Karner, C., Selby, I., Gröhl, J., Dittmer, S., Lilley, E., Babar, J., Beckford, J., Sadler, T.J., Shahipasand, S., Thavakumar, A., Roberts, M., Schönlieb, C.B.: A study on the adequacy of common iqa measures for medical images. In: Proceedings of 2024 International Conference on Medical Imaging and Computer-Aided Diagnosis (MICAD), Springer Lecture Notes in Electrical Engineering (2024)

[5] Chow, L.S., Rajagopal, H., Paramesran, R.: Correlation between subjective and objective assessment of magnetic resonance (mr) images. Magn Reson Imaging 34(6), 820–831 (Jul 2016). https://doi.org/10.1016/j.mri.2016.03.006

[6] Cox, B., Laufer, J.G., Arridge, S.R., Beard, P.C.: Quantitative spectroscopic photoacoustic imaging: a review. J Biomed Opt 17(6), 061202 (Jun 2012). https://doi.org/10.1117/1.JBO.17.6.061202

[7] Else, T.R., Loreno, C., Groves, A., Cox, B.T., Gröhl, J., Modolell, I., Bohndiek, S.E., Roshan, A.: The confounding effects of skin colour in photoacoustic imaging. medRxiv pp. 2025–03 (2025)

[8] Gröhl, J., Else, T.R., Hacker, L., Bunce, E.V., Sweeney, P.W., Bohndiek, S.E.: Moving beyond simulation: data-driven quantitative photoacoustic imaging using tissue-mimicking phantoms. IEEE Trans Med Imaging PP (Nov 2023). https://doi.org/10.1109/TMI.2023.3331198

[9] Gröhl, J., Kunyansky, L., Poimala, J., Else, T.R., Di Cecio, F., Bohndiek, S.E., Cox, B.T., Hauptmann, A.: Digital twins enable full-reference quality assessment of photoacoustic image reconstructions. arXiv preprint arXiv:2505.24514 (2025)

[10] Gröhl, J., Schellenberg, M., Dreher, K., Maier-Hein, L.: Deep learning for biomedical photoacoustic imaging: A review. Photoacoustics 22, 100241 (2021)

[11] Karner, C., Gröhl, J., Selby, I., Babar, J., Beckford, J., Else, T.R., Sadler, T.J., Shahipasand, S., Thavakumar, A., Roberts, M., Rudd, J.H., Schönlieb, C.B., Weir-McCall, J.R., Breger, A.: Parameter choices in haarpsi for iqa with medical images. In: 2025 IEEE 22nd International Symposium on Biomedical Imaging (ISBI). pp. 1–5 (2025). https://doi.org/10.1109/isbi60581.2025.10981227

[12] Kastryulin, S., Zakirov, J., Pezzotti, N., Dylov, D.V.: Image quality assessment for magnetic resonance imaging. IEEE Access 11, 14154–14168 (2023). https://doi.org/10.1109/ACCESS.2023.3243466

[13] Lee, W., Wagner, F., Galdran, A., Shi, Y., Xia, W., Wang, G., Mou, X., Ahamed, M.A., Imran, A.A.Z., Oh, J.E., Kim, K., Baek, J.T., Lee, D., Hong, B., Tempelman, P., Lyu, D., Kuiper, A., van Blokland, L., Calisto, M.B., Hsieh, S., Han, M., Baek, J., Maier, A., Wang, A., Gold, G.E., Choi, J.H.: Low-dose computed tomography perceptual image quality asses... Medical Image Analysis 99, 103343 (2025). https://doi.org/10.1016/j.media.2024.103343

[14] Ohashi, K., Nagatani, Y., Yoshigoe, M., Iwai, K., Tsuchiya, K., Hino, A., Kida, Y., Yamazaki, A., Ishida, T.: Applicability evaluation of full-reference image quality assessment methods for computed tomography images. Journal of Digital Imaging 36(6), 2623–2634 (2023). https://doi.org/10.1007/s10278-023-00875-0

[15] Park, J., Choi, S., Knieling, F., Clingman, B., Bohndiek, S., Wang, L.V., Kim, C.: Clinical translation of photoacoustic imaging. Nature Reviews Bioengineering 3(3), 193–212 (Mar 2025). https://doi.org/10.1038/s44222-024-00240-y

[16] Pickering, J.W., Prahl, S.A., van Wieringen, N., Beek, J.F., Sterenborg, H.J.C.M., van Gemert, M.J.C.: Double-integrating-sphere system for measuring the optical properties of tissue. Appl. Opt. 32(4), 399–410 (Feb 1993). https://doi.org/10.1364/AO.32.000399

[17] Reisenhofer, R., Bosse, S., Kutyniok, G., Wiegand, T.: A haar wavelet-based perceptual similarity index for image quality assessment. Signal Process. Image Commun. 61, 33–43 (2018). https://doi.org/10.1016/j.image.2017.11.001

[18] Renieblas, G.P., Nogués, A.T., González, A.M., Gómez-Leon, N., Del Castillo, E.G.: Structural similarity index family for image quality assessment in radiological images. J Med Imaging (Bellingham) 4(3), 035501 (Jul 2017). https://doi.org/10.1117/1.JMI.4.3.035501

[19] Rietberg, M.T., Gröhl, J., Else, T.R., Bohndiek, S.E., Manohar, S., Cox, B.T.: Artifacts in photoacoustic imaging: Origins and mitigations. arXiv preprint arXiv:2504.12772 (2025)

[20] Selby, I.: Github repository speedyiqa (March 2024), https://github.com/selbs/speedy_iqa

[21] Sheikh, H., Sabir, M., Bovik, A.: A statistical evaluation of recent full reference image quality assessment algorithms. IEEE Transactions on Image Processing 15(11), 3440–3451 (2006). https://doi.org/10.1109/TIP.2006.881959

[23] Wang, Z., Bovik, A., Sheikh, H., Simoncelli, E.: Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing 13(4), 600–612 (2004). https://doi.org/10.1109/TIP.2003.819861

[24] Wang, Z., Li, Q.: Information content weighting for perceptual image quality assessment. IEEE Transactions on Image Processing 20(5), 1185–1198 (2011). https://doi.org/10.1109/TIP.2010.2092435

[25] Xue, W., Zhang, L., Mou, X., Bovik, A.C.: Gradient magnitude similarity deviation: A highly efficient perceptual image quality index. IEEE Transactions on Image Processing 23(2), 684–695 (2014). https://doi.org/10.1109/TIP.2013.2293423

[26] Zhang, R., Isola, P., Efros, A.A., Shechtman, E., Wang, O.: The unreasonable effectiveness of deep features as a perceptual metric. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 586–595 (2018). https://doi.org/10.1109/CVPR.2018.00068

[27] Zhou Wang, E.P.S., Bovik, A.C.: Multi-scale structural similarity for image quality assessment. In: Proceedings of the 37th IEEE Asilomar Conference on Signals, Systems and Computers, Pacific Grove, CA (2003)