Developing an App to interpret Chest X-rays to support the diagnosis of respiratory pathology with Artificial Intelligence

Andrew Elkins; Felipe F. Freitas; Veronica Sanz

arxiv: 1906.11282 · v1 · pith:M74DSSAWnew · submitted 2019-06-26 · 💻 cs.CV · physics.med-ph

Pith reviewed 2026-05-25 15:26 UTC · model grok-4.3

classification 💻 cs.CV physics.med-ph

keywords smartphone app · artificial neural network · chest X-ray · respiratory pathology · mobile deployment · medical diagnosis · machine learning

The pith

A smartphone app using an artificial neural network is developed to interpret chest X-rays for respiratory diagnosis.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper aims to improve access to diagnosis in remote areas by developing machine learning methods that run on mobile devices for analyzing X-ray images. It describes the creation of a smartphone application powered by an artificial neural network to assist physicians in identifying life-threatening respiratory conditions. The work leverages portable AI environments to make this possible without relying on traditional medical infrastructure. A sympathetic reader would see this as a practical step toward bringing image-based diagnostics to underserved locations via everyday hardware.

Core claim

The authors develop new machine learning methodologies for mobile deployment and present a smartphone app that uses an artificial neural network to interpret chest X-ray images, thereby assisting physicians with the early diagnosis of respiratory pathologies in regions where good quality medical services may be lacking.

What carries the argument

The smartphone app incorporating an artificial neural network for on-device analysis of chest X-ray images.

If this is right

Physicians gain assistance in diagnosing respiratory conditions from X-ray images without needing advanced on-site equipment.
Early detection of life-threatening conditions becomes feasible in remote areas through portable devices.
Machine learning models for medical imaging can be adapted to fast and portable environments on smartphones.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The approach could allow initial screening by non-radiologists before specialist review.
Linking the app to cloud services might allow updates to the model without full redeployment.
Broader testing across varied X-ray equipment and patient populations would clarify real-world limits.

Load-bearing premise

The neural network model can be deployed effectively on mobile devices and provide useful diagnostic assistance.

What would settle it

A direct test of the completed app on standard smartphones using a set of chest X-rays with known diagnoses, measuring whether its outputs match expert interpretations at a usable rate.
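A minimal sketch of how such a test could be scored, assuming the app's output and the expert reading are both reduced to a binary flag per image (1 = finding present). The function name `screening_metrics` and the toy labels are hypothetical and are not results from the paper.

```python
from typing import Sequence


def screening_metrics(expert: Sequence[int], app: Sequence[int]) -> dict:
    """Compare binary app outputs with expert labels (1 = finding present)."""
    if len(expert) != len(app):
        raise ValueError("expert and app must have the same length")
    pairs = list(zip(expert, app))
    tp = sum(1 for e, a in pairs if e == 1 and a == 1)  # both flag the finding
    tn = sum(1 for e, a in pairs if e == 0 and a == 0)  # both say it is absent
    fp = sum(1 for e, a in pairs if e == 0 and a == 1)  # app flags, expert does not
    fn = sum(1 for e, a in pairs if e == 1 and a == 0)  # app misses the finding
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        "agreement": (tp + tn) / len(pairs) if pairs else float("nan"),
    }


# Hypothetical toy data for illustration only.
expert_labels = [1, 1, 0, 0, 1, 0, 0, 0]
app_outputs = [1, 0, 0, 0, 1, 1, 0, 0]
metrics = screening_metrics(expert_labels, app_outputs)
```

What counts as a "usable rate" is a clinical judgment; the sketch only reports the raw rates.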

Figures

Figures reproduced from arXiv: 1906.11282 by Andrew Elkins, Felipe F. Freitas, Veronica Sanz.

**Figure 1.** Random transformations applied to one selected image. The random rotation and change in brightness account for the variations in image quality and arrangement one can encounter in medical facilities in remote areas.

**Figure 2.** Extract of the label dataframe for the multi-label classification task. Fast.AI can handle multi-label classification targets if the image names are given in a properly structured way, which we provide in a new CSV file. The one-hot encoding represents the presence (1) or absence (0) of a particular disease in the list: Atelectasis, Cardiomegaly, Effusion, Infiltration, Mass, Nodule, Pneumonia, Pneumothorax, Consoli…
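The one-hot encoding described in the Figure 2 caption can be sketched as follows. This is an illustration under stated assumptions: the `LABELS` list is only a subset of the names visible in the caption (the full list is truncated there), and the file names and the `|` separator between findings are hypothetical.

```python
# Illustrative subset of the disease names that appear in the caption.
LABELS = ["Atelectasis", "Cardiomegaly", "Effusion", "Infiltration"]


def one_hot(findings: str, labels=LABELS, sep: str = "|") -> list:
    """Turn a separator-joined string of findings into a 0/1 vector."""
    present = {f.strip() for f in findings.split(sep) if f.strip()}
    return [1 if name in present else 0 for name in labels]


# Hypothetical rows: image name -> findings string ("" means no finding).
rows = {
    "img_0001.png": "Effusion|Atelectasis",
    "img_0002.png": "Cardiomegaly",
    "img_0003.png": "",
}
encoded = {name: one_hot(findings) for name, findings in rows.items()}
```

Each image can carry several 1s at once, which is what makes the task multi-label rather than multi-class.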

**Figure 3.** Number of co-occurrences for each disease in the dataset. Note, for example, that Infiltration is always co-occurrent with some other disease label, and that some diseases often come together, e.g. Effusion and Atelectasis. In our loss function we use the weights from each class calculated according to the scikit-learn class weight method, defined as: $\frac{n_{\mathrm{samples}}}{n_{\mathrm{classes}} \cdot [n_1, n_2, \ldots]}$ (1.1) Where nsample…
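The class-weight formula in the Figure 3 caption can be worked through with a small example. This is a sketch, not the authors' code: it assumes each sample carries exactly one label, so that the sample total equals the sum of the per-class counts, and the counts themselves are made up.

```python
def balanced_class_weights(counts):
    """Weights n_samples / (n_classes * n_i) for per-class counts n_i.

    Assumes every count is positive and each sample has exactly one label.
    """
    n_samples = sum(counts)
    n_classes = len(counts)
    return [n_samples / (n_classes * n) for n in counts]


# Hypothetical per-class counts, for illustration only.
counts = [600, 300, 100]
weights = balanced_class_weights(counts)
```

Rarer classes receive larger weights, so their errors contribute more to the loss.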

**Figure 4.** DenseNet architectures for ImageNet. Note that each "conv" layer shown in the table corresponds to the sequence: Batch Normalization, Rectified Linear Unit and Convolutional Layer (BN-ReLU-Conv).

**Figure 5.** Fit one cycle method. Left figure: this shows the variation of the learning rate over the number of iterations, with the vertical axis showing the learning rate changing between 1e-4 and 3e-3 and the horizontal axis showing all the iterations (i.e. epochs). Right figure: here we show the change in momentum rate over number of iterations. In these figures we can see the change in values of the learning rate …
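A one-cycle schedule of the kind shown in Figure 5 can be sketched in a few lines. Only the learning-rate range (1e-4 to 3e-3) comes from the caption; the cosine shape, the 30% warm-up fraction, and the momentum range (0.95 to 0.85) are assumptions chosen for illustration and may differ from what the authors used.

```python
import math


def cosine_interp(start, end, frac):
    """Cosine interpolation from start (frac=0) to end (frac=1)."""
    return end + (start - end) * (1 + math.cos(math.pi * frac)) / 2


def one_cycle(step, total_steps, lr_min=1e-4, lr_max=3e-3,
              mom_max=0.95, mom_min=0.85, warmup_frac=0.3):
    """Return (learning_rate, momentum) for step in [0, total_steps - 1]."""
    warmup_steps = max(1, int(round(total_steps * warmup_frac)))
    if step < warmup_steps:
        # Warm-up: learning rate rises while momentum falls.
        frac = step / warmup_steps
        return (cosine_interp(lr_min, lr_max, frac),
                cosine_interp(mom_max, mom_min, frac))
    # Annealing: learning rate falls back while momentum recovers.
    frac = (step - warmup_steps) / max(1, total_steps - 1 - warmup_steps)
    return (cosine_interp(lr_max, lr_min, frac),
            cosine_interp(mom_min, mom_max, frac))


schedule = [one_cycle(s, 100) for s in range(100)]
peak_lr, momentum_at_peak = max(schedule, key=lambda pair: pair[0])
```

The momentum moves opposite to the learning rate, which matches the mirrored shapes of the two panels described in the caption.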

**Figure 6.** Effect of the different choices for the weight decay (WD) in the loss change (vertical axis) for a given learning rate (horizontal axis) for the DenseNet-121 CNN without class weights (left panel) and with class weights (right panel). Note that the loss is relatively insensitive to the choice of WD, which is a sign of stability of our model. With small values for the WD one can choose large values for the …

**Figure 7.** Precision-Recall (top) and ROC (bottom) curves for our model in the Pneumothorax vs All case. There are two common representations of the goodness of the algorithm. One is the precision-recall curve, shown in…

**Figure 8.** Grad-CAM for a Pneumothorax item (left) and no Pneumothorax item (right) class. The regions in red show the areas which activate more units (neurons) in the last convolutional layer before the classification. The regions displayed in yellow to red colours highlight the spatial location of the features which activate more intensely the last convolutional layer before the classification. As one can see, in …

**Figure 9.**

**Figure 10.** ROC curve for each of 14 classes and their respective AUC. The highest AUC is for Cardiomegaly (AUC=85%) and the lowest is for Mass (AUC=56%); for Pneumothorax we obtain AUC=74%. We show the precision-recall curves for the average and for each class, respectively. The average precision (AP) summarizes the precision-recall plots as the weighted mean of precisions achieved at each threshold, with the…
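For readers who want to see what the per-class AUC in Figure 10 measures, here is a minimal sketch using the rank interpretation of AUC: the probability that a randomly chosen positive example scores higher than a randomly chosen negative one. The function name `roc_auc` and the toy scores are hypothetical; the pairwise loop is quadratic and meant only for small illustrations.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical scores for one class, for illustration only.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.1]
auc = roc_auc(labels, scores)
```

An AUC of 0.5 corresponds to random ranking, which puts the reported 56% for Mass close to chance.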

**Figure 11.** Precision-Recall curve considering micro-average, i.e. an aggregate of the contributions from each class and average over all classes. In a multi-class classification, micro-average is preferable to macro-average for class imbalance.
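The difference between micro- and macro-averaging mentioned in the Figure 11 caption is easiest to see on a toy case. The sketch below computes both averages of precision from per-class true-positive and false-positive counts; the class names and counts are invented for illustration.

```python
def micro_macro_precision(per_class):
    """per_class maps class name -> (true_positives, false_positives)."""
    total_tp = sum(tp for tp, _ in per_class.values())
    total_fp = sum(fp for _, fp in per_class.values())
    # Micro: pool all decisions first, then compute a single precision.
    micro = total_tp / (total_tp + total_fp)
    # Macro: compute precision per class, then take the unweighted mean.
    macro = sum(tp / (tp + fp) for tp, fp in per_class.values()) / len(per_class)
    return micro, macro


# Hypothetical counts: one frequent class and one rare class.
counts = {"common": (90, 10), "rare": (1, 9)}
micro, macro = micro_macro_precision(counts)
```

Here the micro-average is dominated by the frequent class, while the macro-average gives the rare class equal say, which is why the two can diverge sharply on imbalanced data.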

**Figure 12.** Precision-Recall curves for each of the 14 classes and their respective areas. The areas are calculated using the average precision score considering the weights of each class. The iso-curves show the F1 scores in the Precision-Recall plane.

**Figure 13.** Original image (left), Guided Grad-CAM (center) and precision-recall plot (right) for Atelectasis class. The Grad-CAM shows that the CNN is getting very hot spots (regions in red) in the left lung and at the bottom of the right lung. The PR curve for this class gives AP = 0.31.

**Figure 14.** Original image (left), Guided Grad-CAM (center) and precision-recall plot (right) for Cardiomegaly class. The Grad-CAM shows that the CNN is getting a very hot spot in the bottom left region. The PR curve for this class gives AP = 0.21.

**Figure 15.** Original image (left), Guided Grad-CAM (center) and precision-recall plot (right) for Effusion class. The Grad-CAM shows that the CNN is getting a very hot spot at the right lung. The PR curve for this class gives AP = 0.42.

**Figure 16.** Original image (left), Guided Grad-CAM (center) and precision-recall plot (right) for Emphysema class. The Grad-CAM shows that the CNN is getting hot spots in both lungs. The PR curve for this class gives AP = 0.07.

**Figure 17.** Original image (left), Guided Grad-CAM (center) and precision-recall plot (right) for Hernia class. The Grad-CAM shows that the CNN is getting a very hot region on both lungs. The PR curve for this class gives AP = 0.01.

**Figure 18.** Original image (left), Guided Grad-CAM (center) and precision-recall plot (right) for Infiltration class. The Grad-CAM shows that the CNN is getting a very hot region on both lungs, same as the Hernia case. The PR curve for this class gives AP = 0.46.

**Figure 19.** Original image (left), Guided Grad-CAM (center) and precision-recall plot (right) for Mass class. The Grad-CAM shows that the CNN is getting a very hot region on the left lung. The PR curve for this class gives AP = 0.13.

**Figure 20.** Original image (left), Guided Grad-CAM (center) and precision-recall plot (right) for Nodule class. The Grad-CAM shows that the CNN is getting a hot region on the left and right lungs. The PR curve for this class gives AP = 0.13.

**Figure 21.** Original image (left), Guided Grad-CAM (center) and precision-recall plot (right) for Pneumonia class. The Grad-CAM shows that the CNN is getting a hot region on the bottom right lung and some warm places all around the lungs and heart. The PR curve for this class gives AP = 0.03.

**Figure 22.** Original image (left), Guided Grad-CAM (center) and precision-recall plot (right) for Pneumothorax class. The Grad-CAM shows that the CNN is getting a hot region on both lungs. The PR curve for this class gives AP = 0.17.

**Figure 23.** Top figures: PR and ROC for unweighted loss function. The ROC curve is insensitive to changes in the class distribution, while the PR curve can show the effects of class imbalance. Lower figures: The ROC curve degrades with respect to the previous case (unweighted), whereas the PR shows a more stable behaviour when the recall is increased. The effects of class weights are noticeable in both metrics. We notice…

Original abstract

In this paper we present our work to improve access to diagnosis in remote areas where good quality medical services may be lacking. We develop new Machine Learning methodologies for deployment onto mobile devices to help the early diagnosis of a number of life-threatening conditions using X-ray images. By using the latest developments in fast and portable Artificial Intelligence environments, we develop a smartphone app using an Artificial Neural Network to assist physicians in their diagnostic.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

This is a high-level project description with no data, model details, or validation to support the diagnostic app claim.

Hey, the core issue here is that the paper announces development of a smartphone app using an ANN for chest X-ray diagnosis but supplies none of the evidence needed to evaluate whether it works. It stays at the level of a project summary rather than a completed piece of research. The goal of improving access to diagnosis in remote areas is reasonable and the choice to target mobile deployment makes sense for that setting. Beyond the intent, though, nothing stands out as new. The text relies on existing neural network techniques without describing any custom architecture, training approach, or fresh insight into mobile medical imaging. The soft spots are substantial and central. There is no mention of the dataset source or size, the specific model, training procedure, or any performance numbers such as accuracy, sensitivity, or AUC on held-out images. No evidence is given that inference runs acceptably on phone hardware either. Without those elements the claim that the app assists physicians remains unsupported. A reader hunting for practical mobile health ideas might skim it for the high-level direction, but anyone needing reproducible methods or verifiable results will find little to use. I would not bring this to a reading group or cite it. It does not show the technical grounding or empirical content that would justify sending it to peer review.

Referee Report

2 major / 0 minor

Summary. The paper claims to develop a smartphone app using an Artificial Neural Network to interpret chest X-rays and assist physicians in diagnosing respiratory pathologies, with the aim of improving access to diagnosis in remote areas.

Significance. A validated mobile AI system for chest X-ray interpretation could have practical value for underserved regions, but the manuscript supplies no datasets, architectures, training procedures, or performance metrics, so the work does not advance the field.

major comments (2)

The abstract asserts that an ANN-based smartphone app has been developed to assist diagnosis, yet the manuscript reports neither the source or size of any X-ray training/validation data, the model architecture, the training procedure, nor any quantitative results (accuracy, sensitivity, specificity, or AUC) on held-out data. This absence is load-bearing for the central claim.
No evidence is provided that the model can run inference acceptably on mobile hardware or that the app delivers useful diagnostic assistance, so the premise that the system supports physicians remains unsupported.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for the comments. The manuscript presents a high-level description of a proposed mobile AI system for chest X-ray interpretation and does not contain the requested implementation details or validation results. We will revise to align claims with the actual content provided.

Referee: The abstract asserts that an ANN-based smartphone app has been developed to assist diagnosis, yet the manuscript reports neither the source or size of any X-ray training/validation data, the model architecture, the training procedure, nor any quantitative results (accuracy, sensitivity, specificity, or AUC) on held-out data. This absence is load-bearing for the central claim.

Authors: We agree the manuscript contains none of these elements. The text describes the overall goal and approach at a conceptual level without reporting experiments. We will revise the abstract, claims of development, and conclusions to present the work as a proposed methodology and planned app rather than a completed system with results. revision: yes

Referee: No evidence is provided that the model can run inference acceptably on mobile hardware or that the app delivers useful diagnostic assistance, so the premise that the system supports physicians remains unsupported.

Authors: We agree no such evidence or benchmarks appear in the manuscript. The submission does not address mobile runtime performance or clinical utility studies. We will revise the language to describe these as intended future steps and remove any implication of current physician support. revision: yes

standing simulated objections not resolved

The manuscript does not include any datasets, model architecture, training details, performance metrics, or mobile inference results, so these cannot be supplied.

Circularity Check

0 steps flagged

No derivation chain or fitted parameters present; high-level project description only.

full rationale

The manuscript is a brief project overview stating the intent to develop an ANN-based smartphone app for chest X-ray diagnosis. No equations, model architectures, training procedures, performance metrics, datasets, or self-citations appear in the provided text. No 'prediction' or 'first-principles result' is claimed that could reduce to its inputs by construction. The absence of any load-bearing derivation means the circularity patterns (self-definitional, fitted-input-called-prediction, etc.) do not apply. This is the expected honest non-finding for a non-technical project summary.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No free parameters, axioms, or invented entities are mentioned or required by the abstract-level description.

Reference graph

Works this paper leans on

12 extracted references · 12 canonical work pages · 3 internal anchors

[1] Ker, Justin, et al. Deep learning applications in medical image analysis. IEEE Access 6 (2018): 9375-9389. https://ieeexplore.ieee.org/abstract/document/8241753/

[2] See https://stanfordmlgroup.github.io/projects/chexnet/ and the paper on arXiv: https://arxiv.org/abs/1711.05225

[3] See https://docs.fast.ai and documentation in these webpages

[4] See https://ml-xray.herokuapp.com for a beta version

[5] See https://github.com/FFFreitas/X-ray-and-ML

[6] Wang, Xiaosong, Peng, Yifan, Lu, Le, Lu, Zhiyong, Bagheri, Mohammadhadi, and Summers, Ronald M, ChestX-ray8: Hospital-scale Chest X-ray Database and Benchmarks on Weakly-Supervised Classification and Localization of Common Thorax Diseases, arXiv preprint arXiv:1705.02315, (2017)

[7] Leslie N Smith, Cyclical learning rates for training neural networks, In Applications of Computer Vision (WACV), 2017 IEEE Winter Conference, IEEE, (2017) 464–472

[8] Paulius Micikevicius and Sharan Narang and Jonah Alben and Gregory Frederick Diamos and Erich Elsen and David García and Boris Ginsburg and Michael Houston and Oleksii Kuchaiev and Ganesh Venkatesh and Hao Wu, Mixed Precision Training, CoRR, abs/1710.03740, (2018)

[9] Diederik P. Kingma, Jimmy Ba, Adam: A Method for Stochastic Optimization, 3rd ICLR, (2015)

[10] Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov, Dropout: A simple way to prevent neural networks from overfitting, The Journal of Machine Learning Research, 15(1), (2014) 1929

[11] Nielsen, Frank; Sun, Ke, Guaranteed bounds on the Kullback-Leibler divergence of univariate mixtures using piecewise log-sum-exp inequalities, CoRR, 18(12), (2016) 442

[12] http://openaccess.thecvf.com/content_ICCV_2017/papers/Selvaraju_Grad-CAM_Visual_Explanations_ICCV_2017_paper.pdf
