Recognition: 2 theorem links · Lean Theorem
Stress Estimation in Elderly Oncology Patients Using Visual Wearable Representations and Multi-Instance Learning
Pith reviewed 2026-05-10 17:43 UTC · model grok-4.3
The pith
Wearable sensor data enables moderate prediction of perceived stress in elderly breast cancer patients.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Transforming multimodal wearable streams into heterogeneous visual representations, embedding them with a pretrained Tiny-BioMoE into 192-dimensional vectors, and aggregating via attention-based multiple instance learning enables prediction of Perceived Stress Scale scores that show moderate agreement with questionnaire results (R²=0.24 at month 3 and R²=0.28 at month 6) under leave-one-subject-out evaluation in an elderly multicenter breast cancer cohort.
What carries the argument
Attention-based multiple instance learning aggregator operating on 192-dimensional embeddings from a lightweight pretrained mixture-of-experts model applied to visual representations of physical activity, sleep, and ECG data.
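As a rough illustration of the aggregation step, the sketch below implements attention pooling in the style of attention-based deep MIL: each instance embedding is scored by a small tanh layer, the scores are softmax-normalized across the bag, and the bag embedding is the attention-weighted sum. The names `attention_mil_pool` and `predict_score`, the hidden size of 16, the bag size of 5, and the random weights are illustrative assumptions, not the paper's configuration; only the 192-dimensional embedding size comes from the text.

```python
import math
import random


def attention_mil_pool(bag, V, w):
    """Pool a bag of instance embeddings into one vector with attention.

    bag: list of K embeddings, each a list of D floats
    V:   hidden-layer projection, a list of L rows with D floats each
    w:   scoring vector of length L
    Returns (pooled_embedding, attention_weights).
    """
    scores = []
    for h in bag:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wj * uj for wj, uj in zip(w, hidden)))
    # Numerically stable softmax over the instances in the bag.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    attn = [e / total for e in exps]
    dim = len(bag[0])
    pooled = [sum(a * h[d] for a, h in zip(attn, bag)) for d in range(dim)]
    return pooled, attn


def predict_score(pooled, head_weights, head_bias):
    """Linear regression head mapping the bag embedding to one score."""
    return sum(p * c for p, c in zip(pooled, head_weights)) + head_bias


# Hypothetical bag: 5 windows, each already embedded into 192 dimensions.
rng = random.Random(0)
D, L, K = 192, 16, 5
bag = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(K)]
V = [[rng.gauss(0, 0.1) for _ in range(D)] for _ in range(L)]
w = [rng.gauss(0, 0.1) for _ in range(L)]
head_weights = [rng.gauss(0, 0.1) for _ in range(D)]

pooled, attn = attention_mil_pool(bag, V, w)
score = predict_score(pooled, head_weights, 0.0)
```

Because the weights are normalized within each bag, the pooled vector does not depend on the order or the number of windows, which is what lets a single subject-level label supervise a variable-length recording.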
Load-bearing premise
That heterogeneous visual representations derived from multimodal wearable streams contain sufficient generalizable information about perceived stress to support accurate prediction under weak supervision in a new elderly oncology cohort.
What would settle it
Repeating the leave-one-subject-out protocol on data from an independent elderly oncology cohort and finding correlations below 0.3 with actual Perceived Stress Scale scores would show the predictions do not generalize.
Original abstract
Psychological stress is clinically relevant in cardio-oncology, yet it is typically assessed only through patient-reported outcome measures (PROMs) and is rarely integrated into continuous cardiotoxicity surveillance. We estimate perceived stress in an elderly, multicenter breast cancer cohort (CARDIOCARE) using multimodal wearable data from a smartwatch (physical activity and sleep) and a chest-worn ECG sensor. Wearable streams are transformed into heterogeneous visual representations, yielding a weakly supervised setting in which a single Perceived Stress Scale (PSS) score corresponds to many unlabeled windows. A lightweight pretrained mixture-of-experts backbone (Tiny-BioMoE) embeds each representation into 192-dimensional vectors, which are aggregated via attention-based multiple instance learning (MIL) to predict PSS at month 3 (M3) and month 6 (M6). Under leave-one-subject-out (LOSO) evaluation, predictions showed moderate agreement with questionnaire scores (M3: R^2=0.24, Pearson r=0.42, Spearman rho=0.48; M6: R^2=0.28, Pearson r=0.49, Spearman rho=0.52), with global RMSE/MAE of 6.62/6.07 at M3 and 6.13/5.54 at M6.
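The weakly supervised setting described in the abstract can be pictured as a bag per subject: many unlabeled windows that share one questionnaire score. The sketch below is only a schematic of that data layout. The helper `make_bag`, the window length, the step, and the label value are hypothetical, and in the paper each window would be a rendered visual representation rather than a raw slice.

```python
def make_bag(stream, window, step):
    """Split one subject's sample stream into fixed-length windows (instances)."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    last_start = len(stream) - window
    return [stream[i:i + window] for i in range(0, last_start + 1, step)]


# Hypothetical subject: 10 samples and one score for the whole recording.
stream = list(range(10))
bag = {
    "instances": make_bag(stream, window=4, step=2),  # unlabeled windows
    "label": 17.0,  # single subject-level score (made-up value)
}
```

No window carries its own label here; any per-window relevance has to be inferred by the aggregator from the bag-level target.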
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents a machine learning pipeline for estimating perceived stress in elderly oncology patients from multimodal wearable sensor data. Wearable streams (physical activity, sleep from smartwatch; ECG from chest sensor) are converted to heterogeneous visual representations, embedded using a pretrained Tiny-BioMoE model into 192-dimensional vectors, and aggregated using attention-based multiple instance learning (MIL) to predict single Perceived Stress Scale (PSS) scores per subject at months 3 and 6. Under leave-one-subject-out evaluation on the CARDIOCARE cohort, the model achieves moderate agreement with questionnaire scores (R²=0.24 at M3, R²=0.28 at M6).
Significance. If the results hold after addressing potential confounds, this approach could contribute to integrating continuous stress assessment into cardio-oncology monitoring, reducing reliance on infrequent PROMs. The use of visual representations and MIL for weak supervision is a reasonable adaptation for the setting where only one label per subject is available. However, the moderate R² values limit immediate clinical impact, and the significance hinges on demonstrating that predictions are driven by stress-related patterns rather than subject-level covariates.
major comments (2)
- [Abstract / Evaluation section] Abstract / Evaluation: The reported metrics show only moderate explanatory power (R²=0.24–0.28), which is equivalent to limited predictive utility; this undermines the claim of effective stress estimation unless accompanied by ablation studies showing improvement over baselines that use only non-stress covariates such as age or treatment stage.
- [Methods] Methods (visual representation and MIL): No details are provided on the construction of the heterogeneous visual representations from wearable streams, the training procedure for the MIL head, hyperparameter selection, or any checks for confounding factors (e.g., age, treatment effects) that are constant within subjects but vary across LOSO folds. This is load-bearing because the concern that attention may latch onto spurious correlations cannot be ruled out without such controls.
minor comments (2)
- [Abstract] The abstract mentions 'global RMSE/MAE' but does not specify if these are computed per-subject or aggregated; clarify the exact evaluation protocol.
- [Results] Consider adding a table comparing to simple baselines (e.g., mean predictor or linear regression on subject metadata) to contextualize the R² values.
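To make the two minor comments concrete, the sketch below computes pooled ("global") metrics over all held-out predictions and compares them with a leave-one-subject-out mean predictor. This is one possible reading of the evaluation protocol, not the authors' confirmed procedure; the function names and the toy scores are assumptions made for illustration and are not data from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def pooled_metrics(y_true, y_pred):
    """Metrics computed once over all held-out predictions."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return {
        "r2": 1 - ss_res / ss_tot,  # coefficient of determination
        "pearson": pearson(y_true, y_pred),
        "spearman": pearson(ranks(y_true), ranks(y_pred)),
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
    }


def loso_mean_baseline(labels):
    """Predict each held-out subject as the mean of the remaining subjects."""
    total, n = sum(labels), len(labels)
    return [(total - y) / (n - 1) for y in labels]


# Toy values for six hypothetical subjects.
y_true = [12.0, 20.0, 7.0, 25.0, 16.0, 9.0]
y_model = [14.0, 18.0, 11.0, 19.0, 15.0, 13.0]

model = pooled_metrics(y_true, y_model)
baseline = pooled_metrics(y_true, loso_mean_baseline(y_true))
```

Under this pooled convention the mean predictor always yields a negative R², and the coefficient of determination can never exceed the squared Pearson correlation of the same predictions, so stating which convention was used matters when R² and r are read side by side.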
Simulated Author's Rebuttal
We appreciate the referee's insightful comments, which have helped us improve the manuscript. We address each major comment below and have made revisions to strengthen the evaluation and methods sections.
- Referee: [Abstract / Evaluation section] Abstract / Evaluation: The reported metrics show only moderate explanatory power (R²=0.24–0.28), which is equivalent to limited predictive utility; this undermines the claim of effective stress estimation unless accompanied by ablation studies showing improvement over baselines that use only non-stress covariates such as age or treatment stage.
  Authors: We agree that the moderate R² values warrant further validation to confirm the model's reliance on stress-related signals. Accordingly, we have conducted ablation experiments in the revised manuscript. These include training and evaluating baseline models using only subject-level covariates (age, cancer treatment stage, and other demographic factors) under the same leave-one-subject-out protocol. The results demonstrate that the full model, which incorporates the visual embeddings from wearable data, achieves higher R² and correlation metrics compared to the covariate-only baselines. This supports that the predictions are not solely driven by non-stress covariates. We have also clarified in the evaluation section that while the absolute performance is moderate, the relative improvement highlights the value of the wearable-based approach in this challenging setting.
  revision: yes
- Referee: [Methods] Methods (visual representation and MIL): No details are provided on the construction of the heterogeneous visual representations from wearable streams, the training procedure for the MIL head, hyperparameter selection, or any checks for confounding factors (e.g., age, treatment effects) that are constant within subjects but vary across LOSO folds. This is load-bearing because the concern that attention may latch onto spurious correlations cannot be ruled out without such controls.
  Authors: We concur that expanded methodological transparency is essential. The revised Methods section now provides: (i) explicit details on generating the heterogeneous visual representations, such as converting activity counts to bar plots, sleep metrics to timeline visualizations, and ECG signals to spectrograms or waveform images; (ii) the full specification of the MIL head, including the attention pooling mechanism, the prediction head architecture, and the training objective (e.g., mean squared error loss with Adam optimizer); (iii) the hyperparameter selection strategy, which involved a grid search over learning rates, attention dimensions, and number of experts in the backbone, validated on a held-out subset; and (iv) confounding checks, including Pearson correlations between model predictions and covariates, as well as retraining with covariates concatenated to the embeddings to assess if they explain additional variance. These controls help mitigate concerns about spurious subject-level correlations in the LOSO evaluation.
  revision: yes
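The covariate-only baseline discussed in the responses above could look roughly like the following: a one-covariate least-squares fit that is refit for every held-out subject. This is a minimal sketch under stated assumptions, not the authors' ablation. The helpers `fit_line`, `loso_covariate_baseline`, and `r_squared` are hypothetical, a single covariate (age) stands in for the full covariate set, and all numbers are made-up toy values rather than study data.

```python
def fit_line(x, y):
    """Ordinary least squares for y ~ a + b * x with a single covariate."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        return my, 0.0  # constant covariate: fall back to the mean
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - b * mx, b


def loso_covariate_baseline(covariate, labels):
    """Refit on all other subjects, then predict the held-out subject."""
    preds = []
    for held_out in range(len(labels)):
        train_x = [c for i, c in enumerate(covariate) if i != held_out]
        train_y = [y for i, y in enumerate(labels) if i != held_out]
        a, b = fit_line(train_x, train_y)
        preds.append(a + b * covariate[held_out])
    return preds


def r_squared(y_true, y_pred):
    """Pooled coefficient of determination over all held-out predictions."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Toy values for six hypothetical subjects (not from the study).
age = [66, 71, 68, 75, 80, 69]
pss = [12.0, 20.0, 7.0, 25.0, 16.0, 9.0]

covariate_preds = loso_covariate_baseline(age, pss)
covariate_r2 = r_squared(pss, covariate_preds)
```

Comparing `covariate_r2` with the same statistic for the wearable-based model, computed on identical folds, is the kind of check that would show whether the embeddings add information beyond subject-level covariates.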
Circularity Check
No circularity: purely empirical ML pipeline evaluated on external labels
The manuscript describes a standard supervised learning pipeline: wearable streams are converted to heterogeneous visual representations, embedded by a pretrained Tiny-BioMoE model into 192-dim vectors, aggregated by attention MIL, and regressed against single per-subject PSS questionnaire scores. Evaluation uses LOSO cross-validation with reported R², Pearson, Spearman, RMSE and MAE metrics against those independent ground-truth labels. No equations, derivations, fitted parameters renamed as predictions, self-citations invoked as uniqueness theorems, or ansatzes appear in the provided text. The central claim therefore rests on empirical correlation with external data rather than any tautological reduction of outputs to inputs by construction.
Axiom & Free-Parameter Ledger
free parameters (1)
- MIL attention weights
axioms (1)
- domain assumption: Wearable-derived visual representations encode patterns correlated with self-reported psychological stress
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: "A lightweight pretrained mixture-of-experts backbone (Tiny-BioMoE) embeds each representation into 192-dimensional vectors, which are aggregated via attention-based multiple instance learning (MIL) to predict PSS"
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · reality_from_one_distinction
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: "Under leave-one-subject-out (LOSO) evaluation, predictions showed moderate agreement with questionnaire scores (M3: R^2=0.24, Pearson r=0.42, Spearman rho=0.48)"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.