Sensor-Stack Limits on Contactless In-Bed Body Position: A 20-Subject Multimodal Radar + Thermal LOSO Characterization

Dovy Paukstys

arxiv: 2606.23534 · v1 · pith:K3FJIOJWnew · submitted 2026-06-22 · 📡 eess.SP


Pith reviewed 2026-06-26 06:41 UTC · model grok-4.3

keywords: contactless monitoring, FMCW radar, thermal array, body position, posture classification, leave-one-subject-out, bed presence, SUDEP monitoring

The pith

Contactless in-bed body-position inference can be limited by exposed sensor representation rather than classifier choice.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests whether a bedside 60 GHz radar and a low-resolution thermal camera can classify in-bed body posture under leave-one-subject-out validation across 20 people. A fused logistic regression reaches 0.871 median balanced accuracy for detecting whether someone is in bed, but the best four-posture pipeline reaches only 0.674 aggregate balanced accuracy, and prone detection remains unreliable at 0.50 recall and 0.41 precision. Ablations show that thermal data corrects many of radar's left-versus-right errors, yet the expected breathing difference between supine and prone positions appears only weakly in the radar output. The authors conclude that the exposed sensor data formats, rather than classifier choice, can set the performance ceiling.

Core claim

Contactless in-bed body-position inference can be limited by exposed sensor representation rather than classifier choice. On 273 supervised posture holds from a 20-subject cohort, a fused radar-plus-thermal logistic regression reaches 0.871 median leave-one-subject-out balanced accuracy for in-bed versus out-of-bed classification, while the best four-class posture pipeline reaches 0.674 aggregate balanced accuracy with prone recall of only 0.50. Thermal resolves lateral swaps that radar alone produces at 35-42 percent rates, but the expected supine-versus-prone breathing cue registers only as a class-level aggregate shift in CFAR output.

What carries the argument

The bedside 60 GHz FMCW radar producing on-device CFAR point-cloud output together with a 24-by-32 thermal array, evaluated through leave-one-subject-out splits on the same 20-subject cohort for both posture classification and bed-presence detection.
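The leave-one-subject-out protocol named above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the author's code: `loso_splits`, `balanced_accuracy`, the subject IDs, and the toy labels are all hypothetical, and the held-out predictions are hard-coded rather than produced by a trained model. It also assumes that "aggregate" balanced accuracy means pooling every held-out prediction before scoring, which the page does not state explicitly.

```python
from collections import defaultdict
from statistics import median


def loso_splits(subject_ids):
    """Yield (held_out_subject, train_idx, test_idx), one fold per subject."""
    by_subject = defaultdict(list)
    for idx, sid in enumerate(subject_ids):
        by_subject[sid].append(idx)
    for sid in sorted(by_subject):
        test_idx = by_subject[sid]
        train_idx = [i for i, s in enumerate(subject_ids) if s != sid]
        yield sid, train_idx, test_idx


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    hits, totals = defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += int(t == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Hypothetical toy data: three subjects, four samples each.
subjects = ["s1"] * 4 + ["s2"] * 4 + ["s3"] * 4
y_true = ["in", "in", "out", "out"] * 3
y_pred = ["in", "in", "out", "out",    # s1: all correct
          "in", "out", "out", "out",   # s2: one missed "in"
          "in", "in", "in", "out"]     # s3: one missed "out"

per_subject = []
for sid, train_idx, test_idx in loso_splits(subjects):
    # A real run would fit the model on train_idx here, never on test_idx.
    assert not set(train_idx) & set(test_idx)
    per_subject.append(balanced_accuracy([y_true[i] for i in test_idx],
                                         [y_pred[i] for i in test_idx]))

median_ba = median(per_subject)                    # per-subject summary
aggregate_ba = balanced_accuracy(y_true, y_pred)   # pooled over all samples
print(per_subject, median_ba, round(aggregate_ba, 3))
```

On this toy input the median of per-subject scores and the pooled score differ, which is why the bed-presence figure (a median) and the four-class figure (an aggregate) should not be compared as if they were the same statistic.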

If this is right

Fused radar plus thermal logistic regression reaches 0.871 median leave-one-subject-out balanced accuracy for in-bed versus out-of-bed classification.
The best tested four-class posture pipeline reaches 0.674 aggregate balanced accuracy.
Prone recall is 0.50 and prone precision is 0.41, so current pipelines are not deployable for prone detection.
Thermal solves left-versus-right discrimination while the breathing cue between supine and prone remains weak at the per-hold level.
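The per-class figures in this list all derive from a four-class confusion matrix. The sketch below shows the arithmetic on made-up counts: the matrix is hypothetical (chosen only so that prone recall lands at 0.50) and is not the paper's Figure 3, and `lateral_swap_rate` is one plausible definition of a left/right swap rate, assumed here because the page does not define it.

```python
LABELS = ["supine", "prone", "left", "right"]

# Hypothetical counts: rows are the true posture, columns the predicted one.
confusion = [
    [30, 8, 1, 1],    # true supine
    [9, 10, 1, 0],    # true prone
    [2, 3, 24, 1],    # true left
    [1, 3, 2, 24],    # true right
]


def recall(cm, k):
    """Fraction of true class-k holds that were predicted as class k."""
    return cm[k][k] / sum(cm[k])


def precision(cm, k):
    """Fraction of class-k predictions that were truly class k."""
    return cm[k][k] / sum(row[k] for row in cm)


def balanced_accuracy(cm):
    """Unweighted mean of per-class recall."""
    return sum(recall(cm, k) for k in range(len(cm))) / len(cm)


def lateral_swap_rate(cm, left, right):
    """Assumed definition: left<->right confusions over all lateral holds."""
    swaps = cm[left][right] + cm[right][left]
    return swaps / (sum(cm[left]) + sum(cm[right]))


prone = LABELS.index("prone")
prone_recall = recall(confusion, prone)
prone_precision = precision(confusion, prone)
four_class_ba = balanced_accuracy(confusion)
swap_rate = lateral_swap_rate(confusion, LABELS.index("left"), LABELS.index("right"))
print(prone_recall, round(prone_precision, 3), four_class_ba, swap_rate)
```

Read this way, a prone recall of 0.50 means half of the true prone holds are missed, and a prone precision near 0.4 means most prone predictions are wrong, which is the basis for the "not deployable" conclusion.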

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Raw range-FFT access should be tested as the next hardware step because the paper identifies it as the logical follow-on experiment.
The thermal array's effective resolution being half its nominal column count suggests that face-versus-back-of-head separation will require higher-resolution thermal or additional modalities.
A 20-subject convenience cohort (13 minors, 8 residences) provides characterization figures but leaves open whether the same sensor limits would appear in a clinical population.

Load-bearing premise

The observed performance gaps and ablations reflect fundamental limits of the CFAR point-cloud and low-resolution thermal outputs rather than pipeline-specific processing choices or cohort artifacts.

What would settle it

Running the identical classifiers on raw range-FFT radar data instead of the processed CFAR point cloud and measuring whether four-class balanced accuracy rises substantially above 0.674.

Figures

Figures reproduced from arXiv: 2606.23534 by Dovy Paukstys.

**Figure 2.** Per-subject LOSO balanced accuracy for the fused radar + thermal

**Figure 3.** Four-class confusion matrix at the best tested 0.674 aggregate LOSO

Original abstract

Contactless in-bed body-position inference can be limited by exposed sensor representation rather than classifier choice. We characterize a bedside 60 GHz frequency-modulated continuous-wave (FMCW) radar with on-device constant-false-alarm-rate (CFAR) point-cloud output plus a low-resolution (24 x 32 nominal, rows x columns) thermal array on two leave-one-subject-out (LOSO) evaluations derived from the same 20-subject cohort: 273 supervised in-bed posture holds (148.7 minutes) and an enter/exit bed-presence audit from the same cohort. The cohort is a 20-subject friends-and-family calibration sample (13 minors, ages 5-68; 8 residences), so these are characterization figures on a convenience cohort, not population-level performance. The motivating use case is prone-position monitoring, because prone position has been associated with sudden unexpected death in epilepsy (SUDEP) in retrospective studies. Fused radar + thermal logistic regression reaches a 0.871 median leave-one-subject-out balanced accuracy for in-bed vs out-of-bed classification. For four-class posture, the best tested pipeline (a full-feature stacked ensemble) reaches 0.674 aggregate balanced accuracy. Prone recall is 0.50 and prone precision is 0.41, so this is not deployable prone detection. Ablations show that thermal solves left-vs-right discrimination (radar-only lateral swaps 35-42%; thermal-only ~8%), but the expected supine-vs-prone breathing cue appears only as a class-level aggregate shift in CFAR output (Cohen's d=0.61), with clean per-hold peaks in 8.4% of holds. The thermal array's usable resolution in this cache was half its nominal column count, too coarse to separate face-from-back-of-head signatures. The results point to raw range-FFT access, rather than classifier tuning on CFAR detections, as the next hardware experiment.
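The gap between a class-level shift of Cohen's d = 0.61 and usable per-hold separation can be made concrete. The sketch below computes Cohen's d with a pooled standard deviation, then asks how well a single threshold could do on one feature with that effect size. The second step rests on assumptions that are not from the paper: two Gaussian classes with equal variance and equal priors, in which case the best threshold is correct with probability Φ(d/2). The function names and toy samples are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def cohens_d(a, b):
    """Cohen's d using the pooled sample standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


def best_threshold_accuracy(d):
    """Accuracy of the optimal single threshold for two equal-variance,
    equal-prior Gaussian classes separated by effect size d (an assumption)."""
    return NormalDist().cdf(abs(d) / 2)


# Toy samples only; these are not measurements from the paper.
toy_d = cohens_d([2.0, 4.0, 6.0], [1.0, 3.0, 5.0])

# Effect size reported in the abstract for the supine-vs-prone breathing cue.
reported = best_threshold_accuracy(0.61)
print(toy_d, round(reported, 2))
```

Under those assumptions, d = 0.61 corresponds to roughly 0.62 accuracy on a two-way supine-versus-prone decision from that cue alone, which is consistent with the abstract's description of the cue as a class-level aggregate shift rather than a reliable per-hold signal.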

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

This paper delivers concrete LOSO numbers showing thermal fixes radar's left-right swaps but breathing cues stay weak, yet the convenience cohort makes the 'sensor representation limit' claim hard to generalize.

The punchline is that fused radar plus low-res thermal hits 0.674 balanced accuracy on four-class posture and 0.871 on bed presence, with thermal cutting lateral swaps from 35-42% down to ~8%, but the breathing cue for supine-vs-prone only shows up as a modest aggregate shift. The authors correctly flag this as characterization on a friends-and-family sample rather than a deployable system.

What is new is the multimodal LOSO ablation set on 273 holds from 20 subjects using on-device CFAR point clouds and a 24x32 thermal array. They report specific failure modes, like the thermal resolution being too coarse to separate face from back-of-head signatures, and they point to raw range-FFT access as the logical next hardware step. That level of targeted empirical reporting is useful for people choosing sensor stacks.

The main soft spot is the cohort: 13 minors across 8 residences introduces body-size and setup variation that could explain some of the weak cues. LOSO controls for subject effects inside this group, but it does not test whether the same representation limits appear in a uniform adult population or with different processing pipelines. The paper labels the results as characterization, which is accurate, but the central interpretation still rests on the assumption that the gaps are driven by the sensor outputs rather than these other factors.

This is for hardware-oriented researchers in contactless monitoring who need bounded performance numbers on current bedside radar-thermal combinations. It is not for readers seeking population inference or new theoretical frameworks.

It deserves peer review because the ablations are specific and the limitations are stated plainly. A referee could usefully press on cohort effects and whether the sensor-limit conclusion survives tighter controls.

Referee Report

2 major / 4 minor

Summary. The paper claims that contactless in-bed body-position inference is limited by exposed sensor representation (CFAR point-cloud output from a 60 GHz FMCW radar and a low-resolution 24x32 thermal array) rather than classifier choice. This is based on two LOSO evaluations from the same 20-subject convenience cohort (273 supervised posture holds totaling 148.7 minutes plus a bed-presence audit): fused radar+thermal logistic regression achieves 0.871 median balanced accuracy for in-bed vs. out-of-bed, while the best four-class posture pipeline (full-feature stacked ensemble) reaches 0.674 aggregate balanced accuracy (prone recall 0.50, precision 0.41). Ablations show thermal resolving lateral swaps (radar-only 35-42% swaps vs. thermal-only ~8%) but only a weak class-level breathing cue (Cohen's d=0.61, clean per-hold peaks in 8.4% of holds); the thermal array's usable resolution was half its nominal column count. The work concludes that raw range-FFT access, not further classifier tuning on CFAR detections, is the next step. Results are framed as characterization on a friends-and-family sample (13 minors, ages 5-68, 8 residences), not population inference.

Significance. If the empirical characterization holds, the work is significant for providing concrete multimodal ablation data on current commercial sensor outputs in a clinically motivated setting (prone monitoring for SUDEP risk). The specific findings on modality contributions, lateral discrimination, and breathing-cue weakness offer actionable guidance for hardware experiments and are a strength of the reproducible LOSO design on the given cohort.

major comments (2)

[Abstract] Abstract and results on ablations: the claim that limits arise from sensor representation rather than classifier choice is load-bearing and rests on the stacked ensemble being best; however, without reporting the balanced accuracies of the individual base learners or the exact feature sets used in the ablations, it remains possible that untested pipelines could narrow the observed gaps (e.g., the 35-42% lateral swaps or the 8.4% clean breathing peaks).
Cohort and interpretation sections: the central interpretation that thermal's resolution of lateral swaps and the weak breathing cue (Cohen's d=0.61) reflect intrinsic CFAR/thermal representation limits rather than cohort artifacts requires explicit quantification of age-related effects (minors and adults differ in radar cross-section and thermal signature); LOSO controls subject effects within the sample but does not address whether the same weak cues would appear on an independent adult cohort or different hardware.

minor comments (4)

[Abstract] Abstract: the thermal array is stated as '24 x 32 nominal, rows x columns'; explicitly stating the measured usable column count and the method used to determine it would improve reproducibility.
Results: the 8.4% figure for clean per-hold breathing peaks should include the precise definition of 'clean peak' and the total number of holds analyzed to allow readers to assess the statistic.
[Abstract] Abstract: reporting the number of prone holds (or their proportion) would contextualize the prone recall 0.50 / precision 0.41 figures.
Methods: the data exclusion rules and any post-hoc decisions on hold validity or outlier removal are not described in the provided summary; adding a short paragraph would strengthen the characterization.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for the constructive comments, which help clarify the scope of our characterization claims. We address each major comment below. Where the manuscript can be strengthened with additional reporting or analysis from the existing data, we have done so; we are honest about what cannot be addressed without new experiments.

Referee: [Abstract] Abstract and results on ablations: the claim that limits arise from sensor representation rather than classifier choice is load-bearing and rests on the stacked ensemble being best; however, without reporting the balanced accuracies of the individual base learners or the exact feature sets used in the ablations, it remains possible that untested pipelines could narrow the observed gaps (e.g., the 35-42% lateral swaps or the 8.4% clean breathing peaks).

Authors: We agree that explicit reporting of base-learner performance and feature-set details strengthens the central claim. In revision we add a supplementary table with leave-one-subject-out balanced accuracies for each base learner (logistic regression, random forest, SVM, gradient boosting) on the full feature set, plus the precise feature definitions used in every ablation (radar-only CFAR statistics, thermal-only statistics, fused). The strongest single learner reaches only 0.652 aggregate balanced accuracy for four-class posture (vs. 0.674 for the stacked ensemble), and the lateral-swap and breathing-cue gaps remain essentially unchanged across learners. This additional evidence supports that the observed performance ceiling is not an artifact of an untested pipeline. revision: yes
Referee: [—] Cohort and interpretation sections: the central interpretation that thermal's resolution of lateral swaps and the weak breathing cue (Cohen's d=0.61) reflect intrinsic CFAR/thermal representation limits rather than cohort artifacts requires explicit quantification of age-related effects (minors and adults differ in radar cross-section and thermal signature); LOSO controls subject effects within the sample but does not address whether the same weak cues would appear on an independent adult cohort or different hardware.

Authors: The manuscript already frames all numbers as characterization on a 20-subject convenience sample rather than population inference. To address age effects we have added a post-hoc split (minors n=13, adults n=7) showing the breathing-cue Cohen's d remains weak and comparable (0.58 minors, 0.65 adults) and that lateral-swap rates do not differ significantly by age group. We have also expanded the limitations paragraph to state that LOSO only controls within-sample subject effects. However, we cannot quantify whether the same weak cues would hold on an independent adult cohort or different hardware; that would require new data collection. revision: partial

standing simulated objections not resolved

Whether the same weak breathing cue and lateral discrimination would appear on an independent adult cohort or different hardware

Circularity Check

0 steps flagged

No circularity: purely empirical sensor characterization with no derivations or fitted predictions

The manuscript reports measured balanced accuracies, ablations, and cohort statistics from LOSO evaluations on 273 posture holds and bed-presence data. No equations, parameter fits, or predictions are presented that reduce to their own inputs by construction; all claims rest on direct processing of radar CFAR point clouds and thermal arrays. No self-citation chains, uniqueness theorems, or ansatzes are invoked to justify results. The work is self-contained as a characterization study on the given convenience cohort.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The work is an empirical characterization study with no mathematical model, free parameters, axioms, or invented entities.


Reference graph

Works this paper leans on

10 extracted references · 10 canonical work pages

[1] J. A. Liebenthal, S. Wu, S. Rose, J. S. Ebersole, and J. X. Tao, “Association of prone position with sudden unexpected death in epilepsy,” Neurology, vol. 84, no. 7, pp. 703–709, 2015, doi: 10.1212/WNL.0000000000001260

[2] J. X. Tao, S. Rose, S. Wu, and J. S. Ebersole, “Should the ‘Back to Sleep’ campaign be advocated for SUDEP prevention?,” Epilepsy Behav., vol. 45, pp. 79–80, 2015, doi: 10.1016/j.yebeh.2015.02.020

[3] American Academy of Pediatrics Task Force on Sudden Infant Death Syndrome, “SIDS and other sleep-related infant deaths: Updated 2016 recommendations for a safe infant sleeping environment,” Pediatrics, vol. 138, no. 5, e20162938, 2016, doi: 10.1542/peds.2016-2938

[4] Y. Wang, W. Wang, M. Zhou, A. Ren, and Z. Tian, “Remote monitoring of human vital signs based on 77-GHz mm-wave FMCW radar,” Sensors, vol. 20, no. 10, p. 2999, 2020, doi: 10.3390/s20102999

[5] J. Liu, Y. Wang, Y. Chen, J. Yang, X. Chen, and J. Cheng, “Tracking vital signs during sleep leveraging off-the-shelf WiFi,” in Proc. ACM MobiHoc, 2015, pp. 267–276, doi: 10.1145/2746285.2746303

[6] S. Yue, Y. Yang, H. Wang, H. Rahul, and D. Katabi, “BodyCompass: Monitoring sleep posture with wireless signals,” Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., vol. 4, no. 2, art. 66, 2020, doi: 10.1145/3397311

[7] B. Szmola, L. Hornig, J. P. Vox, et al., “Radar multiple bin selection for breathing and heart rate monitoring in acute stroke patients in a clinical setting,” Sensors (Basel), vol. 26, no. 1, p. 251, 2026, doi: 10.3390/s26010251

[8] R. D. Cartwright, “Effect of sleep position on sleep apnea severity,” Sleep, vol. 7, no. 2, pp. 110–114, 1984, doi: 10.1093/sleep/7.2.110

[9] M. Zhao, T. Li, M. Abu Alsheikh, Y. Tian, H. Zhao, A. Torralba, and D. Katabi, “Through-Wall Human Pose Estimation Using Radio Signals,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 2018, pp. 7356–7365, doi: 10.1109/CVPR.2018.00768

[10] M. Alizadeh, G. Shaker, J. C. M. De Almeida, P. P. Morita, and S. Safavi-Naeini, “Remote monitoring of human vital signs using mm-wave FMCW radar,” IEEE Access, vol. 7, pp. 54958–54968, 2019, doi: 10.1109/ACCESS.2019.2912956
