
arxiv: 2606.03050 · v1 · pith:BGGYHZWOnew · submitted 2026-06-02 · 💻 cs.CV

FCUS-rPPG: A Fast-Converging Unsupervised Framework for Remote Photoplethysmography via Gradient Oscillation Suppression

Pith reviewed 2026-06-28 11:09 UTC · model grok-4.3

classification 💻 cs.CV
keywords: remote photoplethysmography, rPPG, unsupervised learning, BVP extraction, gradient masking, cross-dataset generalization, fast convergence, loss landscape smoothing

The pith

FCUS-rPPG trains unsupervised remote photoplethysmography models in one epoch while reaching state-of-the-art cross-dataset performance.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper introduces FCUS-rPPG to solve slow convergence and weak cross-domain performance in unsupervised rPPG methods caused by noisy gradients. It starts from the structure of BVP representations and builds a spectrally shared backbone plus three coordinated optimization steps at the gradient, loss landscape, and feature levels. The framework reaches convergence after a single training epoch and outperforms prior methods on cross-dataset tests without any ground-truth labels. The approach matters because it removes the need for lengthy training or annotated data when deploying camera-based physiological monitoring.

Core claim

FCUS-rPPG establishes that a spectrally shared backbone, motivated by the multi-spectral covariation and low-dimensional manifold structure of BVP representations, together with post-verification gradient masking, perturbation-based loss-landscape smoothing, and noise-aware null-space regularization, jointly suppresses gradient oscillation to produce one-epoch convergence and strong cross-dataset generalization in unsupervised rPPG.

What carries the argument

The spectrally shared backbone that disentangles BVP features, paired with the three-level optimization of gradient masking, loss-landscape smoothing, and null-space regularization.
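
This summary does not say how the paper implements its perturbation-based loss-landscape smoothing. One widely used technique of that kind is sharpness-aware minimization (SAM), which takes the gradient at a small adversarial perturbation of the weights instead of at the weights themselves. The sketch below shows only the mechanics of such an update on a toy quadratic. The functions `loss` and `grad`, the step size `lr`, and the radius `rho` are illustrative assumptions, not values or names from the paper.

```python
import numpy as np

def loss(w: np.ndarray) -> float:
    # Toy objective (assumption): steep along w[0], shallow along w[1].
    return 0.5 * (10.0 * w[0] ** 2 + 0.1 * w[1] ** 2)

def grad(w: np.ndarray) -> np.ndarray:
    # Analytic gradient of the toy objective above.
    return np.array([10.0 * w[0], 0.1 * w[1]])

def perturbed_step(w: np.ndarray, lr: float = 0.05, rho: float = 0.05) -> np.ndarray:
    """One SAM-style update: perturb along the gradient, then descend
    using the gradient measured at the perturbed point."""
    g = grad(w)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)  # ascent direction, length rho
    g_perturbed = grad(w + eps)                  # gradient at the perturbed weights
    return w - lr * g_perturbed

w_init = np.array([1.0, 1.0])
w = w_init.copy()
for _ in range(200):
    w = perturbed_step(w)

print("initial loss:", loss(w_init))
print("final loss:  ", loss(w))
```

Because the perturbation has fixed length `rho`, the iterate hovers near the minimum rather than settling exactly on it. This is expected for this kind of update and is not a sign of divergence.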

If this is right

  • Unsupervised rPPG training completes in one epoch rather than tens or hundreds.
  • State-of-the-art cross-dataset accuracy is obtained without physiological ground-truth labels.
  • The method supplies an efficient route to real-world camera-based BVP deployment.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same three-level stabilization may transfer to other unsupervised video-based physiological measurements that share manifold structure.
  • Single-epoch training opens the possibility of on-device fine-tuning of rPPG models after initial deployment.
  • Null-space regularization could be tested on additional noise sources such as motion or illumination changes to measure further robustness gains.

Load-bearing premise

BVP representations possess multi-spectral covariation and low-dimensional manifold structure that the backbone and optimization steps can directly exploit for stable and generalizable learning.

What would settle it

Train FCUS-rPPG for exactly one epoch on one dataset then evaluate on the remaining four; if cross-dataset accuracy falls below current unsupervised SOTA baselines, the single-epoch convergence and generalization claims are falsified.
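
The check described above can be written as a small harness. The sketch below is a hypothetical outline, not the authors' evaluation code: `train_one_epoch`, `evaluate_mae`, the dataset labels `D1` to `D5`, and every number in the demo are placeholders invented for illustration. It assumes that lower MAE means better accuracy and that a per-target baseline MAE is available.

```python
from typing import Any, Callable, Dict, Sequence, Tuple

def single_source_check(
    source: str,
    datasets: Sequence[str],
    train_one_epoch: Callable[[str], Any],
    evaluate_mae: Callable[[Any, str], float],
    baseline_mae: Dict[str, float],
) -> Dict[str, Tuple[float, bool]]:
    """Train once on `source`, then test on every other dataset.

    Returns, per target dataset, the measured MAE and whether it is
    at least as good as the baseline MAE for that target.
    """
    model = train_one_epoch(source)  # exactly one pass over the source data
    report: Dict[str, Tuple[float, bool]] = {}
    for target in datasets:
        if target == source:
            continue  # cross-dataset only: skip the training set
        mae = evaluate_mae(model, target)
        report[target] = (mae, mae <= baseline_mae[target])
    return report

# Demo with dummy stand-ins (all values are made up, not paper results).
datasets = ["D1", "D2", "D3", "D4", "D5"]
dummy_mae = {"D2": 1.0, "D3": 2.5, "D4": 0.8, "D5": 3.0}
dummy_baseline = {"D2": 1.2, "D3": 2.0, "D4": 0.9, "D5": 3.0}

report = single_source_check(
    "D1",
    datasets,
    train_one_epoch=lambda name: {"trained_on": name},
    evaluate_mae=lambda model, target: dummy_mae[target],
    baseline_mae=dummy_baseline,
)
claim_survives = all(ok for _, ok in report.values())
print(report)
print("claim survives:", claim_survives)
```

In this dummy run the claim fails because one target (`D3`) is worse than its baseline, which is the kind of outcome the test above treats as falsifying.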

Figures

Figures reproduced from arXiv: 2606.03050 by Jiajie Li, Juan Cheng, Rencheng Song, Xun Chen, Yu Liu.

Figure 1: SNR loss curves of the raw signals during training from scratch on …
Figure 2: The overall pipeline and key components of the proposed FCUS-rPPG framework. The framework comprises video preprocessing, a low-dimensional …
Figure 3: (a) Multi-spectral physiological covariation and (b) low-dimensional …
Figure 4: Overall pipeline of the proposed Low-dimensional Spectrally-Shared …
Figure 5: Illustration of the Post-verification Gradient Masking (PGM) mecha…
Figure 7: SNR loss curves of raw signals during single-epoch training across different datasets: (a) UBFC-rPPG, (b) PURE, (c) BSIPL-motion, and (d) BSIPL …
Figure 8: Cross-dataset evaluation of the accuracy-epochs tradeoff across …
Figure 9: Visualizations of BVP signals recovered by the FCUS-rPPG frame…
Figure 10: Convergence curves of the raw input SNR loss during training on the …
Figure 11: Convergence curves of the SNR loss under two different random …
Original abstract

Remote photoplethysmography (rPPG) enables non-contact extraction of blood volume pulse (BVP) signals using consumer-grade cameras. Recent unsupervised rPPG methods learn BVP representations without requiring ground-truth physiological annotations, yet their optimization is often hindered by noisy and unstable gradients, resulting in slow convergence and limited cross-domain generalization. In this paper, we propose FCUS-rPPG, a fast-converging unsupervised rPPG framework with strong generalization capability. Motivated by the observation that BVP representations exhibit both multi-spectral covariation and low-dimensional manifold structure, we design a spectrally shared backbone that facilitates BVP feature disentanglement while improving optimization efficiency. To jointly enhance convergence stability and generalization performance, we further develop a unified optimization framework operating at the gradient, loss-landscape, and feature-representation levels. Specifically, a post-verification masking mechanism filters out misleading gradients according to the weak-amplitude physiological prior of BVP signals; a perturbation-based loss landscape smoothing strategy steers optimization toward more generalizable flat minima; and a noise-aware null-space regularization constrains feature updates to the orthogonal complement of the noise subspace, thereby mitigating noise-induced representation drift. Extensive experiments on five datasets demonstrate that FCUS-rPPG requires only one training epoch, whereas existing methods typically require tens to hundreds of epochs. Notably, FCUS-rPPG consistently achieves state-of-the-art (SOTA) performance in cross-dataset evaluations. This study provides an efficient and robust solution to the real-world deployment of unsupervised rPPG. The source code will be publicly available at https://github.com/JiaJieLee/FCUS-rPPG.
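
The abstract's null-space regularization constrains feature updates to the orthogonal complement of a noise subspace. The linear algebra behind that phrase can be sketched as an orthogonal projection. The code below is a minimal illustration, assuming the noise subspace is given as the columns of a full-column-rank matrix `noise_basis`. How the paper estimates that subspace, and where in training the projection is applied, is not stated in the abstract, so every name here is hypothetical.

```python
import numpy as np

def null_space_projector(noise_basis: np.ndarray) -> np.ndarray:
    """Return P = I - Q Q^T, the projector onto the orthogonal complement
    of the column span of `noise_basis` (shape d x k, full column rank)."""
    q, _ = np.linalg.qr(noise_basis)  # reduced QR: q has orthonormal columns
    d = noise_basis.shape[0]
    return np.eye(d) - q @ q.T

def project_update(update: np.ndarray, noise_basis: np.ndarray) -> np.ndarray:
    """Remove from `update` every component lying in the noise subspace."""
    return null_space_projector(noise_basis) @ update

rng = np.random.default_rng(0)
noise_basis = rng.normal(size=(8, 2))  # two assumed noise directions in R^8
update = rng.normal(size=8)            # a raw feature update
projected = project_update(update, noise_basis)

# The projected update has no component along either noise direction.
print(np.abs(noise_basis.T @ projected).max())
```

After projection the update is orthogonal to both noise directions up to floating-point error, so a step along it cannot move the features within the noise subspace.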

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The manuscript proposes FCUS-rPPG, an unsupervised remote photoplethysmography (rPPG) framework. Motivated by multi-spectral covariation and low-dimensional manifold structure in BVP representations, it introduces a spectrally shared backbone together with a unified optimization approach operating at gradient, loss-landscape, and feature levels via post-verification masking, perturbation-based smoothing, and noise-aware null-space regularization. The central empirical claims are that the method converges in a single training epoch (versus tens to hundreds for prior unsupervised methods) and attains state-of-the-art cross-dataset performance on five rPPG datasets.

Significance. If the one-epoch convergence and cross-dataset SOTA results are substantiated by the experiments, the work would offer a practically important advance for real-world unsupervised rPPG deployment by reducing training cost and improving generalization. The stated intention to release source code supports reproducibility.

major comments (1)
  1. [Abstract] Abstract: the central claims of one-epoch convergence and consistent SOTA cross-dataset performance are asserted without any quantitative numbers, baseline comparisons, ablation results, or error bars. Because these empirical outcomes are the load-bearing evidence for the contribution, their absence prevents evaluation of whether the proposed mechanisms deliver the stated gains.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for highlighting the need to strengthen the abstract with quantitative support for our central claims. We will revise the abstract to include specific metrics, baseline comparisons, and error bars drawn from the experimental results already reported in the manuscript.

Point-by-point responses
  1. Referee: [Abstract] Abstract: the central claims of one-epoch convergence and consistent SOTA cross-dataset performance are asserted without any quantitative numbers, baseline comparisons, ablation results, or error bars. Because these empirical outcomes are the load-bearing evidence for the contribution, their absence prevents evaluation of whether the proposed mechanisms deliver the stated gains.

    Authors: We agree that the abstract would be more informative with concrete quantitative highlights. The full manuscript already contains tables and figures with epoch counts (one versus tens to hundreds), cross-dataset MAE/RMSE/HR metrics against multiple baselines, and ablation studies. In the revision we will condense the key numbers (e.g., average MAE reduction, exact epoch comparison, and standard deviations) into the abstract while staying within its length limit.

    revision: yes
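
For readers unfamiliar with the metrics named in the response, the sketch below shows one common way to turn a recovered BVP waveform into a heart-rate estimate and to score estimates with MAE and RMSE. It is a generic illustration, not the paper's evaluation code. The 0.7 to 3.0 Hz search band (42 to 180 beats per minute), the 30 Hz frame rate, and the function names are assumptions.

```python
import numpy as np

def heart_rate_bpm(bvp, fs: float, lo_hz: float = 0.7, hi_hz: float = 3.0) -> float:
    """Estimate heart rate as the strongest spectral peak inside a plausible band."""
    x = np.asarray(bvp, dtype=float)
    x = x - x.mean()                              # drop the DC offset
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)   # frequency of each bin in Hz
    power = np.abs(np.fft.rfft(x)) ** 2
    band = (freqs >= lo_hz) & (freqs <= hi_hz)
    return 60.0 * freqs[band][np.argmax(power[band])]

def mae_rmse(predicted, reference):
    """Mean absolute error and root-mean-square error between two HR lists."""
    err = np.asarray(predicted, dtype=float) - np.asarray(reference, dtype=float)
    return float(np.mean(np.abs(err))), float(np.sqrt(np.mean(err ** 2)))

# Synthetic 10-second pulse at 1.2 Hz sampled at 30 frames per second.
fs = 30.0
t = np.arange(300) / fs
bvp = np.sin(2.0 * np.pi * 1.2 * t)
hr = heart_rate_bpm(bvp, fs)

mae, rmse = mae_rmse([72.0, 80.0], [70.0, 80.0])
print(hr, mae, rmse)
```

With a 10-second window the frequency resolution is 0.1 Hz, or 6 beats per minute, so practical pipelines usually use longer windows or spectral interpolation.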

Circularity Check

0 steps flagged

No significant circularity

Full rationale

The paper introduces FCUS-rPPG as an unsupervised framework whose components (spectrally shared backbone, post-verification masking, perturbation-based smoothing, and null-space regularization) are motivated by stated empirical properties of BVP signals and implemented as distinct algorithmic mechanisms. The one-epoch convergence and cross-dataset SOTA claims are reported as outcomes of experiments across five datasets rather than quantities derived by construction from fitted parameters or prior self-citations. No load-bearing step reduces an output to an input via self-definition, renaming, or an unverified uniqueness theorem; the derivation chain remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The framework rests on domain assumptions about the structure of BVP signals and their representations; no free parameters or invented entities are mentioned in the abstract.

axioms (2)
  • Domain assumption: BVP representations exhibit both multi-spectral covariation and low-dimensional manifold structure.
    This observation directly motivates the design of the spectrally shared backbone.
  • Domain assumption: BVP signals have a weak-amplitude physiological prior.
    This prior is used to filter misleading gradients via the post-verification masking mechanism.
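
The abstract does not describe how the weak-amplitude prior is turned into a gradient mask. The sketch below takes one possible reading, purely as an illustration: after a forward pass, each sample's predicted pulse amplitude is checked against a ceiling, and the per-sample gradients of samples that exceed it are excluded from the averaged update. The function name, the threshold `max_amplitude`, and the toy numbers are all assumptions and should not be taken as the paper's criterion.

```python
import numpy as np

def masked_mean_gradient(per_sample_grads, amplitudes, max_amplitude: float):
    """Average only the gradients of samples whose predicted amplitude
    is consistent with a weak-amplitude prior (amplitude <= max_amplitude).

    Returns the averaged gradient and the boolean keep-mask.
    """
    grads = np.asarray(per_sample_grads, dtype=float)  # shape (n_samples, n_params)
    amps = np.asarray(amplitudes, dtype=float)         # shape (n_samples,)
    keep = amps <= max_amplitude
    if not keep.any():
        # Every sample failed verification: skip the update entirely.
        return np.zeros(grads.shape[1]), keep
    return grads[keep].mean(axis=0), keep

# Toy batch: the third sample has an implausibly large amplitude and a
# large gradient that would otherwise dominate the update.
grads = np.array([[1.0, 0.0], [0.0, 1.0], [10.0, -10.0]])
amps = np.array([0.02, 0.03, 0.50])
g, keep = masked_mean_gradient(grads, amps, max_amplitude=0.05)
print(g, keep)
```

In the toy batch the third sample is dropped, so the update is the mean of the first two gradients rather than being pulled toward the outlier.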

pith-pipeline@v0.9.1-grok · 5852 in / 1383 out tokens · 35479 ms · 2026-06-28T11:09:38.221427+00:00 · methodology

