arxiv: 2605.28687 · v1 · pith:6IEZFIX2new · submitted 2026-05-27 · 💻 cs.SD · physics.med-ph

Cross-modal characterization of infant cry: validation of a chest-surface accelerometer in extracting acoustic vocal function measures

Pith reviewed 2026-06-29 09:50 UTC · model grok-4.3

classification 💻 cs.SD physics.med-ph
keywords infant cry · accelerometer · acoustic analysis · fundamental frequency · jitter · validation study · neurodevelopment · vocal function

The pith

A chest-surface accelerometer extracts fundamental frequency and jitter from infant cries in close agreement with microphone signals: excellent for F0 and good-to-excellent for jitter.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests whether vibrations captured by a chest-mounted accelerometer can substitute for microphone recordings when measuring vocal properties in infant cries. It records both devices simultaneously from 85 infants during vaccinations and extracts seven acoustic measures, finding strong statistical agreement on fundamental frequency and jitter but lower agreement and bias on shimmer and harmonics-to-noise ratio. Infant cry acoustics are viewed as potential early indicators of neurodevelopment, yet microphones suffer from environmental noise and privacy problems in real clinics. Establishing the accelerometer's reliability on key measures would allow quieter, more private data collection for larger developmental studies.

Core claim

Chest-surface accelerometers can reliably capture several clinically relevant acoustic features of infant cry, particularly temporal measures of F0 and jitter, as shown by intraclass correlation coefficients exceeding 0.94 for F0 and good-to-excellent values for jitter when compared to simultaneous microphone recordings; shimmer and HNR display systematic differences attributable to signal transmission and noise sensitivity.

What carries the argument

Intraclass correlation coefficients comparing acoustic vocal function measures (F0, jitter, shimmer, CPP, HNR) extracted from simultaneous chest-surface accelerometer and microphone signals in 85 infants.
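The agreement statistic above can be sketched in a few lines. This is a minimal illustration, not the authors' analysis code: it assumes the single-measurement, two-way ICC forms from McGraw and Wong (absolute agreement, ICC(A,1), and consistency, ICC(C,1)), and the function name `icc_single` and all data are hypothetical and synthetic.

```python
import numpy as np

def icc_single(ratings):
    """Return (ICC(A,1), ICC(C,1)) for an n_subjects x k_methods array.

    Two-way model, single measurement. A = absolute agreement,
    C = consistency (systematic offsets between methods are ignored).
    """
    y = np.asarray(ratings, dtype=float)
    n, k = y.shape
    grand = y.mean()
    row_means = y.mean(axis=1)  # per-subject means
    col_means = y.mean(axis=0)  # per-method means

    ss_rows = k * np.sum((row_means - grand) ** 2)
    ss_cols = n * np.sum((col_means - grand) ** 2)
    ss_err = np.sum((y - grand) ** 2) - ss_rows - ss_cols

    msr = ss_rows / (n - 1)              # between-subject mean square
    msc = ss_cols / (k - 1)              # between-method mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square

    icc_c = (msr - mse) / (msr + (k - 1) * mse)
    icc_a = (msr - mse) / (msr + (k - 1) * mse + (k / n) * (msc - mse))
    return icc_a, icc_c

# Synthetic values for illustration only (not data from the paper)
rng = np.random.default_rng(0)
mic = rng.normal(400.0, 50.0, size=85)            # hypothetical MIC-derived values
acc = mic + 20.0 + rng.normal(0.0, 5.0, size=85)  # ACC with a constant offset plus noise
icc_a, icc_c = icc_single(np.column_stack([mic, acc]))
```

In this synthetic case the constant offset leaves the consistency form close to 1 while lowering the absolute-agreement form, which is one way a measure can show good consistency alongside systematic bias.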

If this is right

  • Accelerometer recordings can be used in noisy clinical environments where microphones are impractical.
  • The method supports privacy-preserving collection of infant vocal data.
  • Temporal measures such as F0 and jitter become feasible targets for scalable developmental research.
  • The approach can be applied during routine visits such as vaccinations, as it was in this study.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Recordings could move from clinic-only to home settings for longitudinal tracking.
  • The same sensor might be tested against actual neurodevelopmental diagnoses to strengthen clinical claims.
  • Differences in shimmer and HNR suggest the accelerometer may be less sensitive to certain voice quality aspects that microphones capture.

Load-bearing premise

Statistical agreement between the two recording methods on selected measures is sufficient to establish clinical validity of the accelerometer approach.

What would settle it

A direct comparison showing that accelerometer-derived F0 and jitter values fail to predict the same developmental or diagnostic outcomes as microphone-derived values in the same infants.

Figures

Figures reproduced from arXiv: 2605.28687 by Carol L. Wilkinson, Daryush D. Mehta, Lisa Yankowitz, Saketh Sundar, Winko W. An.

Figure 1. (a) Illustration of sensor attachment and equipment used in the experiment. (b) Close-up photo of the accelerometer placement on an infant.
Figure 2. (a) Raw microphone (MIC; blue) and accelerometer (ACC; orange) waveforms from a single recording containing cry and non-cry segments. The …
Figure 3. Graphical agreement in F0 between microphone (MIC) and accelerometer (ACC) signals from each 50-ms segment. (a) A 2-D histogram illustrating …
Figure 4. Two-dimensional histograms visualizing agreement in vocal function measures between microphone (MIC) and accelerometer (ACC) signals from each …
Figure 5. Histograms showing distribution of vocal function values extracted in microphone (MIC) and accelerometer (ACC) signals from each 50-ms segment …
Original abstract

Background: Infant cry acoustics provide a promising window into early neurodevelopment and may serve as scalable biomarkers for neurodevelopmental disorders. However, conventional microphone-based recordings are highly susceptible to environmental noise and raise privacy concerns in real-world clinical settings. Chest-surface accelerometers may offer a robust alternative by capturing vibrations directly from the larynx. Methods: We evaluated the validity of a chest-mounted accelerometer (ACC) for infant cry analysis by comparing acoustic features derived from ACC and simultaneously recorded microphone (MIC) signals during routine vaccination visits. The final sample included 85 infants (41 at 4 months; 44 at 12 months) from a diverse pediatric population. Seven vocal measures were extracted from both modalities, including fundamental frequency (F0), jitter, shimmer, cepstral peak prominence (CPP), and harmonics-to-noise ratio (HNR). Agreement and consistency between modalities were assessed using intraclass correlation coefficients (ICCs). Results: F0 demonstrated excellent agreement between ACC and MIC recordings (ICC > 0.94). Jitter measures also showed good-to-excellent agreement, while CPP demonstrated moderate agreement. Shimmer and HNR showed lower absolute agreement and systematic bias between modalities, reflecting possible differences in signal transmission and noise sensitivity. Conclusion: In summary, chest-surface accelerometers can reliably capture several clinically relevant acoustic features of infant cry, particularly temporal measures of F0 and jitter. This approach offers a noise-robust and privacy-preserving alternative to microphone-based recordings, supporting its potential use in scalable clinical and developmental research applications.
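For readers unfamiliar with the perturbation measures named in the abstract, the sketch below shows one common way they are defined. It assumes the standard "local" definitions (mean absolute cycle-to-cycle difference divided by the mean value), which may differ from the variants and settings the authors used. The helper `local_perturbation` and the cycle values are hypothetical.

```python
import numpy as np

def local_perturbation(values):
    """Mean absolute difference of consecutive values, relative to their mean.

    Applied to cycle durations this gives local jitter; applied to
    per-cycle peak amplitudes it gives local shimmer.
    """
    v = np.asarray(values, dtype=float)
    if v.size < 2:
        raise ValueError("need at least two cycles")
    return np.mean(np.abs(np.diff(v))) / np.mean(v)

# Hypothetical glottal cycles from one short voiced segment
periods_s = np.array([0.0025, 0.0026, 0.0024, 0.0025])  # cycle durations in seconds
peak_amps = np.array([0.80, 0.76, 0.82, 0.79])          # per-cycle peak amplitudes

jitter_local = local_perturbation(periods_s)    # period-to-period variability
shimmer_local = local_perturbation(peak_amps)   # amplitude-to-amplitude variability
mean_f0_hz = 1.0 / periods_s.mean()             # mean F0 from the mean period
```

Jitter depends only on cycle timing, whereas shimmer depends on cycle amplitudes, so the two can respond differently when the signal path changes between sensors.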

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The manuscript reports an empirical validation of chest-surface accelerometers (ACC) versus simultaneous microphone (MIC) recordings for extracting seven acoustic vocal function measures (F0, jitter, shimmer, CPP, HNR and others) from infant cries. In a sample of 85 infants (41 at 4 months, 44 at 12 months), it computes intraclass correlation coefficients and finds excellent agreement for F0 (ICC > 0.94), good-to-excellent agreement for jitter measures, moderate agreement for CPP, and lower absolute agreement with systematic bias for shimmer and HNR. The conclusion is that ACC signals can reliably capture clinically relevant temporal features such as F0 and jitter, offering a noise-robust and privacy-preserving alternative to microphone recordings.

Significance. If the reported ICC agreements hold after full methodological scrutiny, the work supplies concrete evidence that a contact sensor can extract selected acoustic parameters from infant vocalizations with high consistency to the conventional reference. This directly supports the feasibility of scalable, less noise-sensitive data collection in clinical and developmental settings without requiring outcome correlation for the core extraction-validity claim.

minor comments (3)
  1. Abstract: The methods paragraph omits any mention of signal-processing steps, exclusion criteria, or power analysis, even though the results section reports concrete ICC thresholds; this reduces the standalone informativeness of the abstract.
  2. Results: The text notes systematic bias in shimmer and HNR but does not quantify the magnitude of the bias (e.g., mean difference or limits of agreement) alongside the ICC values; adding Bland-Altman statistics would strengthen the interpretation of lower-agreement measures.
  3. Discussion: The claim that the approach supports 'scalable clinical and developmental research applications' is stated without reference to any developmental-outcome data; a brief qualification that the present study addresses only feature extraction (not predictive validity) would prevent overstatement.
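The Bland-Altman statistics requested in minor comment 2 could be computed along the following lines. This is a sketch under stated assumptions: it uses the conventional bias plus or minus 1.96 standard deviations for the limits of agreement, treats observations as independent pairs (repeated segments per infant would need a method that accounts for clustering), and uses made-up numbers and a hypothetical function name.

```python
import numpy as np

def bland_altman(a, b):
    """Return (bias, lower_loa, upper_loa) for paired measurements a and b.

    bias is the mean of a - b; the limits of agreement are
    bias -/+ 1.96 times the sample standard deviation of the differences.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("a and b must be paired (same shape)")
    diff = a - b
    bias = diff.mean()
    sd = diff.std(ddof=1)  # sample standard deviation
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Made-up paired values for illustration (e.g. HNR in dB from each sensor)
acc_vals = np.array([12.0, 15.0, 9.0, 14.0, 11.0])
mic_vals = np.array([10.0, 12.0, 8.0, 11.0, 9.0])
bias, loa_low, loa_high = bland_altman(acc_vals, mic_vals)
```

Reporting the bias and limits alongside each ICC would show how large the systematic difference is in the measure's own units, which the ICC alone does not convey.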

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary of our manuscript and the recommendation for minor revision. No specific major comments were raised.

Circularity Check

0 steps flagged

No significant circularity; purely empirical validation

full rationale

The paper reports a direct empirical comparison of vocal features (F0, jitter, etc.) extracted from simultaneous accelerometer and microphone recordings in 85 infants, quantified via standard ICC metrics. No equations, parameter fits, predictions, or derivations are present that could reduce to inputs by construction. No load-bearing self-citations or uniqueness claims appear in the provided text. The central validity claim rests on observed statistical agreement against an external reference signal, which is independent of the paper's own results.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The paper relies on standard statistical agreement metrics and domain assumptions about signal equivalence without introducing new parameters or entities.

axioms (1)
  • domain assumption: The intraclass correlation coefficient is an appropriate and sufficient metric for validating equivalence of vocal features between accelerometer and microphone modalities.
    Used as the primary evidence of agreement in the results.

pith-pipeline@v0.9.1-grok · 5829 in / 1138 out tokens · 33266 ms · 2026-06-29T09:50:18.602338+00:00 · methodology

