Looking and Listening Inside and Outside: Multimodal Artificial Intelligence Systems for Driver Safety Assessment and Intelligent Vehicle Decision-Making
Pith reviewed 2026-05-16 05:53 UTC · model grok-4.3
The pith
Adding audio signals to visual sensing forms the L-LIO framework that improves driver state assessment and vehicle environment understanding through multimodal fusion.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that expanding the looking-in-looking-out framework with audio signals produces the looking-and-listening inside-and-outside framework, which strengthens driver state assessment and environment understanding via multimodal sensor fusion, as shown in pilot evaluations of speech-based impairment classification, natural-language passenger instructions, and audio disambiguation of external guidance.
What carries the argument
The L-LIO framework, which fuses audio and visual signals collected inside and outside the vehicle to support safety-relevant decisions.
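The paper does not specify a fusion architecture, so the following is only a minimal sketch of one common option, confidence-weighted late fusion, to make the idea concrete. The names `ModalityScore` and `late_fusion`, and the notion of a per-channel reliability weight, are assumptions introduced for illustration and do not come from the manuscript.

```python
from dataclasses import dataclass


@dataclass
class ModalityScore:
    """Output of one sensing channel (hypothetical structure)."""
    prob: float        # estimated probability of a safety-relevant state, in [0, 1]
    confidence: float  # assumed reliability weight for this channel, in [0, 1]


def late_fusion(audio: ModalityScore, visual: ModalityScore) -> float:
    """Confidence-weighted average of the audio and visual probabilities."""
    total = audio.confidence + visual.confidence
    if total == 0:
        # Neither channel is trusted, so fall back to an uninformative value.
        return 0.5
    weighted = audio.prob * audio.confidence + visual.prob * visual.confidence
    return weighted / total


# Toy inputs: audio strongly suggests impairment, vision does not.
fused = late_fusion(ModalityScore(prob=0.9, confidence=0.5),
                    ModalityScore(prob=0.3, confidence=0.5))
```

Late fusion keeps each modality's model independent, which would make it straightforward to down-weight audio when cabin noise is high; an early-fusion or learned-fusion design is equally plausible and is not ruled out by the text.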
If this is right
- Supervised models trained on driver speech audio can classify states such as intoxication.
- Passenger spoken instructions can be collected and aligned to guide vehicle planning systems.
- Audio cues can resolve ambiguities in external agents' gestures and guidance that vision-only systems miss.
- Multimodal fusion of audio and visual data opens new paths for safety interventions in intelligent vehicles.
Where Pith is reading between the lines
- The approach could extend to passenger monitoring and external pedestrian interaction in shared autonomy settings.
- Systems built on L-LIO may require separate privacy-preserving audio processing pipelines to reach widespread use.
- Real-time fusion algorithms would need testing against varying cabin and road noise profiles before deployment.
- The framework suggests a general pattern for adding sound-based channels to other vision-centric vehicle perception tasks.
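Testing against varying noise profiles usually starts by mixing clean speech with recorded noise at controlled signal-to-noise ratios. The sketch below shows one way to do that with plain Python lists; the function names, the synthetic sine "speech", and the Gaussian "noise" are stand-ins chosen for illustration, not data or methods from the paper.

```python
import math
import random


def rms(signal):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def mix_at_snr(speech, noise, snr_db):
    """Scale the noise so the speech-to-noise ratio equals snr_db, then add it."""
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    noise_rms = rms(noise)
    if noise_rms == 0:
        raise ValueError("noise signal is silent")
    # SNR in dB is 20 * log10(rms_speech / rms_noise), solved here for the noise level.
    target_noise_rms = rms(speech) / (10 ** (snr_db / 20))
    gain = target_noise_rms / noise_rms
    return [s + gain * n for s, n in zip(speech, noise)]


# Synthetic stand-ins: a 200 Hz tone sampled at 16 kHz and seeded Gaussian noise.
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 200 * t / 16000) for t in range(1600)]
noise = [rng.gauss(0.0, 1.0) for _ in range(1600)]
noisy = mix_at_snr(speech, noise, snr_db=5.0)
```

Sweeping `snr_db` over a range and re-running a classifier on each mixture would give a simple robustness curve for the noise question raised above.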
Load-bearing premise
Audio signals can supply reliable safety insights in real-world noisy conditions without major interference, privacy conflicts, or loss of performance across different people.
What would settle it
A controlled test showing that audio-based classification of driver impairment or external guidance performs at chance level in typical driving noise levels or across varied speakers.
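Whether a classifier "performs at chance level" can be checked with an exact one-sided binomial test. The sketch below assumes a balanced two-class task (chance = 0.5) and uses a hypothetical helper name, `p_value_above_chance`; the counts in the example are invented and are not results from the paper.

```python
from math import comb


def p_value_above_chance(correct, total, chance=0.5):
    """Exact one-sided binomial p-value: P(X >= correct) if the model guesses at chance."""
    return sum(
        comb(total, k) * chance ** k * (1 - chance) ** (total - k)
        for k in range(correct, total + 1)
    )


# Invented example: 60 correct predictions out of 100 trials.
p = p_value_above_chance(correct=60, total=100)
```

A large p-value means the observed accuracy is compatible with guessing, which is the outcome that would count against the load-bearing premise.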
Original abstract
The looking-in-looking-out (LILO) framework has enabled intelligent vehicle applications that understand both the outside scene and the driver state to improve safety outcomes, with examples in smart airbag deployment, takeover time prediction in autonomous control transitions, and driver attention monitoring. In this research, we propose an augmentation to this framework, making a case for the audio modality as an additional source of information to understand the driver, and in the evolving autonomy landscape, also the passengers and those outside the vehicle. We expand LILO by incorporating audio signals, forming the looking-and-listening inside-and-outside (L-LIO) framework to enhance driver state assessment and environment understanding through multimodal sensor fusion. We evaluate three example cases where audio enhances vehicle safety: supervised learning on driver speech audio to classify potential impairment states (e.g., intoxication), collection and analysis of passenger natural language instructions (e.g., "turn after that red building") to motivate how spoken language can interface with planning systems through audio-aligned instruction data, and limitations of vision-only systems where audio may disambiguate the guidance and gestures of external agents. Datasets include custom-collected in-vehicle and external audio samples in real-world environments. Pilot findings show that audio yields safety-relevant insights, particularly in nuanced or context-rich scenarios where sound is critical to safe decision-making or visual signals alone are insufficient. Challenges include ambient noise interference, privacy considerations, and robustness across human subjects, motivating further work on reliability in dynamic real-world contexts. L-LIO augments driver and scene understanding through multimodal fusion of audio and visual sensing, offering new paths for safety intervention.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes extending the looking-in-looking-out (LILO) framework to a looking-and-listening inside-and-outside (L-LIO) framework by adding audio signals for multimodal fusion. It claims this enhances driver state assessment (e.g., speech impairment classification), passenger natural language instructions for planning, and disambiguation of external agents, supported by qualitative pilot findings on custom in-vehicle and external audio datasets collected in real-world environments.
Significance. If the pilot cases were supported by quantitative metrics and baselines, L-LIO could meaningfully advance multimodal safety systems in intelligent vehicles by addressing vision-only limitations in context-rich scenarios. As presented, the conceptual framing is clear but the absence of verifiable results limits its contribution to the literature on driver monitoring and autonomous decision-making.
major comments (2)
- [Evaluation of example cases] The three example cases in the evaluation section are described only qualitatively (driver speech impairment, passenger instructions such as 'turn after that red building', and audio disambiguation of external agents) with no reported accuracies, F1 scores, success rates, dataset sizes, collection protocols, model architectures, or comparisons to vision-only baselines. This directly undermines the central claim that audio 'yields safety-relevant insights' and 'enhances' assessment.
- [Challenges and future work] The abstract and challenges paragraph mention ambient noise interference, privacy considerations, and robustness across subjects as open issues, yet no experiments, noise-handling methods, or subject-variability tests are provided to assess whether audio can reliably augment safety in dynamic real-world conditions.
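The metrics requested in the first major comment are inexpensive to compute once predictions exist. As a minimal sketch, the code below compares a vision-only baseline with an audio-visual system on the same labels; the label vectors are made-up toy data for illustration only and say nothing about the paper's actual performance.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def binary_f1(y_true, y_pred):
    """F1 score for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives, so precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data only: 1 = impaired, 0 = not impaired.
y_true = [1, 1, 1, 0, 0, 0]
vision_only = [1, 0, 0, 0, 0, 1]
audio_visual = [1, 1, 0, 0, 0, 0]

report = {
    "vision_only": (accuracy(y_true, vision_only), binary_f1(y_true, vision_only)),
    "audio_visual": (accuracy(y_true, audio_visual), binary_f1(y_true, audio_visual)),
}
```

Reporting both systems on an identical held-out set, together with dataset size and collection protocol, is the kind of evidence the referee is asking for.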
minor comments (2)
- [Abstract] The abstract states 'pilot findings show that audio yields safety-relevant insights' without enumerating what those specific findings are or referencing any supporting table or figure.
- [Framework description] Notation for the proposed L-LIO framework is introduced at a high level but without a diagram or pseudocode clarifying the multimodal fusion architecture relative to the original LILO.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript proposing the L-LIO framework. We address each major comment below and outline the revisions we will make to better scope our claims and clarify the preliminary nature of the work.
Point-by-point responses
- Referee: [Evaluation of example cases] The three example cases in the evaluation section are described only qualitatively (driver speech impairment, passenger instructions such as 'turn after that red building', and audio disambiguation of external agents) with no reported accuracies, F1 scores, success rates, dataset sizes, collection protocols, model architectures, or comparisons to vision-only baselines. This directly undermines the central claim that audio 'yields safety-relevant insights' and 'enhances' assessment.
  Authors: We appreciate this observation and agree that the evaluations are qualitative. The manuscript is positioned as a conceptual proposal for extending LILO to L-LIO, using three illustrative pilot cases collected in real-world environments to motivate the framework rather than to deliver a full empirical study with benchmarks. Dataset sizes and collection protocols for the custom in-vehicle and external audio samples are described in the text, but no quantitative metrics or vision-only baselines were computed. We will revise the evaluation section to explicitly frame these as preliminary qualitative examples, tone down claims of enhancement to 'potential' insights, and state that quantitative comparisons are reserved for future work. This will align the presentation with the evidence provided.
  revision: partial
- Referee: [Challenges and future work] The abstract and challenges paragraph mention ambient noise interference, privacy considerations, and robustness across subjects as open issues, yet no experiments, noise-handling methods, or subject-variability tests are provided to assess whether audio can reliably augment safety in dynamic real-world conditions.
  Authors: We agree that the challenges are presented without accompanying experiments or methods. As the manuscript focuses on introducing the multimodal framework and motivating its use via pilots, detailed validation of noise robustness or subject variability falls outside the current scope. We will revise the abstract and challenges paragraph to more clearly label these as open issues for future research and briefly outline example directions, such as adaptive filtering for noise or multi-subject data collection protocols, without claiming any current solutions.
  revision: yes
Circularity Check
No circularity: conceptual framework proposal without derivations or self-referential reductions
Full rationale
The manuscript proposes the L-LIO framework as a multimodal extension of the existing LILO framework by adding audio signals for driver state assessment and scene understanding. No equations, parameters, or quantitative predictions appear anywhere in the text. The three pilot cases (speech impairment classification, passenger instructions, and external agent disambiguation) are described qualitatively with no fitted models, no performance metrics, and no claims that a derived quantity equals an input by construction. LILO is referenced as prior work but is not used to justify any uniqueness theorem or ansatz within this paper; the central claim remains a high-level suggestion for sensor fusion rather than a derived result. The absence of any load-bearing mathematical or predictive step means the derivation chain is empty and the proposal is self-contained.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Audio signals provide safety-relevant information not captured by vision alone in driver and scene understanding tasks.
invented entities (1)
- L-LIO framework: no independent evidence