{"total":12,"items":[{"citing_arxiv_id":"2606.07365","ref_index":2,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"A robust PPG foundation model using multimodal physiological supervision","primary_cat":"cs.LG","submitted_at":"2026-06-05T15:08:50+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"A PPG foundation model pretrained via multimodal ECG/respiratory contrastive sample selection on ICU data improves performance on 14 of 15 downstream tasks including field-like data while using 3x fewer subjects.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.07692","ref_index":11,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"BCG-FM: A Foundation Model for Ambient Cardiac Health Sensing","primary_cat":"cs.LG","submitted_at":"2026-06-05T07:07:13+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"BCG-FM, the first foundation model for ambient BCG, achieves 3.26-year MAE on biological age estimation and discriminates 15 health conditions using frozen embeddings from participant-level contrastive pretraining on the largest raw biosignal corpus reported.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.17765","ref_index":12,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"AURORA: Contextual Orthogonalization for Geometric Representation Learning in Healthcare Foundation Models","primary_cat":"cs.LG","submitted_at":"2026-05-18T02:32:50+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"AURORA is a representation learning framework that uses contextual orthogonalization and relational alignment to create disentangled, geometrically interpretable 
latent spaces in healthcare foundation models.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.17618","ref_index":1,"ref_count":2,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Prediction of Challenging Behaviors Associated with Profound Autism in a Classroom Setting Using Wearable Sensors","primary_cat":"cs.AI","submitted_at":"2026-05-17T19:22:48+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Wearable accelerometry, EDA, and temperature data from 9 students with profound autism, processed with fine-tuned foundation models, enable prediction of challenging behavior episodes up to 10 minutes in advance at AUC-ROC 0.78 in actual classroom sessions.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.09765","ref_index":15,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"WISTERIA: Learning Clinical Representations from Noisy Supervision via Multi-View Consistency in Electronic Health Records","primary_cat":"cs.LG","submitted_at":"2026-05-10T21:25:41+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"WISTERIA learns robust clinical representations from noisy EHR labels by enforcing consistency across multiple weak supervision views plus ontology regularization.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"of the IEEE/CVF International Conference on Computer Vision, pages 4015-4026, 2023. Haotian Liu, Chunyuan Li, Qingyang Wu, and Yong Jae Lee. Visual instruction tuning. Advances in neural information processing systems, 36, 2024a. Yanrong Ji, Zhihan Zhou, Han Liu, and Ramana V Davuluri. 
Dnabert: pre-trained bidirectional encoder representations from transformers model for dna-language in genome. Bioinformatics, 37(15):2112-2120, 2021. Nguyen Quoc Khanh Le, Quang-Thai Ho, Trinh-Trung-Duong Nguyen, and Yu-Yen Ou. A transformer architecture based on bert and 2d convolutional neural network to identify dna enhancers from sequence information. Briefings in bioinformatics, 22(5):bbab005, 2021. Adi Lin, Bin Xie, Cheng Ye, Cheng Wang, Duoyuan Chen, Ercheng Wang, Fanfeng Lu, Guirong"},{"citing_arxiv_id":"2605.09173","ref_index":10,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"WavesFM: Hierarchical Representation Learning for Longitudinal Wearable Sensor Waveforms","primary_cat":"cs.LG","submitted_at":"2026-05-09T21:22:02+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"WavesFM uses hierarchical SSL to pretrain a segment encoder on short waveforms followed by a temporal encoder on multi-day sequences, outperforming prior methods on 58 tasks after training on over 12 million hours of data from hundreds of thousands of people.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"contrast [11], and cross-model supervision from paired ECG [13]. Beyond contrastive objectives, alternative approaches have explored generative autoregressive training [14], masked reconstruction [15], and multimodal masking [16]. In the domain of accelerometry modeling, approaches employ pretext task prediction [31], relative contrastive learning [32], and knowledge distillation [10]. To our knowledge, no existing foundation model leverages joint modeling of PPG and ACC waveforms. 
More crucially, while these segment-level approaches capture fine-grained morphological detail, longitudinal structure is recovered only post-hoc via simple aggregation (e.g., mean-pooling [9, 33]), which collapses the multi-day variations that are themselves clinically informative."},{"citing_arxiv_id":"2605.08685","ref_index":15,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Event Fields: Learning Latent Event Structure for Waveform Foundation Models","primary_cat":"cs.LG","submitted_at":"2026-05-09T04:49:56+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Event-centric waveform foundation models are learned via self-supervised consistency on latent event structures and interactions, yielding improved performance and label efficiency over sequence-based baselines on physiological tasks.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"Simon A Lee, Anthony Wu, and Jeffrey N Chiang. Clinical modernbert: An efficient and long context encoder for biomedical text. arXiv preprint arXiv:2504.03964, April 2025. 15 Yanrong Ji, Zhihan Zhou, Han Liu, and Ramana V Davuluri. Dnabert: pre-trained bidirectional encoder representations from transformers model for dna-language in genome. Bioinformatics, 37(15):2112-2120, 2021. Nguyen Quoc Khanh Le, Quang-Thai Ho, Trinh-Trung-Duong Nguyen, and Yu-Yen Ou. A transformer architecture based on bert and 2d convolutional neural network to identify dna enhancers from sequence information. Briefings in bioinformatics, 22(5):bbab005, 2021. 
Mingqian Ma, Guoqing Liu, Chuan Cao, Pan Deng, Tri Dao, Albert Gu, Peiran Jin, Zhao Yang,"},{"citing_arxiv_id":"2605.07407","ref_index":49,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Emergent Symbolic Structure in Health Foundation Models: Extraction, Alignment, and Cross-Modal Transfer","primary_cat":"cs.LG","submitted_at":"2026-05-08T08:03:14+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Health foundation model embeddings contain an interpretable symbolic organization shared across modalities that supports cross-domain transfer without joint training.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.04791","ref_index":2,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"OpenWatch: A Multimodal Benchmark for Hand Gesture Recognition on Smartwatches","primary_cat":"cs.HC","submitted_at":"2026-05-06T11:41:31+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"OpenWatch provides the first open multimodal smartwatch gesture dataset and benchmark, with MixToken and NormWear-Lora methods reaching 90% F1-score using 223k parameters versus 66% for 136M-parameter foundation models.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.18058","ref_index":30,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Sonata: A Hybrid World Model for Inertial Kinematics under Clinical Data Scarcity","primary_cat":"cs.LG","submitted_at":"2026-04-20T10:26:54+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Sonata is a small hybrid world model pre-trained to predict future IMU states that outperforms autoregressive baselines on clinical discrimination, 
fall-risk prediction, and cross-cohort transfer while fitting on-device wearables.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"sorLM [26] bridges sensor data and natural language, while AccelFM [27] distils PPG representations into an accelerometer encoder predictive of 46 health targets. Raw-IMU encoders represent a more proximate baseline: PRiMuS [28] leverages EgoExo4D with multimodal self-supervision; UniMTS [29] aligns heterogeneous motion series with natural language; and HiMAE [30] applies hierarchical masked autoencoding, a reconstruction objective that JEPA-style latent prediction directly supersedes. Three structural limitations persist across all these works: (i) all rely on single-axis accelerometry, never the full 6-axis IMU signal that captures rotational kinematics; (ii) pretraining corpora are predominantly proprietary, ob-"},{"citing_arxiv_id":"2604.10172","ref_index":2,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Wearable AI in the Era of Large Sensor Models","primary_cat":"eess.SP","submitted_at":"2026-04-11T11:41:11+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Large Sensor Models trained on large-scale multimodal wearable data can provide a scalable, general framework for wearable AI by learning transferable representations across modalities and tasks.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"and position sensors, each characterized by heterogeneous temporal dynamics, variable sampling rates, and distinct noise profiles. The central challenge is to learn representations that remain faithful to underlying physical properties while being robust to distribution shifts and measurement imperfections caused by diverse modalities, users, device types, and sensor placements. (2) Modeling Paradigms (Section 4 and 5). 
We present two development directions for LSMs: (i) LSMs without language capability focus on scaling and systematizing pre-trained sensor models through increased model capacity, expanded multimodal data, and advanced modeling strategies. (ii) LSMs with language capability integrate large language models (LLMs) into the sensor modeling pipeline,"},{"citing_arxiv_id":"2604.04175","ref_index":72,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Uncertainty-Aware Foundation Models for Clinical Data","primary_cat":"cs.LG","submitted_at":"2026-04-05T16:44:13+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"The work introduces uncertainty-aware foundation models for clinical data by learning set-valued patient representations that enforce consistency across partial observations and integrate multimodal self-supervised objectives.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null}],"limit":50,"offset":0}