
arxiv: 2606.12699 · v1 · pith:JZMERVE7new · submitted 2026-06-10 · 💻 cs.LG · cs.AI

LLM-Powered Personalized Glycemic Assessment in Type 2 Diabetes with Wearable Sensor Data

Pith reviewed 2026-06-27 09:54 UTC · model grok-4.3

classification 💻 cs.LG · cs.AI
keywords LLM · Type 2 Diabetes · Continuous Glucose Monitor · Glycemic Assessment · Wearable Sensors · Glucose Forecasting · Personalized Medicine · AI-READI

The pith

GlyLLM combines continuous glucose monitor readings with personal metadata inside a pre-trained LLM to improve forecasting and classification for type 2 diabetes.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces GlyLLM, a framework that feeds wearable sensor streams and structured health metadata into a large language model so the model can perform sensor-text semantic abstraction at decision time. Traditional machine-learning approaches rely mainly on past glucose values and ignore individual context, which limits accuracy across patients. GlyLLM uses the LLM's pre-trained knowledge plus the supplied metadata to address this gap. On the AI-READI dataset the method lowers root-mean-squared error in glucose forecasting by an average of 13.66 percent and raises AUROC in diabetes categorization by an average of 13.08 percent relative to traditional machine-learning baselines. An ablation study identifies diabetes surveys and biometric tests as the most influential metadata components.

Core claim

GlyLLM achieves sensor-text semantic abstraction at decision time by integrating continuous glucose monitor data with structured metadata inside a pre-trained large language model, yielding lower forecasting RMSE and higher categorization AUROC than traditional machine-learning baselines on the AI-READI dataset.

What carries the argument

GlyLLM, an LLM-powered framework that performs sensor-text semantic abstraction using pre-trained knowledge plus provided metadata at decision time.

If this is right

  • Glucose forecasting error drops by an average of 13.66 percent RMSE compared with standard machine-learning methods.
  • Diabetes categorization improves by an average of 13.08 percent AUROC compared with standard machine-learning methods.
  • Diabetes surveys and biometric test results contribute more to performance than other categories of health metadata.
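
For readers who want to check what the two headline margins measure, the sketch below computes RMSE, a binary AUROC, and a percentage improvement over a baseline. It is an illustration only: the function names and the toy numbers are ours, not the paper's, and the paper does not state its averaging convention or whether categorization is scored as a binary or multi-class problem.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and the same length")
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def auroc(labels, scores):
    """Binary AUROC via the pairwise (Mann-Whitney) identity; ties count 0.5."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def improvement_pct(baseline, model, lower_is_better):
    """Percentage improvement of `model` over `baseline` for one metric."""
    delta = baseline - model if lower_is_better else model - baseline
    return 100.0 * delta / baseline

# Toy values chosen for easy arithmetic; NOT results from the paper.
glucose_true = [110.0, 120.0, 130.0, 140.0]
baseline_pred = [100.0, 130.0, 120.0, 150.0]   # every error is 10 mg/dL
model_pred = [102.0, 128.0, 122.0, 148.0]      # every error is 8 mg/dL

baseline_rmse = rmse(glucose_true, baseline_pred)
model_rmse = rmse(glucose_true, model_pred)
rmse_gain = improvement_pct(baseline_rmse, model_rmse, lower_is_better=True)

toy_auroc = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

Because RMSE is lower-is-better and AUROC is higher-is-better, the sign of the difference has to be flipped between the two metrics before both can be reported as an "improvement".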

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same metadata-plus-LLM pattern could be tested on other wearable-driven chronic-disease tasks such as hypertension or sleep-apnea monitoring.
  • Real-time mobile applications could deliver daily glycemic guidance without collecting large amounts of patient-specific training data.
  • If distribution shift proves problematic, lightweight metadata-only adapters might be added without full model retraining.

Load-bearing premise

The pre-trained LLM can reliably translate sensor readings into useful abstractions from metadata alone without task-specific fine-tuning or performance loss from patient distribution shift.

What would settle it

Performance on a held-out patient cohort drawn from a different demographic or sensor distribution falls back to or below the level of traditional ML methods.
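
A minimal sketch of how such a held-out cohort could be carved out is given below. The record layout and the `site` field are hypothetical stand-ins for whatever demographic or sensor attribute defines the shifted cohort; nothing here reflects the paper's actual split.

```python
def hold_out_by(records, key, held_out_values):
    """Send every record whose `key` is in `held_out_values` to the test side."""
    held = set(held_out_values)
    train = [r for r in records if r[key] not in held]
    test = [r for r in records if r[key] in held]
    return train, test

def shared_patients(train, test):
    """Patient IDs present on both sides; must be empty for a leak-free split."""
    return {r["patient_id"] for r in train} & {r["patient_id"] for r in test}

# Hypothetical records: one row per CGM window, several windows per patient.
records = [
    {"patient_id": "p1", "site": "A", "window": 0},
    {"patient_id": "p1", "site": "A", "window": 1},
    {"patient_id": "p2", "site": "A", "window": 0},
    {"patient_id": "p3", "site": "B", "window": 0},
    {"patient_id": "p3", "site": "B", "window": 1},
]

# Cohort-level hold-out: train on site A, test on the unseen site B.
train, test = hold_out_by(records, "site", {"B"})

# A naive window-level split, shown only to illustrate the leak it causes.
leaky_overlap = shared_patients(records[:1], records[1:])
```

Passing `"patient_id"` as the key gives an ordinary patient-level split with the same function; the cohort-level version is the stricter test this section describes.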

Figures

Figures reproduced from arXiv: 2606.12699 by Yanmin Gong, Yifan Gao, Yuanxiong Guo, Yun Shi.

Figure 1: GlyLLM model architecture. Text embeddings from static metadata and sensor data embeddings, along with text embeddings from task instruction prompts, are structured as sequential inputs for the backbone LLM.
Excerpt from the surrounding text (truncated in the source): "text embedder that encodes X_p and X_q into text token embeddings, a sensor encoder with an adapter to map X_s into sensor data embeddings, a backbone LLM to fuse and analyze all the provided information…"
Figure 2: Examples of text prompts and sensor data used in two tasks.
Figure 3: Prompt templates used for glucose forecasting.
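
The Figure 1 caption describes three embedding streams laid out as one sequence for the backbone LLM. The sketch below shows one way that assembly could look. Everything specific in it is an assumption made for illustration: the patch length, the embedding width, the single linear adapter, the random weights, and the metadata, sensor, instruction ordering are ours, and the paper's sensor encoder is not described in the material available here.

```python
import random

D_MODEL = 8     # hypothetical embedding width; real LLMs use thousands
PATCH_LEN = 4   # hypothetical number of CGM readings per sensor token

def make_patches(series, patch_len):
    """Cut a 1-D series into non-overlapping patches, dropping any remainder."""
    n = len(series) // patch_len
    return [series[i * patch_len:(i + 1) * patch_len] for i in range(n)]

def linear_adapter(patch, weights):
    """Project one patch to a D_MODEL-wide vector (weights: D_MODEL x PATCH_LEN)."""
    return [sum(w * x for w, x in zip(row, patch)) for row in weights]

def build_input_sequence(meta_emb, sensor_series, instr_emb, weights):
    """Concatenate metadata tokens, sensor tokens, and instruction tokens."""
    patches = make_patches(sensor_series, PATCH_LEN)
    sensor_emb = [linear_adapter(p, weights) for p in patches]
    return meta_emb + sensor_emb + instr_emb

rng = random.Random(0)

def random_tokens(n):
    """Stand-in for a text embedder: n random D_MODEL-wide vectors."""
    return [[rng.gauss(0.0, 1.0) for _ in range(D_MODEL)] for _ in range(n)]

weights = [[rng.gauss(0.0, 0.1) for _ in range(PATCH_LEN)] for _ in range(D_MODEL)]
meta_emb = random_tokens(3)     # e.g. embedded survey and biometric text
instr_emb = random_tokens(2)    # e.g. embedded task instruction
cgm = [112, 118, 125, 131, 140, 138, 133, 127, 120, 116]   # toy mg/dL readings

sequence = build_input_sequence(meta_emb, cgm, instr_emb, weights)
```

With ten toy readings and a patch length of four, two sensor tokens are produced and the last two readings are dropped, so the sequence holds seven tokens in total. A real system would need to decide how to handle that remainder.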

Original abstract

Type 2 Diabetes (T2D) poses an increasing global health threat, demanding effective glycemic assessment to support personalized and improved diabetes care. Wearable sensors such as continuous glucose monitors (CGM) and fitness trackers offer many valuable insights for glycemic assessment. However, effectively analyzing these data requires integration with essential individual-level context. Existing methods are often based on traditional machine learning (ML) and rely primarily on historical blood glucose measurements and overlook personalized information, which limits their performance across diverse diabetes populations. Recent advances in large language models (LLMs) have demonstrated their ability to integrate diverse data modalities while modeling sequential dependencies, motivating the exploration of their potential for personalized glycemic assessment. In this paper, we propose GlyLLM, an LLM-powered framework for modeling CGM-based glycemic dynamics through the integration of wearable sensor data and structured metadata. GlyLLM can leverage the extensive prior knowledge of pre-trained LLMs and achieve sensor-text semantic abstraction at decision time. Experiments on two related tasks on the AI-READI dataset demonstrate that our model outperforms traditional ML methods by an average of 13.66% in Root Mean Squared Error (RMSE) for glucose forecasting and 13.08% in Area Under the Receiver Operating Characteristic (AUROC) for diabetes categorization. Additionally, our ablation study shows that diabetes surveys and biometric tests are more critical than other health information for glycemic assessment. Our work presents a promising step toward harnessing the power of LLMs to advance personalized glycemic assessment in T2D care.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper proposes GlyLLM, an LLM-powered framework that integrates CGM wearable sensor data with structured metadata (including diabetes surveys and biometric tests) to model glycemic dynamics for Type 2 diabetes. It reports that GlyLLM outperforms traditional ML baselines by an average of 13.66% RMSE on glucose forecasting and 13.08% AUROC on diabetes categorization tasks using the AI-READI dataset, with an ablation study identifying surveys and biometric tests as the most critical metadata components.

Significance. If the performance margins are reproducible and attributable to the LLM component rather than data-processing choices, the work would represent a meaningful exploration of pre-trained LLMs for multimodal sensor-text integration in personalized glycemic assessment, addressing a gap in traditional ML approaches that overlook individual context.

major comments (3)
  1. [Abstract and §3] Abstract and §3 (Methods): no description is given of CGM time-series tokenization, input formatting for the LLM, use of in-context examples, fine-tuning procedure, or whether the LLM operates zero-shot at inference; without these details the central claim that 'sensor-text semantic abstraction at decision time' drives the reported gains cannot be evaluated.
  2. [§4] §4 (Experiments): the 13.66% RMSE and 13.08% AUROC margins are stated without patient-level train/test splits, baseline implementation details, statistical significance tests, or variance across runs; this prevents verification that improvements are not due to distribution shift or leakage within the AI-READI cohort.
  3. [§4.2] §4.2 (Ablation): the finding that 'diabetes surveys and biometric tests are more critical' is presented without quantitative ablation tables or controls for feature correlation, so it is impossible to assess whether the result is load-bearing for the personalization claim.
minor comments (2)
  1. [Abstract] Abstract: the phrase 'outperforms traditional ML methods by an average of 13.66%' should specify the exact set of baselines and whether the average is macro or weighted.
  2. [§2] Notation: 'sensor-text semantic abstraction' is used without a formal definition or pseudocode showing how metadata is concatenated with CGM sequences.
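
The second major comment asks for significance tests and run-to-run variance. One way such a check could be run is sketched below: a paired sign-flip permutation test on per-patient errors, plus a mean and standard deviation across seeds. The choice of test, the function name, and all numbers are our assumptions; the paper reports none of these.

```python
import random
import statistics

def paired_sign_flip_test(baseline_err, model_err, n_resamples=10000, seed=0):
    """Two-sided sign-flip permutation test on paired per-patient errors.

    Under the null hypothesis the two methods are exchangeable within each
    patient, so the sign of every paired difference is equally likely.
    """
    diffs = [b - m for b, m in zip(baseline_err, model_err)]
    observed = abs(statistics.fmean(diffs))
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_resamples):
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(statistics.fmean(flipped)) >= observed:
            hits += 1
    # Add-one correction keeps the Monte Carlo p-value strictly positive.
    return (hits + 1) / (n_resamples + 1)

# Toy per-patient RMSE values (mg/dL); NOT taken from the paper.
baseline_err = [21.0, 18.5, 25.2, 19.9, 23.4, 20.1, 22.8, 24.0]
model_err = [18.2, 17.0, 21.1, 18.8, 20.0, 19.5, 19.3, 21.6]

p_value = paired_sign_flip_test(baseline_err, model_err)
p_null = paired_sign_flip_test(model_err, model_err)

# Variance across hypothetical training seeds for one method.
rmse_by_seed = [19.4, 19.9, 19.1, 19.6, 19.5]
seed_mean = statistics.fmean(rmse_by_seed)
seed_std = statistics.stdev(rmse_by_seed)
```

Pairing by patient matters here: an unpaired test would mix between-patient variability into the comparison and could hide or exaggerate the difference between methods.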

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback, which highlights important gaps in methodological and experimental transparency. We will revise the manuscript to address each point and improve reproducibility and clarity.

point-by-point responses
  1. Referee: [Abstract and §3] Abstract and §3 (Methods): no description is given of CGM time-series tokenization, input formatting for the LLM, use of in-context examples, fine-tuning procedure, or whether the LLM operates zero-shot at inference; without these details the central claim that 'sensor-text semantic abstraction at decision time' drives the reported gains cannot be evaluated.

    Authors: We agree these details are required to evaluate the LLM's contribution. In the revised §3 we will add a full description of the CGM tokenization process (including how time-series values are discretized and embedded), the precise prompt/input formatting for the LLM, whether in-context examples were used, the fine-tuning procedure (or confirmation of zero-shot operation), and the inference setting. This will directly support the sensor-text abstraction claim.
    revision: yes

  2. Referee: [§4] §4 (Experiments): the 13.66% RMSE and 13.08% AUROC margins are stated without patient-level train/test splits, baseline implementation details, statistical significance tests, or variance across runs; this prevents verification that improvements are not due to distribution shift or leakage within the AI-READI cohort.

    Authors: We acknowledge that these experimental controls are necessary to rule out leakage and confirm robustness. We will update §4 to explicitly state patient-level train/test splits, provide complete baseline implementation details (hyperparameters, libraries, preprocessing), report statistical significance (e.g., paired tests with p-values), and include variance or standard deviation across multiple random seeds/runs.
    revision: yes

  3. Referee: [§4.2] §4.2 (Ablation): the finding that 'diabetes surveys and biometric tests are more critical' is presented without quantitative ablation tables or controls for feature correlation, so it is impossible to assess whether the result is load-bearing for the personalization claim.

    Authors: We agree that quantitative tables and correlation controls are needed. We will expand §4.2 with full ablation tables showing performance changes when each metadata type is removed, plus an analysis of feature correlations (e.g., correlation matrix or controlled ablations) to demonstrate that the identified components remain critical after accounting for inter-feature dependencies.
    revision: yes
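
The ablation table promised in the third response could be produced by a leave-one-group-out loop of the kind sketched here. The group names, the `toy_evaluate` stand-in, and its numbers are hypothetical; a real run would retrain or re-prompt the model for each configuration and report measured RMSE or AUROC.

```python
def leave_one_group_out(groups, evaluate):
    """Metric change (ablated minus full) when each metadata group is removed."""
    all_groups = frozenset(groups)
    full = evaluate(all_groups)
    return {g: evaluate(all_groups - {g}) - full for g in groups}

# Hypothetical stand-in for a full train-and-evaluate run. It returns a toy
# RMSE that shrinks by a fixed amount per included group; NOT paper results.
TOY_GAIN = {"diabetes_survey": 3.0, "biometric_tests": 2.0, "demographics": 0.5}

def toy_evaluate(included):
    return 20.0 - sum(TOY_GAIN[g] for g in included)

deltas = leave_one_group_out(list(TOY_GAIN), toy_evaluate)
ranking = sorted(deltas, key=deltas.get, reverse=True)
```

The toy evaluator is additive, so each group's contribution is independent of the others. Real metadata groups are likely correlated, which is the referee's concern: removing one group may cost little if another carries the same signal, so single-group deltas alone can understate importance.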

Circularity Check

0 steps flagged

No circularity; empirical ML framework with standard evaluation

full rationale

The paper proposes GlyLLM as an LLM integration framework for CGM and metadata, then reports empirical outperformance (13.66% RMSE, 13.08% AUROC) on the AI-READI dataset. No mathematical derivation chain, equations, or first-principles results exist that reduce to inputs by construction. No self-definitional steps, fitted inputs renamed as predictions, or load-bearing self-citations appear in the provided text. The central claims rest on experimental results rather than any closed logical loop, making the work self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the unstated premise that pre-trained LLMs already encode useful priors for glucose dynamics and that metadata can be fused at inference time without additional training details or validation against external cohorts.

axioms (1)
  • domain assumption: Pre-trained LLMs contain transferable knowledge sufficient for sensor-text abstraction in glycemic tasks
    Invoked in the motivation paragraph when the authors state that LLMs can integrate diverse data modalities while modeling sequential dependencies.

pith-pipeline@v0.9.1-grok · 5811 in / 1413 out tokens · 15944 ms · 2026-06-27T09:54:37.524546+00:00 · methodology

