Pith · machine review for the scientific record

arxiv: 2310.01728 · v2 · submitted 2023-10-03 · 💻 cs.LG · cs.AI

Recognition: 2 theorem links · Lean Theorem

Time-LLM: Time Series Forecasting by Reprogramming Large Language Models


Pith reviewed 2026-05-16 15:59 UTC · model grok-4.3

classification 💻 cs.LG cs.AI
keywords time series forecasting · large language models · reprogramming · few-shot learning · zero-shot learning · multimodal alignment · prompt engineering

The pith

Reprogramming time series inputs with text prototypes lets frozen large language models generate accurate forecasts without retraining.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents Time-LLM as a framework that converts time series patches into text prototypes, prepends a guiding prompt, feeds the result into an unchanged LLM, and projects the output patches into forecasts. This setup aims to transfer the sequence reasoning already present in the LLM to time series tasks. A reader would care because it offers one model backbone for many forecasting problems instead of building separate specialized networks for each domain or data regime. Evaluations across benchmarks show the approach beats existing dedicated forecasting models and maintains strong results when only a few or zero training examples are available.
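
To make the pipeline concrete, here is a minimal sketch of the forward pass described above: embed the patches, cross-attend them onto learned text prototypes, prepend the prompt embeddings, run the frozen LLM, and project back to a forecast. The module names, dimensions, HuggingFace-style backbone call, and final pooling are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn as nn

class TimeLLMForwardSketch(nn.Module):
    """Minimal sketch of the described pipeline; names, sizes, and the final pooling
    are illustrative assumptions, not the paper's implementation."""

    def __init__(self, frozen_llm, d_llm=768, patch_len=16, n_prototypes=100, horizon=96):
        super().__init__()
        self.llm = frozen_llm
        for p in self.llm.parameters():          # backbone is kept unchanged
            p.requires_grad = False
        self.patch_embed = nn.Linear(patch_len, d_llm)                    # trainable
        # Learned text prototypes (the paper derives these from the LLM's word
        # embeddings; random initialization here is a simplification).
        self.prototypes = nn.Parameter(torch.randn(n_prototypes, d_llm))
        self.reprogram = nn.MultiheadAttention(d_llm, num_heads=8, batch_first=True)
        self.project = nn.Linear(d_llm, horizon)                          # trainable output head

    def forward(self, patches, prompt_embeds):
        # patches: (batch, n_patches, patch_len); prompt_embeds: embedded Prompt-as-Prefix text
        q = self.patch_embed(patches)
        proto = self.prototypes.unsqueeze(0).repeat(q.size(0), 1, 1)
        reprogrammed, _ = self.reprogram(q, proto, proto)    # cross-attend patches onto text prototypes
        x = torch.cat([prompt_embeds, reprogrammed], dim=1)  # prepend the guiding prompt
        h = self.llm(inputs_embeds=x).last_hidden_state      # frozen LLM; HuggingFace-style call assumed
        out = h[:, -reprogrammed.size(1):, :]                # keep the transformed patch positions
        return self.project(out.mean(dim=1))                 # pool and project to the forecast horizon
```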

Core claim

By first mapping time series values to text prototypes and then applying Prompt-as-Prefix enrichment, a frozen large language model can be directly repurposed to produce time series forecasts that surpass current specialized models in both standard and low-data settings.

What carries the argument

The reprogramming step that replaces raw time series patches with learned text prototypes before they enter the frozen LLM, combined with Prompt-as-Prefix to steer the model's patch-level transformations.
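
Prompt-as-Prefix is a natural-language preamble rather than a learned module. Below is a hedged sketch of how such a prefix might be assembled; the fields (dataset context, task instruction, input statistics) follow the paper's general description, but the exact wording and choice of statistics are assumptions.

```python
import numpy as np

def build_prefix_prompt(series, dataset_desc, horizon):
    """Assemble a Prompt-as-Prefix string. The fields (dataset context, task
    instruction, input statistics) follow the paper's general description, but the
    exact wording and choice of statistics here are assumptions."""
    trend = "upward" if series[-1] > series[0] else "downward"
    stats = (
        f"min {series.min():.3f}, max {series.max():.3f}, "
        f"median {np.median(series):.3f}, overall {trend} trend"
    )
    return (
        f"Dataset: {dataset_desc}. "
        f"Task: forecast the next {horizon} steps given the previous {len(series)} steps. "
        f"Input statistics: {stats}."
    )

# Usage: tokenize the string with the frozen LLM's tokenizer and prepend its embeddings
# to the reprogrammed patch embeddings before the forward pass.
prompt = build_prefix_prompt(np.sin(np.linspace(0, 12.5, 512)), "hourly electricity load (example)", 96)
```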

If this is right

  • Forecasting tasks no longer require training a new architecture from scratch for each application.
  • The same frozen LLM can serve both natural language and time series inputs after the reprogramming step.
  • Few-shot and zero-shot performance improves because the LLM already encodes broad sequence patterns from its original training.
  • Model updates can focus only on the lightweight reprogramming and projection layers rather than the full backbone (see the parameter-freezing sketch after this list).
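
The last point is easy to make concrete: with the backbone frozen, the optimizer only ever sees the adapter parameters. The sketch below reuses the module names from the earlier forward-pass sketch; they are assumptions, not the paper's.

```python
import torch

# Module names follow the forward-pass sketch above; they are assumptions, not the paper's.
ADAPTER_PREFIXES = ("patch_embed", "prototypes", "reprogram", "project")

def adapter_parameters(model):
    """Freeze everything except the lightweight reprogramming and projection layers,
    and return only the parameters the optimizer should update."""
    trainable = []
    for name, p in model.named_parameters():
        if name.startswith(ADAPTER_PREFIXES):
            p.requires_grad = True
            trainable.append(p)
        else:
            p.requires_grad = False
    return trainable

# usage sketch:
# optimizer = torch.optim.Adam(adapter_parameters(model), lr=1e-3)
```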

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same text-prototype alignment might extend to other ordered data such as audio waveforms or sensor streams without changing the LLM.
  • Freezing the backbone could lower the compute needed to adapt forecasting systems to new domains.
  • Different choices of text prototypes could be tested to see which best retain fine-grained timing information.

Load-bearing premise

The conversion of time series patches into text prototypes plus the added prompt preserves enough temporal structure for the LLM's existing reasoning to apply without major distortion or artifacts.

What would settle it

A decisive test: on a benchmark with strongly periodic or trending series, compare the reprogrammed LLM against a simple linear-trend baseline. If the LLM's error pattern merely tracks the trend model instead of capturing the periodicity, the alignment premise fails; if it clearly captures the periodic structure and beats the baseline, the transfer claim gains direct support.
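
One way to run that check is to score the reprogrammed model against a deliberately weak linear-trend extrapolation on a strongly periodic series. The harness below is illustrative, not the paper's evaluation protocol; the LLM forecast is assumed to come from the reprogrammed model.

```python
import numpy as np

def linear_trend_forecast(history, horizon):
    """Least-squares linear trend extrapolation: a deliberately weak baseline that
    ignores periodicity."""
    t = np.arange(len(history))
    slope, intercept = np.polyfit(t, history, deg=1)
    future_t = np.arange(len(history), len(history) + horizon)
    return slope * future_t + intercept

def settling_test(history, actual, llm_forecast):
    """Returns True if the reprogrammed model beats the trend-only baseline on a
    strongly periodic series. Illustrative harness, not the paper's evaluation code."""
    actual = np.asarray(actual, dtype=float)
    baseline = linear_trend_forecast(np.asarray(history, dtype=float), len(actual))
    mse = lambda pred: float(np.mean((np.asarray(pred, dtype=float) - actual) ** 2))
    return mse(llm_forecast) < mse(baseline)
```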

Original abstract

Time series forecasting holds significant importance in many real-world dynamic systems and has been extensively studied. Unlike natural language process (NLP) and computer vision (CV), where a single large model can tackle multiple tasks, models for time series forecasting are often specialized, necessitating distinct designs for different tasks and applications. While pre-trained foundation models have made impressive strides in NLP and CV, their development in time series domains has been constrained by data sparsity. Recent studies have revealed that large language models (LLMs) possess robust pattern recognition and reasoning abilities over complex sequences of tokens. However, the challenge remains in effectively aligning the modalities of time series data and natural language to leverage these capabilities. In this work, we present Time-LLM, a reprogramming framework to repurpose LLMs for general time series forecasting with the backbone language models kept intact. We begin by reprogramming the input time series with text prototypes before feeding it into the frozen LLM to align the two modalities. To augment the LLM's ability to reason with time series data, we propose Prompt-as-Prefix (PaP), which enriches the input context and directs the transformation of reprogrammed input patches. The transformed time series patches from the LLM are finally projected to obtain the forecasts. Our comprehensive evaluations demonstrate that Time-LLM is a powerful time series learner that outperforms state-of-the-art, specialized forecasting models. Moreover, Time-LLM excels in both few-shot and zero-shot learning scenarios.

Editorial analysis

A structured set of objections, weighed in public.

A referee report, a simulated authors' rebuttal, a circularity audit, and an axiom and free-parameter ledger. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes Time-LLM, a reprogramming framework that repurposes frozen large language models for general time series forecasting. Time series inputs are aligned to the LLM via text prototypes, augmented with Prompt-as-Prefix (PaP) to direct reasoning over patches, and projected to forecasts; the LLM backbone remains frozen. The central claim is that this yields a powerful time series learner that outperforms specialized SOTA models on standard benchmarks while also excelling in few-shot and zero-shot settings.

Significance. If the results hold, the work provides evidence that pre-trained LLMs can be effectively aligned to time series without parameter updates, offering a path toward unified foundation models for forecasting and reducing reliance on task-specific architectures. The reported few-shot and zero-shot performance would be a notable strength if substantiated.

major comments (2)
  1. [Experiments] Experiments section: the paper benchmarks Time-LLM against specialized forecasting models but omits the critical control of replacing the frozen LLM with a randomly initialized model of identical architecture and size (keeping reprogramming layers, PaP, and output projection trainable and identical). Without this ablation, it is impossible to isolate whether gains derive from the LLM's pre-trained reasoning or from the learned input/output mappings alone, directly weakening support for the modality-alignment and transfer claim.
  2. [Section 3] Section 3 (Methodology): the description of text-prototype reprogramming and PaP does not provide quantitative verification that temporal structure is preserved without introducing artifacts; the assumption that this alignment successfully transfers LLM capabilities is load-bearing for the central claim yet rests on indirect evidence from end-to-end performance.
minor comments (2)
  1. [Abstract] Abstract: 'natural language process (NLP)' should read 'natural language processing (NLP)'.
  2. [Figures] Ensure all figures (e.g., the framework diagram) include explicit labels for the reprogramming, PaP, and projection stages to improve readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. The suggested ablation and additional verification will help strengthen the evidence for our claims on modality alignment and the role of pre-trained LLMs. We address each major comment below and will incorporate the revisions in the next version of the manuscript.

Point-by-point responses
  1. Referee: [Experiments] Experiments section: the paper benchmarks Time-LLM against specialized forecasting models but omits the critical control of replacing the frozen LLM with a randomly initialized model of identical architecture and size (keeping reprogramming layers, PaP, and output projection trainable and identical). Without this ablation, it is impossible to isolate whether gains derive from the LLM's pre-trained reasoning or from the learned input/output mappings alone, directly weakening support for the modality-alignment and transfer claim.

    Authors: We agree this ablation is important to isolate the contribution of the pre-trained LLM. In the revised manuscript, we will add results for a randomly initialized LLM backbone of identical architecture and size, with all reprogramming layers, PaP, and the output projection kept trainable and identical. This control will clarify whether performance gains arise primarily from the pre-trained weights or from the learned input/output components alone (a construction sketch for this control follows these responses). revision: yes

  2. Referee: [Section 3] Section 3 (Methodology): the description of text-prototype reprogramming and PaP does not provide quantitative verification that temporal structure is preserved without introducing artifacts; the assumption that this alignment successfully transfers LLM capabilities is load-bearing for the central claim yet rests on indirect evidence from end-to-end performance.

    Authors: We acknowledge that direct quantitative checks would provide stronger support. In the revision, we will augment Section 3 with quantitative verification, including similarity metrics (e.g., cosine similarity and Pearson correlation) between original time-series patches and their reprogrammed representations, as well as reconstruction error analysis to confirm that temporal structure is preserved with minimal artifacts. We will also add a brief discussion of how the text-prototype design and PaP maintain sequential dependencies (an illustrative similarity-check sketch also follows these responses). revision: yes
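
Response 1 promises a randomly initialized backbone of identical architecture and size. A minimal way to construct that control, assuming a HuggingFace GPT-2 backbone purely for illustration (the actual backbone may differ), is:

```python
from transformers import GPT2Config, GPT2Model

def build_backbones():
    """Pre-trained vs. randomly initialized backbones of identical architecture and
    size. GPT-2 stands in for whichever LLM is actually used; the choice is an
    assumption for illustration only."""
    pretrained = GPT2Model.from_pretrained("gpt2")
    random_init = GPT2Model(GPT2Config.from_pretrained("gpt2"))  # same config, fresh weights
    for model in (pretrained, random_init):
        for p in model.parameters():
            p.requires_grad = False   # both stay frozen; only the adapters train
    return pretrained, random_init
```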
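
Response 2 promises quantitative verification that temporal structure survives reprogramming. One illustrative way to operationalize it, not the authors' exact metric, is to compare the pairwise similarity structure of raw patches with that of their reprogrammed embeddings:

```python
import numpy as np
from scipy.stats import pearsonr

def structure_preservation_score(patches, reprogrammed):
    """Compare the pairwise cosine-similarity structure of raw patches with that of
    their reprogrammed embeddings. A high Pearson correlation between the two
    similarity matrices suggests temporal relationships survive the mapping.
    Illustrative metric, not the authors' exact verification."""
    def cosine_matrix(x):
        x = np.asarray(x, dtype=float)
        x = x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-8)
        return x @ x.T
    s_raw, s_rep = cosine_matrix(patches), cosine_matrix(reprogrammed)
    iu = np.triu_indices(len(s_raw), k=1)   # off-diagonal entries only
    r, _ = pearsonr(s_raw[iu], s_rep[iu])
    return r
```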

Circularity Check

0 steps flagged

Empirical reprogramming framework with no circular derivation

full rationale

The paper describes a practical reprogramming method (text prototypes + Prompt-as-Prefix) whose performance claims rest on external benchmark comparisons rather than any closed-form derivation or fitted parameter renamed as prediction. No equations reduce reported forecasting accuracy to quantities defined solely inside the paper; the central transfer assumption is tested via few-shot/zero-shot results against specialized models. Self-citations, if present, are not load-bearing for the core result.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that LLMs already possess transferable sequence reasoning and that modality alignment via prototypes is feasible; no new physical entities or free parameters are introduced in the abstract description.

axioms (1)
  • domain assumption: Large language models possess robust pattern recognition and reasoning abilities over complex sequences of tokens.
    Invoked in the abstract as the foundation for repurposing LLMs to time series.

pith-pipeline@v0.9.0 · 5595 in / 1176 out tokens · 44481 ms · 2026-05-16T15:59:06.403511+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 18 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. What if Tomorrow is the World Cup Final? Counterfactual Time Series Forecasting with Textual Conditions

    cs.LG 2026-05 unverdicted novelty 7.0

    Introduces the task of counterfactual time series forecasting with textual conditions plus a text-attribution mechanism that improves accuracy by distinguishing mutable from immutable factors.

  2. LLaTiSA: Towards Difficulty-Stratified Time Series Reasoning from Visual Perception to Semantics

    cs.AI 2026-04 unverdicted novelty 7.0

    LLaTiSA is a vision-language model trained on a new 83k-sample hierarchical time series reasoning dataset that shows superior performance and out-of-distribution generalization on stratified TSR tasks.

  3. TimeSeriesExamAgent: Creating Time Series Reasoning Benchmarks at Scale

    cs.AI 2026-04 conditional novelty 7.0

    TimeSeriesExamAgent combines templates and LLM agents to generate scalable time series reasoning benchmarks, demonstrating that current LLMs have limited performance on both abstract and domain-specific tasks.

  4. Discrete Prototypical Memories for Federated Time Series Foundation Models

    cs.LG 2026-04 unverdicted novelty 7.0

    FeDPM learns and aligns local discrete prototypical memories across domains to create a unified discrete latent space for LLM-based time series foundation models in a federated setting.

  5. Overcoming the Modality Gap in Context-Aided Forecasting

    cs.LG 2026-03 unverdicted novelty 7.0

    A semi-synthetic augmentation creates the CAF-7M dataset and demonstrates that improved context data enables multimodal models to outperform unimodal baselines in context-aided forecasting.

  6. TS-Haystack: A Multi-Task Retrieval Benchmark for Long-Context Time-Series Reasoning

    cs.LG 2026-02 unverdicted novelty 7.0

    TS-Haystack benchmark shows time-series language models degrade sharply on long contexts while an agentic retrieval system using classifier tools matches or beats them on 9 of 10 tasks.

  7. Is Flow Matching Just Trajectory Replay for Sequential Data?

    stat.ML 2026-02 unverdicted novelty 7.0

    Flow matching on time series targets a closed-form nonparametric velocity field that is a similarity-weighted mixture of observed transition velocities, making neural models approximations to an ideal memory-augmented...

  8. Agent-Based Post-Hoc Correction of Agricultural Yield Forecasts

    cs.LG 2026-05 unverdicted novelty 6.0

    Structured LLM agents correct agricultural yield forecasts from models like XGBoost, cutting MAE by 20-28% and MASE by up to 66% on strawberry and corn datasets.

  9. GazeMind: A Gaze-Guided LLM Agent for Personalized Cognitive Load Assessment

    cs.HC 2026-05 unverdicted novelty 6.0

    GazeMind encodes gaze data for LLM reasoning to deliver interpretable, personalized cognitive load predictions that generalize across tasks without fine-tuning and outperform baselines by over 20% on a new 152-person dataset.

  10. Exploring the Potential of Probabilistic Transformer for Time Series Modeling: A Report on the ST-PT Framework

    cs.LG 2026-04 unverdicted novelty 6.0

    ST-PT turns transformers into explicit factor graphs for time series, enabling structural injection of symbolic priors, per-sample conditional generation, and principled latent autoregressive forecasting via MFVI iterations.

  11. CAARL: In-Context Learning for Interpretable Co-Evolving Time Series Forecasting

    cs.LG 2026-04 unverdicted novelty 6.0

    CAARL decomposes co-evolving time series into autoregressive segments, builds a temporal dependency graph, serializes it into a narrative, and uses LLMs for interpretable forecasting via chain-of-thought reasoning.

  12. Semantic Communication with an LLM-enabled Knowledge Base

    eess.SP 2026-04 unverdicted novelty 6.0

    SC-LMKB uses LLM-generated data with cross-domain fusion to cut hallucinations and delivers up to 72.6% gains on cross-modality retrieval tasks over standard semantic communication.

  13. Uncertainty-Aware Foundation Models for Clinical Data

    cs.LG 2026-04 unverdicted novelty 6.0

    The work introduces uncertainty-aware foundation models for clinical data by learning set-valued patient representations that enforce consistency across partial observations and integrate multimodal self-supervised ob...

  14. Probabilistic NDVI Forecasting from Sparse Satellite Time Series and Weather Covariates

    cs.LG 2026-02 conditional novelty 6.0

    A neural architecture with a horizon-weighted quantile loss forecasts field-level NDVI from irregular satellite observations and weather covariates, outperforming baselines on European data.

  15. Heterogeneous Scientific Foundation Model Collaboration

    cs.AI 2026-04 unverdicted novelty 5.0

    Eywa enables language-based agentic AI systems to collaborate with specialized scientific foundation models for improved performance on structured data tasks.

  16. Frozen LLMs as Map-Aware Spatio-Temporal Reasoners for Vehicle Trajectory Prediction

    cs.CV 2026-04 unverdicted novelty 5.0

    A framework encodes observed trajectories and HD maps into tokens for frozen LLMs to perform spatio-temporal reasoning and predict future vehicle paths with a linear decoder.

  17. MSTN: A Lightweight and Fast Model for General TimeSeries Analysis

    cs.LG 2025-11 unverdicted novelty 4.0

    MSTN is a lightweight hybrid model that reports new state-of-the-art results on 33 of 40 time series benchmarks for imputation, forecasting, and classification while using under one million parameters and sub-second i...

  18. A Review of Large Language Models for Stock Price Forecasting from a Hedge-Fund Perspective

    q-fin.PR 2026-04 unverdicted novelty 3.0

    This review synthesizes LLM uses in stock forecasting and catalogs key practical pitfalls from a hedge-fund viewpoint.

Reference graph

Works this paper leans on

112 extracted references · 112 canonical work pages · cited by 18 Pith papers · 4 internal anchors
