Recognition: 2 theorem links
Time-LLM: Time Series Forecasting by Reprogramming Large Language Models
Pith reviewed 2026-05-16 15:59 UTC · model grok-4.3
The pith
Reprogramming time series inputs with text prototypes lets frozen large language models generate accurate forecasts without retraining.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By first mapping time series values to text prototypes and then applying Prompt-as-Prefix enrichment, a frozen large language model can be directly repurposed to produce time series forecasts that surpass current specialized models in both standard and low-data settings.
What carries the argument
The reprogramming step that replaces raw time series patches with learned text prototypes before they enter the frozen LLM, combined with Prompt-as-Prefix to steer the model's patch-level transformations.
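As a rough illustration of this reprogramming step (not the paper's exact implementation — the shapes, the prototype count, and the single-head attention are illustrative assumptions), the patch-to-prototype mapping can be sketched as:

```python
import numpy as np

rng = np.random.default_rng(0)

d_model = 16        # LLM hidden size (assumed)
n_prototypes = 8    # number of learned text prototypes (assumed)
patch_len = 4       # time-series patch length (assumed)

series = np.sin(np.linspace(0, 6 * np.pi, 32))        # toy input series
patches = series.reshape(-1, patch_len)               # (8, 4) non-overlapping patches

W_embed = rng.normal(size=(patch_len, d_model))       # trainable patch embedding
prototypes = rng.normal(size=(n_prototypes, d_model)) # trainable text prototypes

q = patches @ W_embed                                 # queries, one per patch
scores = q @ prototypes.T / np.sqrt(d_model)          # scaled attention scores
attn = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)
reprogrammed = attn @ prototypes                      # prototype mixtures fed to the frozen LLM

print(reprogrammed.shape)  # (8, 16): one prototype-space token per patch
```

Each patch thus enters the frozen LLM as a convex-like mixture of text-prototype embeddings rather than as raw values, which is what the alignment claim rests on.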
If this is right
- Forecasting tasks no longer require training a new architecture from scratch for each application.
- The same frozen LLM can serve both natural language and time series inputs after the reprogramming step.
- Few-shot and zero-shot performance improves because the LLM already encodes broad sequence patterns from its original training.
- Model updates can focus only on the lightweight reprogramming and projection layers rather than the full backbone.
Where Pith is reading between the lines
- The same text-prototype alignment might extend to other ordered data such as audio waveforms or sensor streams without changing the LLM.
- Freezing the backbone could lower the compute needed to adapt forecasting systems to new domains.
- Different choices of text prototypes could be tested to see which best retain fine-grained timing information.
Load-bearing premise
The conversion of time series patches into text prototypes plus the added prompt preserves enough temporal structure for the LLM's existing reasoning to apply without major distortion or artifact.
What would settle it
On a benchmark of strongly periodic or trending series, test whether the reprogrammed LLM actually captures the periodicity: if its error pattern merely matches that of a simple linear-trend model, the claimed transfer of sequence reasoning fails; if it clearly exceeds such a baseline, the alignment premise holds.
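A minimal sketch of such a check, using a toy periodic series and a per-phase seasonal model as a stand-in for a periodicity-capturing forecaster (all names and values illustrative): a trend-only model leaves residuals that are still strongly autocorrelated at the period, while a periodicity-aware model does not.

```python
import numpy as np

rng = np.random.default_rng(0)
period = 24
t = np.arange(480)
series = 0.01 * t + np.sin(2 * np.pi * t / period) + 0.1 * rng.normal(size=t.size)

# Baseline: simple linear trend fit.
slope, intercept = np.polyfit(t, series, 1)
resid_linear = series - (slope * t + intercept)

# Periodicity-aware stand-in: linear trend plus per-phase seasonal means.
seasonal = np.array([resid_linear[p::period].mean() for p in range(period)])
resid_seasonal = resid_linear - seasonal[t % period]

def lag_autocorr(x, lag):
    """Autocorrelation of x at the given lag."""
    x = x - x.mean()
    return float((x[:-lag] * x[lag:]).sum() / (x * x).sum())

# Trend-only residuals remain strongly periodic; seasonal residuals do not.
print(round(lag_autocorr(resid_linear, period), 3))
print(round(lag_autocorr(resid_seasonal, period), 3))
```

The decisive question is which of these two residual patterns the reprogrammed LLM's forecast errors resemble.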
Original abstract
Time series forecasting holds significant importance in many real-world dynamic systems and has been extensively studied. Unlike natural language process (NLP) and computer vision (CV), where a single large model can tackle multiple tasks, models for time series forecasting are often specialized, necessitating distinct designs for different tasks and applications. While pre-trained foundation models have made impressive strides in NLP and CV, their development in time series domains has been constrained by data sparsity. Recent studies have revealed that large language models (LLMs) possess robust pattern recognition and reasoning abilities over complex sequences of tokens. However, the challenge remains in effectively aligning the modalities of time series data and natural language to leverage these capabilities. In this work, we present Time-LLM, a reprogramming framework to repurpose LLMs for general time series forecasting with the backbone language models kept intact. We begin by reprogramming the input time series with text prototypes before feeding it into the frozen LLM to align the two modalities. To augment the LLM's ability to reason with time series data, we propose Prompt-as-Prefix (PaP), which enriches the input context and directs the transformation of reprogrammed input patches. The transformed time series patches from the LLM are finally projected to obtain the forecasts. Our comprehensive evaluations demonstrate that Time-LLM is a powerful time series learner that outperforms state-of-the-art, specialized forecasting models. Moreover, Time-LLM excels in both few-shot and zero-shot learning scenarios.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes Time-LLM, a reprogramming framework that repurposes frozen large language models for general time series forecasting. Time series inputs are aligned to the LLM via text prototypes, augmented with Prompt-as-Prefix (PaP) to direct reasoning over patches, and projected to forecasts; the LLM backbone remains frozen. The central claim is that this yields a powerful time series learner that outperforms specialized SOTA models on standard benchmarks while also excelling in few-shot and zero-shot settings.
Significance. If the results hold, the work provides evidence that pre-trained LLMs can be effectively aligned to time series without parameter updates, offering a path toward unified foundation models for forecasting and reducing reliance on task-specific architectures. The reported few-shot and zero-shot performance would be a notable strength if substantiated.
Major comments (2)
- [Experiments] Experiments section: the paper benchmarks Time-LLM against specialized forecasting models but omits the critical control of replacing the frozen LLM with a randomly initialized model of identical architecture and size (keeping reprogramming layers, PaP, and output projection trainable and identical). Without this ablation, it is impossible to isolate whether gains derive from the LLM's pre-trained reasoning or from the learned input/output mappings alone, directly weakening support for the modality-alignment and transfer claim.
- [Section 3] Section 3 (Methodology): the description of text-prototype reprogramming and PaP does not provide quantitative verification that temporal structure is preserved without introducing artifacts; the assumption that this alignment successfully transfers LLM capabilities is load-bearing for the central claim yet rests on indirect evidence from end-to-end performance.
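The requested control can be sketched with a toy linear "backbone" standing in for the LLM; all names, shapes, and the linear stand-in are illustrative, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 8  # hidden size (illustrative)

pretrained_backbone = rng.normal(size=(d, d))  # stands in for pretrained, frozen weights
random_backbone = rng.normal(size=(d, d))      # identical shape, fresh random init

def build_model(backbone):
    """Trainable adapter -> frozen backbone -> trainable projection."""
    adapter = rng.normal(size=(d, d)) * 0.1    # reprogramming layer (trainable)
    projection = rng.normal(size=(d, 1)) * 0.1 # output head (trainable)
    frozen = backbone.copy()                   # backbone is never updated
    def forward(x):
        return x @ adapter @ frozen @ projection
    return forward

model_pretrained = build_model(pretrained_backbone)
model_control = build_model(random_backbone)
# Train both identically (adapter and projection only); any remaining
# performance gap isolates the contribution of the pretrained weights.
```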
Minor comments (2)
- [Abstract] Abstract: 'natural language process (NLP)' should read 'natural language processing (NLP)'.
- [Figures] Ensure all figures (e.g., the framework diagram) include explicit labels for the reprogramming, PaP, and projection stages to improve readability.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. The suggested ablation and additional verification will help strengthen the evidence for our claims on modality alignment and the role of pre-trained LLMs. We address each major comment below and will incorporate the revisions in the next version of the manuscript.
Point-by-point responses
-
Referee: [Experiments] Experiments section: the paper benchmarks Time-LLM against specialized forecasting models but omits the critical control of replacing the frozen LLM with a randomly initialized model of identical architecture and size (keeping reprogramming layers, PaP, and output projection trainable and identical). Without this ablation, it is impossible to isolate whether gains derive from the LLM's pre-trained reasoning or from the learned input/output mappings alone, directly weakening support for the modality-alignment and transfer claim.
Authors: We agree this ablation is important to isolate the contribution of the pre-trained LLM. In the revised manuscript, we will add results for a randomly initialized LLM backbone of identical architecture and size, with all reprogramming layers, PaP, and the output projection kept trainable and identical. This control will clarify whether performance gains arise primarily from the pre-trained weights or from the learned input/output components alone. Revision planned: yes.
-
Referee: [Section 3] Section 3 (Methodology): the description of text-prototype reprogramming and PaP does not provide quantitative verification that temporal structure is preserved without introducing artifacts; the assumption that this alignment successfully transfers LLM capabilities is load-bearing for the central claim yet rests on indirect evidence from end-to-end performance.
Authors: We acknowledge that direct quantitative checks would provide stronger support. In the revision, we will augment Section 3 with quantitative verification, including similarity metrics (e.g., cosine similarity and Pearson correlation) between original time-series patches and their reprogrammed representations, as well as reconstruction error analysis to confirm that temporal structure is preserved with minimal artifacts. We will also add a brief discussion of how the text-prototype design and PaP maintain sequential dependencies. Revision planned: yes.
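The proposed metrics are straightforward to compute; a minimal sketch, with a hypothetical reconstructed patch standing in for the back-projected representation:

```python
import numpy as np

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def pearson_correlation(a, b):
    # Cosine similarity of the mean-centered vectors.
    a, b = a - a.mean(), b - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def reconstruction_error(a, b):
    return float(np.mean((a - b) ** 2))

patch = np.sin(np.linspace(0, np.pi, 16))  # original time-series patch (toy)
# Hypothetical reconstruction with a small structured perturbation:
reconstructed = patch + 0.05 * np.cos(np.linspace(0, 8, 16))

print(round(cosine_similarity(patch, reconstructed), 3))
print(round(pearson_correlation(patch, reconstructed), 3))
print(round(reconstruction_error(patch, reconstructed), 4))
```

High similarity and low reconstruction error on real patches would directly support the temporal-structure-preservation claim rather than leaving it to end-to-end performance.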
Circularity Check
Empirical reprogramming framework with no circular derivation
Full rationale
The paper describes a practical reprogramming method (text prototypes + Prompt-as-Prefix) whose performance claims rest on external benchmark comparisons rather than any closed-form derivation or fitted parameter renamed as prediction. No equations reduce reported forecasting accuracy to quantities defined solely inside the paper; the central transfer assumption is tested via few-shot/zero-shot results against specialized models. Self-citations, if present, are not load-bearing for the core result.
Axiom & Free-Parameter Ledger
Axioms (1)
- Domain assumption: Large language models possess robust pattern recognition and reasoning abilities over complex sequences of tokens.
Lean theorems connected to this paper
- IndisputableMonolith.Cost.FunctionalEquation · washburn_uniqueness_aczel (tagged: unclear)
The relation between the paper passage and the cited Recognition theorem is unclear.
Linked passage: "Our comprehensive evaluations demonstrate that Time-LLM is a powerful time series learner that outperforms state-of-the-art, specialized forecasting models."
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 18 Pith papers
-
What if Tomorrow is the World Cup Final? Counterfactual Time Series Forecasting with Textual Conditions
Introduces the task of counterfactual time series forecasting with textual conditions plus a text-attribution mechanism that improves accuracy by distinguishing mutable from immutable factors.
-
LLaTiSA: Towards Difficulty-Stratified Time Series Reasoning from Visual Perception to Semantics
LLaTiSA is a vision-language model trained on a new 83k-sample hierarchical time series reasoning dataset that shows superior performance and out-of-distribution generalization on stratified TSR tasks.
-
TimeSeriesExamAgent: Creating Time Series Reasoning Benchmarks at Scale
TimeSeriesExamAgent combines templates and LLM agents to generate scalable time series reasoning benchmarks, demonstrating that current LLMs have limited performance on both abstract and domain-specific tasks.
-
Discrete Prototypical Memories for Federated Time Series Foundation Models
FeDPM learns and aligns local discrete prototypical memories across domains to create a unified discrete latent space for LLM-based time series foundation models in a federated setting.
-
Overcoming the Modality Gap in Context-Aided Forecasting
A semi-synthetic augmentation creates the CAF-7M dataset and demonstrates that improved context data enables multimodal models to outperform unimodal baselines in context-aided forecasting.
-
TS-Haystack: A Multi-Task Retrieval Benchmark for Long-Context Time-Series Reasoning
TS-Haystack benchmark shows time-series language models degrade sharply on long contexts while an agentic retrieval system using classifier tools matches or beats them on 9 of 10 tasks.
-
Is Flow Matching Just Trajectory Replay for Sequential Data?
Flow matching on time series targets a closed-form nonparametric velocity field that is a similarity-weighted mixture of observed transition velocities, making neural models approximations to an ideal memory-augmented...
-
Agent-Based Post-Hoc Correction of Agricultural Yield Forecasts
Structured LLM agents correct agricultural yield forecasts from models like XGBoost, cutting MAE by 20-28% and MASE by up to 66% on strawberry and corn datasets.
-
GazeMind: A Gaze-Guided LLM Agent for Personalized Cognitive Load Assessment
GazeMind encodes gaze data for LLM reasoning to deliver interpretable, personalized cognitive load predictions that generalize across tasks without fine-tuning and outperform baselines by over 20% on a new 152-person dataset.
-
Exploring the Potential of Probabilistic Transformer for Time Series Modeling: A Report on the ST-PT Framework
ST-PT turns transformers into explicit factor graphs for time series, enabling structural injection of symbolic priors, per-sample conditional generation, and principled latent autoregressive forecasting via MFVI iterations.
-
CAARL: In-Context Learning for Interpretable Co-Evolving Time Series Forecasting
CAARL decomposes co-evolving time series into autoregressive segments, builds a temporal dependency graph, serializes it into a narrative, and uses LLMs for interpretable forecasting via chain-of-thought reasoning.
-
Semantic Communication with an LLM-enabled Knowledge Base
SC-LMKB uses LLM-generated data with cross-domain fusion to cut hallucinations and delivers up to 72.6% gains on cross-modality retrieval tasks over standard semantic communication.
-
Uncertainty-Aware Foundation Models for Clinical Data
The work introduces uncertainty-aware foundation models for clinical data by learning set-valued patient representations that enforce consistency across partial observations and integrate multimodal self-supervised ob...
-
Probabilistic NDVI Forecasting from Sparse Satellite Time Series and Weather Covariates
A neural architecture with a horizon-weighted quantile loss forecasts field-level NDVI from irregular satellite observations and weather covariates, outperforming baselines on European data.
-
Heterogeneous Scientific Foundation Model Collaboration
Eywa enables language-based agentic AI systems to collaborate with specialized scientific foundation models for improved performance on structured data tasks.
-
Frozen LLMs as Map-Aware Spatio-Temporal Reasoners for Vehicle Trajectory Prediction
A framework encodes observed trajectories and HD maps into tokens for frozen LLMs to perform spatio-temporal reasoning and predict future vehicle paths with a linear decoder.
-
MSTN: A Lightweight and Fast Model for General TimeSeries Analysis
MSTN is a lightweight hybrid model that reports new state-of-the-art results on 33 of 40 time series benchmarks for imputation, forecasting, and classification while using under one million parameters and sub-second i...
-
A Review of Large Language Models for Stock Price Forecasting from a Hedge-Fund Perspective
This review synthesizes LLM uses in stock forecasting and catalogs key practical pitfalls from a hedge-fund viewpoint.