Recognition: 2 theorem links
Time-LLM: Time Series Forecasting by Reprogramming Large Language Models
Pith reviewed 2026-05-16 15:59 UTC · model grok-4.3
The pith
Reprogramming time series inputs with text prototypes lets frozen large language models generate accurate forecasts without retraining.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By first mapping time series values to text prototypes and then applying Prompt-as-Prefix enrichment, a frozen large language model can be directly repurposed to produce time series forecasts that surpass current specialized models in both standard and low-data settings.
What carries the argument
The reprogramming step that replaces raw time series patches with learned text prototypes before they enter the frozen LLM, combined with Prompt-as-Prefix to steer the model's patch-level transformations.
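As a rough illustration of this reprogramming step (not the paper's exact implementation — the shapes, the prototype count, and the single-head attention are illustrative assumptions), the patch-to-prototype mapping can be sketched as:

```python
import numpy as np

rng = np.random.default_rng(0)

d_model = 16        # LLM hidden size (assumed)
n_prototypes = 8    # number of learned text prototypes (assumed)
patch_len = 4       # time-series patch length (assumed)

series = np.sin(np.linspace(0, 6 * np.pi, 32))        # toy input series
patches = series.reshape(-1, patch_len)               # (8, 4) non-overlapping patches

W_embed = rng.normal(size=(patch_len, d_model))       # trainable patch embedding
prototypes = rng.normal(size=(n_prototypes, d_model)) # trainable text prototypes

q = patches @ W_embed                                 # queries, one per patch
scores = q @ prototypes.T / np.sqrt(d_model)          # scaled attention scores
attn = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)
reprogrammed = attn @ prototypes                      # prototype mixtures fed to the frozen LLM

print(reprogrammed.shape)  # (8, 16): one prototype-space token per patch
```

Each patch thus enters the frozen LLM as a convex-like mixture of text-prototype embeddings rather than as raw values, which is what the alignment claim rests on.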
If this is right
- Forecasting tasks no longer require training a new architecture from scratch for each application.
- The same frozen LLM can serve both natural language and time series inputs after the reprogramming step.
- Few-shot and zero-shot performance improves because the LLM already encodes broad sequence patterns from its original training.
- Model updates can focus only on the lightweight reprogramming and projection layers rather than the full backbone.
Where Pith is reading between the lines
- The same text-prototype alignment might extend to other ordered data such as audio waveforms or sensor streams without changing the LLM.
- Freezing the backbone could lower the compute needed to adapt forecasting systems to new domains.
- Different choices of text prototypes could be tested to see which best retain fine-grained timing information.
Load-bearing premise
The conversion of time series patches into text prototypes plus the added prompt preserves enough temporal structure for the LLM's existing reasoning to apply without major distortion or artifact.
What would settle it
On a benchmark of strongly periodic or trending series, test whether the reprogrammed LLM actually captures the periodicity: if its error pattern merely matches that of a simple linear-trend model, the claimed transfer of sequence reasoning fails; if it clearly exceeds such a baseline, the alignment premise holds.
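A minimal sketch of such a check, using a toy periodic series and a per-phase seasonal model as a stand-in for a periodicity-capturing forecaster (all names and values illustrative): a trend-only model leaves residuals that are still strongly autocorrelated at the period, while a periodicity-aware model does not.

```python
import numpy as np

rng = np.random.default_rng(0)
period = 24
t = np.arange(480)
series = 0.01 * t + np.sin(2 * np.pi * t / period) + 0.1 * rng.normal(size=t.size)

# Baseline: simple linear trend fit.
slope, intercept = np.polyfit(t, series, 1)
resid_linear = series - (slope * t + intercept)

# Periodicity-aware stand-in: linear trend plus per-phase seasonal means.
seasonal = np.array([resid_linear[p::period].mean() for p in range(period)])
resid_seasonal = resid_linear - seasonal[t % period]

def lag_autocorr(x, lag):
    """Autocorrelation of x at the given lag."""
    x = x - x.mean()
    return float((x[:-lag] * x[lag:]).sum() / (x * x).sum())

# Trend-only residuals remain strongly periodic; seasonal residuals do not.
print(round(lag_autocorr(resid_linear, period), 3))
print(round(lag_autocorr(resid_seasonal, period), 3))
```

The decisive question is which of these two residual patterns the reprogrammed LLM's forecast errors resemble.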
Original abstract
Time series forecasting holds significant importance in many real-world dynamic systems and has been extensively studied. Unlike natural language process (NLP) and computer vision (CV), where a single large model can tackle multiple tasks, models for time series forecasting are often specialized, necessitating distinct designs for different tasks and applications. While pre-trained foundation models have made impressive strides in NLP and CV, their development in time series domains has been constrained by data sparsity. Recent studies have revealed that large language models (LLMs) possess robust pattern recognition and reasoning abilities over complex sequences of tokens. However, the challenge remains in effectively aligning the modalities of time series data and natural language to leverage these capabilities. In this work, we present Time-LLM, a reprogramming framework to repurpose LLMs for general time series forecasting with the backbone language models kept intact. We begin by reprogramming the input time series with text prototypes before feeding it into the frozen LLM to align the two modalities. To augment the LLM's ability to reason with time series data, we propose Prompt-as-Prefix (PaP), which enriches the input context and directs the transformation of reprogrammed input patches. The transformed time series patches from the LLM are finally projected to obtain the forecasts. Our comprehensive evaluations demonstrate that Time-LLM is a powerful time series learner that outperforms state-of-the-art, specialized forecasting models. Moreover, Time-LLM excels in both few-shot and zero-shot learning scenarios.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes Time-LLM, a reprogramming framework that repurposes frozen large language models for general time series forecasting. Time series inputs are aligned to the LLM via text prototypes, augmented with Prompt-as-Prefix (PaP) to direct reasoning over patches, and projected to forecasts; the LLM backbone remains frozen. The central claim is that this yields a powerful time series learner that outperforms specialized SOTA models on standard benchmarks while also excelling in few-shot and zero-shot settings.
Significance. If the results hold, the work provides evidence that pre-trained LLMs can be effectively aligned to time series without parameter updates, offering a path toward unified foundation models for forecasting and reducing reliance on task-specific architectures. The reported few-shot and zero-shot performance would be a notable strength if substantiated.
Major comments (2)
- [Experiments] Experiments section: the paper benchmarks Time-LLM against specialized forecasting models but omits the critical control of replacing the frozen LLM with a randomly initialized model of identical architecture and size (keeping reprogramming layers, PaP, and output projection trainable and identical). Without this ablation, it is impossible to isolate whether gains derive from the LLM's pre-trained reasoning or from the learned input/output mappings alone, directly weakening support for the modality-alignment and transfer claim.
- [Section 3] Section 3 (Methodology): the description of text-prototype reprogramming and PaP does not provide quantitative verification that temporal structure is preserved without introducing artifacts; the assumption that this alignment successfully transfers LLM capabilities is load-bearing for the central claim yet rests on indirect evidence from end-to-end performance.
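The requested control can be sketched with a toy linear "backbone" standing in for the LLM; all names, shapes, and the linear stand-in are illustrative, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 8  # hidden size (illustrative)

pretrained_backbone = rng.normal(size=(d, d))  # stands in for pretrained, frozen weights
random_backbone = rng.normal(size=(d, d))      # identical shape, fresh random init

def build_model(backbone):
    """Trainable adapter -> frozen backbone -> trainable projection."""
    adapter = rng.normal(size=(d, d)) * 0.1    # reprogramming layer (trainable)
    projection = rng.normal(size=(d, 1)) * 0.1 # output head (trainable)
    frozen = backbone.copy()                   # backbone is never updated
    def forward(x):
        return x @ adapter @ frozen @ projection
    return forward

model_pretrained = build_model(pretrained_backbone)
model_control = build_model(random_backbone)
# Train both identically (adapter and projection only); any remaining
# performance gap isolates the contribution of the pretrained weights.
```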
Minor comments (2)
- [Abstract] Abstract: 'natural language process (NLP)' should read 'natural language processing (NLP)'.
- [Figures] Ensure all figures (e.g., the framework diagram) include explicit labels for the reprogramming, PaP, and projection stages to improve readability.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. The suggested ablation and additional verification will help strengthen the evidence for our claims on modality alignment and the role of pre-trained LLMs. We address each major comment below and will incorporate the revisions in the next version of the manuscript.
Point-by-point responses
-
Referee: [Experiments] Experiments section: the paper benchmarks Time-LLM against specialized forecasting models but omits the critical control of replacing the frozen LLM with a randomly initialized model of identical architecture and size (keeping reprogramming layers, PaP, and output projection trainable and identical). Without this ablation, it is impossible to isolate whether gains derive from the LLM's pre-trained reasoning or from the learned input/output mappings alone, directly weakening support for the modality-alignment and transfer claim.
Authors: We agree this ablation is important to isolate the contribution of the pre-trained LLM. In the revised manuscript, we will add results for a randomly initialized LLM backbone of identical architecture and size, with all reprogramming layers, PaP, and the output projection kept trainable and identical. This control will clarify whether performance gains arise primarily from the pre-trained weights or from the learned input/output components alone. Revision planned: yes.
-
Referee: [Section 3] Section 3 (Methodology): the description of text-prototype reprogramming and PaP does not provide quantitative verification that temporal structure is preserved without introducing artifacts; the assumption that this alignment successfully transfers LLM capabilities is load-bearing for the central claim yet rests on indirect evidence from end-to-end performance.
Authors: We acknowledge that direct quantitative checks would provide stronger support. In the revision, we will augment Section 3 with quantitative verification, including similarity metrics (e.g., cosine similarity and Pearson correlation) between original time-series patches and their reprogrammed representations, as well as reconstruction error analysis to confirm that temporal structure is preserved with minimal artifacts. We will also add a brief discussion of how the text-prototype design and PaP maintain sequential dependencies. Revision planned: yes.
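The proposed metrics are straightforward to compute; a minimal sketch, with a hypothetical reconstructed patch standing in for the back-projected representation:

```python
import numpy as np

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def pearson_correlation(a, b):
    # Cosine similarity of the mean-centered vectors.
    a, b = a - a.mean(), b - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def reconstruction_error(a, b):
    return float(np.mean((a - b) ** 2))

patch = np.sin(np.linspace(0, np.pi, 16))  # original time-series patch (toy)
# Hypothetical reconstruction with a small structured perturbation:
reconstructed = patch + 0.05 * np.cos(np.linspace(0, 8, 16))

print(round(cosine_similarity(patch, reconstructed), 3))
print(round(pearson_correlation(patch, reconstructed), 3))
print(round(reconstruction_error(patch, reconstructed), 4))
```

High similarity and low reconstruction error on real patches would directly support the temporal-structure-preservation claim rather than leaving it to end-to-end performance.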
Circularity Check
Empirical reprogramming framework with no circular derivation
Full rationale
The paper describes a practical reprogramming method (text prototypes + Prompt-as-Prefix) whose performance claims rest on external benchmark comparisons rather than any closed-form derivation or fitted parameter renamed as prediction. No equations reduce reported forecasting accuracy to quantities defined solely inside the paper; the central transfer assumption is tested via few-shot/zero-shot results against specialized models. Self-citations, if present, are not load-bearing for the core result.
Axiom & Free-Parameter Ledger
Axioms (1)
- Domain assumption: Large language models possess robust pattern recognition and reasoning abilities over complex sequences of tokens.
Lean theorems connected to this paper
- IndisputableMonolith.Cost.FunctionalEquation · washburn_uniqueness_aczel (tagged: unclear)
The relation between the paper passage and the cited Recognition theorem is unclear.
Linked passage: "Our comprehensive evaluations demonstrate that Time-LLM is a powerful time series learner that outperforms state-of-the-art, specialized forecasting models."
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 18 Pith papers
-
What if Tomorrow is the World Cup Final? Counterfactual Time Series Forecasting with Textual Conditions
Introduces the task of counterfactual time series forecasting with textual conditions plus a text-attribution mechanism that improves accuracy by distinguishing mutable from immutable factors.
-
LLaTiSA: Towards Difficulty-Stratified Time Series Reasoning from Visual Perception to Semantics
LLaTiSA is a vision-language model trained on a new 83k-sample hierarchical time series reasoning dataset that shows superior performance and out-of-distribution generalization on stratified TSR tasks.
-
TimeSeriesExamAgent: Creating Time Series Reasoning Benchmarks at Scale
TimeSeriesExamAgent combines templates and LLM agents to generate scalable time series reasoning benchmarks, demonstrating that current LLMs have limited performance on both abstract and domain-specific tasks.
-
Discrete Prototypical Memories for Federated Time Series Foundation Models
FeDPM learns and aligns local discrete prototypical memories across domains to create a unified discrete latent space for LLM-based time series foundation models in a federated setting.
-
Overcoming the Modality Gap in Context-Aided Forecasting
A semi-synthetic augmentation creates the CAF-7M dataset and demonstrates that improved context data enables multimodal models to outperform unimodal baselines in context-aided forecasting.
-
TS-Haystack: A Multi-Task Retrieval Benchmark for Long-Context Time-Series Reasoning
TS-Haystack benchmark shows time-series language models degrade sharply on long contexts while an agentic retrieval system using classifier tools matches or beats them on 9 of 10 tasks.
-
Is Flow Matching Just Trajectory Replay for Sequential Data?
Flow matching on time series targets a closed-form nonparametric velocity field that is a similarity-weighted mixture of observed transition velocities, making neural models approximations to an ideal memory-augmented...
-
Agent-Based Post-Hoc Correction of Agricultural Yield Forecasts
Structured LLM agents correct agricultural yield forecasts from models like XGBoost, cutting MAE by 20-28% and MASE by up to 66% on strawberry and corn datasets.
-
GazeMind: A Gaze-Guided LLM Agent for Personalized Cognitive Load Assessment
GazeMind encodes gaze data for LLM reasoning to deliver interpretable, personalized cognitive load predictions that generalize across tasks without fine-tuning and outperform baselines by over 20% on a new 152-person dataset.
-
Exploring the Potential of Probabilistic Transformer for Time Series Modeling: A Report on the ST-PT Framework
ST-PT turns transformers into explicit factor graphs for time series, enabling structural injection of symbolic priors, per-sample conditional generation, and principled latent autoregressive forecasting via MFVI iterations.
-
CAARL: In-Context Learning for Interpretable Co-Evolving Time Series Forecasting
CAARL decomposes co-evolving time series into autoregressive segments, builds a temporal dependency graph, serializes it into a narrative, and uses LLMs for interpretable forecasting via chain-of-thought reasoning.
-
Semantic Communication with an LLM-enabled Knowledge Base
SC-LMKB uses LLM-generated data with cross-domain fusion to cut hallucinations and delivers up to 72.6% gains on cross-modality retrieval tasks over standard semantic communication.
-
Uncertainty-Aware Foundation Models for Clinical Data
The work introduces uncertainty-aware foundation models for clinical data by learning set-valued patient representations that enforce consistency across partial observations and integrate multimodal self-supervised ob...
-
Probabilistic NDVI Forecasting from Sparse Satellite Time Series and Weather Covariates
A neural architecture with a horizon-weighted quantile loss forecasts field-level NDVI from irregular satellite observations and weather covariates, outperforming baselines on European data.
-
Heterogeneous Scientific Foundation Model Collaboration
Eywa enables language-based agentic AI systems to collaborate with specialized scientific foundation models for improved performance on structured data tasks.
-
Frozen LLMs as Map-Aware Spatio-Temporal Reasoners for Vehicle Trajectory Prediction
A framework encodes observed trajectories and HD maps into tokens for frozen LLMs to perform spatio-temporal reasoning and predict future vehicle paths with a linear decoder.
-
MSTN: A Lightweight and Fast Model for General TimeSeries Analysis
MSTN is a lightweight hybrid model that reports new state-of-the-art results on 33 of 40 time series benchmarks for imputation, forecasting, and classification while using under one million parameters and sub-second i...
-
A Review of Large Language Models for Stock Price Forecasting from a Hedge-Fund Perspective
This review synthesizes LLM uses in stock forecasting and catalogs key practical pitfalls from a hedge-fund viewpoint.