A Time Series is Worth 64 Words: Long-term Forecasting with Transformers
Mixed citation behavior. Most common role is background (64%).
abstract
We propose an efficient design of Transformer-based models for multivariate time series forecasting and self-supervised representation learning. It is based on two key components: (i) segmentation of time series into subseries-level patches, which serve as input tokens to the Transformer; (ii) channel-independence, where each channel contains a single univariate time series and all series share the same embedding and Transformer weights. The patching design naturally has a three-fold benefit: local semantic information is retained in the embedding; computation and memory usage of the attention maps are quadratically reduced given the same look-back window; and the model can attend to a longer history. Our channel-independent patch time series Transformer (PatchTST) can significantly improve long-term forecasting accuracy compared with SOTA Transformer-based models. We also apply our model to self-supervised pre-training tasks and attain excellent fine-tuning performance, which outperforms supervised training on large datasets. Transferring masked pre-trained representations from one dataset to others also produces SOTA forecasting accuracy. Code is available at: https://github.com/yuqinie98/PatchTST.
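The two components named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `make_patches` and `embed_patches`, the look-back window of 512, the patch length of 16, the stride of 8, and the embedding width of 32 are all assumptions chosen for illustration, and the series is not padded at its end.

```python
import numpy as np


def make_patches(series, patch_len, stride):
    """Split each channel into overlapping subseries-level patches.

    series: array of shape (channels, length)
    returns: array of shape (channels, num_patches, patch_len)
    """
    channels, length = series.shape
    if length < patch_len:
        raise ValueError("series is shorter than one patch")
    num_patches = (length - patch_len) // stride + 1
    starts = np.arange(num_patches) * stride
    # Row i of idx holds the time indices covered by patch i.
    idx = starts[:, None] + np.arange(patch_len)[None, :]
    return series[:, idx]


def embed_patches(patches, weight):
    """Project every patch to a token with ONE weight matrix.

    Channel-independence: each channel is handled as its own univariate
    series, and all channels share the same projection weights.
    """
    return patches @ weight


# Illustrative (hypothetical) sizes, not values taken from the source.
rng = np.random.default_rng(0)
look_back, n_channels = 512, 7
patch_len, stride, d_model = 16, 8, 32

x = rng.standard_normal((n_channels, look_back))
patches = make_patches(x, patch_len, stride)
weight = rng.standard_normal((patch_len, d_model))
tokens = embed_patches(patches, weight)

# Attention maps scale with the square of the token count, so going from
# one token per time step to one token per patch shrinks them quadratically.
n_tokens = patches.shape[1]
attention_ratio = (look_back / n_tokens) ** 2
print(patches.shape, tokens.shape, n_tokens, round(attention_ratio, 1))
```

With these assumed sizes, each channel is reduced from 512 time steps to a few dozen patch tokens, which is where the quadratic saving in the attention map comes from for a fixed look-back window.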
citing papers explorer
-
Prototype-Guided Classification Sub-Task Decoupling Framework: Enhancing Generalization and Interpretability for Multivariate Time Series
PDFTime reformulates multivariate time series classification as a multi-stage prototype-based decision process, claiming SOTA results on UCR and UEA benchmarks.
-
Olivia: Harmonizing Time Series Foundation Models with Power Spectral Density
Olivia harmonizes time series datasets via normalized power spectral density using a Harmonizer module and resonator-based HarmonicAttention, achieving state-of-the-art zero-shot, few-shot, and full-shot forecasting on TSLib, GIFT-Eval, and GluonTS benchmarks.
-
1GC-7RC: One Graphic Card -- Seven Research Challenges! How Good Are AI Agents at Doing Your Job?
Introduces the 1GC-7RC benchmark to evaluate AI coding agents on seven diverse ML tasks under single-GPU time and access constraints.
-
Looped SSMs: Depth-Recurrence and Input Reshaping for Time Series Classification
Looped SSMs with shared parameters across depth match or exceed standard SSMs with more parameters on time series classification, with additional gains from input reshaping techniques.
-
SeesawNet: Towards Non-stationary Time Series Forecasting with Balanced Modeling of Common and Specific Dependencies
SeesawNet dynamically balances common and instance-specific dependencies via ASNA in temporal and channel dimensions, outperforming prior methods on non-stationary forecasting benchmarks.
-
What if Tomorrow is the World Cup Final? Counterfactual Time Series Forecasting with Textual Conditions
Introduces the task of counterfactual time series forecasting with textual conditions plus a text-attribution mechanism that improves accuracy by distinguishing mutable from immutable factors.
-
FactoryBench: Evaluating Industrial Machine Understanding
FactoryBench reveals that frontier LLMs achieve under 50% on structured causal questions and under 18% on decision-making in industrial robotic telemetry.
-
Hedging Memory Horizons for Non-Stationary Prediction via Online Aggregation
MELO aggregates base predictors and their multi-scale EWLS adaptations using MLpol to achieve oracle inequalities against best fixed and time-varying predictors in non-stationary settings.
-
Does Synthetic Data Help? Empirical Evidence from Deep Learning Time Series Forecasters
Synthetic data augmentation helps channel-mixing time series models but degrades channel-independent ones, with reliable gains only from seasonal-trend generators and gradual schedules in low-resource settings.
-
Discrete Prototypical Memories for Federated Time Series Foundation Models
FeDPM learns and aligns local discrete prototypical memories across domains to create a unified discrete latent space for LLM-based time series foundation models in a federated setting.
-
Self-Supervised Foundation Model for Calcium-imaging Population Dynamics
CalM uses a discrete tokenizer and dual-axis autoregressive transformer pretrained self-supervised on calcium traces to outperform specialized baselines on population dynamics forecasting and adapt to superior behavior decoding.
-
From Observations to States: Latent Time Series Forecasting
LatentTSF improves time series forecasting accuracy and representation quality by shifting prediction from observation space to a learned latent state space via autoencoding.
-
Sundial: A Family of Highly Capable Time Series Foundation Models
Sundial uses TimeFlow Loss for native pre-training of Transformers on continuous time series from TimeBench, achieving SOTA point and probabilistic forecasting with millisecond inference.
-
Toto 2.0: Time Series Forecasting Enters the Scaling Era
Toto 2.0 is a family of open time series foundation models that demonstrates reliable scaling and sets new state-of-the-art results on three forecasting benchmarks.
-
DAD4TS: Data-Augmentation-Oriented Diffusion Model for Time-Series Forecasting with Small-Scale Data
DAD4TS augments small time-series datasets with a diffusion model trained via mathematical geometric projections and guided by reinforcement learning to improve forecasting accuracy.
-
How Do Electrocardiogram Models Scale?
Empirical scaling study of ECG models finds SSL scales robustly while ResNets show 1.3-2.5x better parameter efficiency and SSL up to 16x better data efficiency than supervised baselines on out-of-distribution tasks.
-
Empowering VLMs for Few-Shot Multimodal Time Series Classification via Tailored Agentic Reasoning
MarsTSC is a VLM agentic system with generator, reflector, and modifier roles that iteratively refines a knowledge bank to improve few-shot multimodal time series classification and produce human-readable explanations.
-
What If We Let Forecasting Forget? A Sparse Bottleneck for Cross-Variable Dependencies
MS-FLOW uses a capacity-limited sparse routing mechanism to model only critical inter-variable dependencies in time series data, achieving state-of-the-art accuracy on 12 benchmarks with fewer but more reliable connections.
-
Exploring the Potential of Probabilistic Transformer for Time Series Modeling: A Report on the ST-PT Framework
ST-PT turns transformers into explicit factor graphs for time series, enabling structural injection of symbolic priors, per-sample conditional generation, and principled latent autoregressive forecasting via MFVI iterations.
-
CAARL: In-Context Learning for Interpretable Co-Evolving Time Series Forecasting
CAARL decomposes co-evolving time series into autoregressive segments, builds a temporal dependency graph, serializes it into a narrative, and uses LLMs for interpretable forecasting via chain-of-thought reasoning.
-
M3R: Localized Rainfall Nowcasting with Meteorology-Informed MultiModal Attention
M3R improves localized rainfall nowcasting by using weather station time series as queries in multimodal attention to selectively extract precipitation patterns from radar imagery.
-
A General Framework for Generative Self-supervised Learning in Non-invasive Estimation of Physiological Parameters Using Photoplethysmography
TS2TC combines cross-temporal fusion generative anchor pretraining with dual-process transfer to achieve 2.49% lower RMSE than prior methods on PPG parameter estimation using only 10% labeled data.
-
A Benchmark of Classical and Deep Learning Models for Agricultural Commodity Price Forecasting on A Novel Bangladeshi Market Price Dataset
AgriPriceBD dataset of 1779 daily prices released; naive persistence outperforms deep models like Informer and Time2Vec-Transformer on heterogeneous Bangladeshi commodity series with statistical validation.
-
A Foundation Model for Instruction-Conditioned In-Context Time Series Tasks
iAmTime is a time-series foundation model that uses instruction-conditioned in-context learning from demonstrations to perform zero-shot adaptation on forecasting, imputation, classification, and related tasks.
-
Timer-S1: A Billion-Scale Time Series Foundation Model with Serial Scaling
Timer-S1 is a released 8.3B-parameter MoE time series model that achieves state-of-the-art MASE and CRPS scores on GIFT-Eval using serial scaling and Serial-Token Prediction.
-
Probabilistic NDVI Forecasting from Sparse Satellite Time Series and Weather Covariates
A neural architecture with a horizon-weighted quantile loss forecasts field-level NDVI from irregular satellite observations and weather covariates, outperforming baselines on European data.
-
AlphaCast: A Human Wisdom-LLM Intelligence Co-Reasoning Framework for Interactive Time Series Forecasting
AlphaCast is a training-free LLM framework that performs interactive multi-stage reasoning for time series forecasting by integrating feature extraction, knowledge bases, case libraries, and contextual pools.
-
MAP4TS: A Multi-Aspect Prompting Framework for Time-Series Forecasting with Large Language Models
MAP4TS combines global, local, statistical, and temporal prompts derived from classical time-series analysis with raw embeddings via cross-modality alignment to improve LLM forecasting performance across eight datasets.
-
ReNF: Rethinking the Design of Neural Long-Term Time Series Forecasters
ReNF proposes Boosted Direct Output (BDO) and parameter smoothing so a basic temporal MLP outperforms complex state-of-the-art models on long-term time series forecasting benchmarks by implicitly combining forecasts to reduce uncertainty.
-
Masked Training for Robust Arrhythmia Detection from Digitalized Multiple Layout ECG Images
PatchECG applies masked patch training and disordered attention to handle asynchronous and partially missing ECG signals from varied layouts, reaching average AUROC 0.835 on simulated conditions and 0.778 on real hospital images for atrial fibrillation.
-
Time Series Forecasting as Reasoning: A Slow-Thinking Approach with Reinforced LLMs
Time-R1 trains LLMs via supervised fine-tuning followed by reinforcement learning with a time-series-specific reward and non-uniform GRIP sampling to enable multi-step reasoning that improves forecasting accuracy.
-
Non-stationary Diffusion For Probabilistic Time Series Forecasting
NsDiff combines a denoising diffusion conditional generative model with a pre-trained mean/variance estimator and an uncertainty-aware noise schedule based on the Location-Scale Noise Model to capture time-varying uncertainty in probabilistic forecasting.
-
Titans: Learning to Memorize at Test Time
Titans combine attention for current context with a learnable neural memory for long-term history, achieving better performance and scaling to over 2M-token contexts on language, reasoning, genomics, and time-series tasks.
-
AutoPV: Automatically Design Your Photovoltaic Power Forecasting Model
AutoPV applies neural architecture search with a custom search space drawn from time series forecasting and photovoltaic models to automatically produce architectures that outperform predefined state-of-the-art models on a Chinese solar station dataset.
-
Reasoning through Verifiable Forecast Actions: Consistency-Grounded RL for Financial LLMs
StockR1 unifies LLM-based financial reasoning and time-series forecasting by emitting verifiable forecast actions that condition a decoder, optimized via consistency-grounded RL to improve accuracy on QA and prediction tasks.
-
Atoms of Thought: Universal EEG Representation Learning with Microstates
Microstate tokenizer from clustered EEG signals provides universal representations that outperform traditional time- and frequency-domain features across sleep staging, emotion recognition, and motor imagery tasks.
-
Quantifying the Pre-training Dividend: Generative versus Latent Self-Supervised Learning for Time Series Foundation Models
Self-supervised pre-training delivers large gains up to 375% on time series anomaly detection and classification but only marginal benefits for forecasting, driven by a precision-invariance trade-off in the learned representations.
-
Toward World Modeling of Physiological Signals with Chaos-Theoretic Balancing and Latent Dynamics
NormWear-2 encodes physiological signals and interventions into a shared latent space, models their joint evolution as a dynamical system, and uses chaos-theoretic balancing during pretraining to achieve superior multi-scale forecasting on diverse real-world datasets.
-
Beyond Similarity: Temporal Operator Attention for Time Series Analysis
Temporal Operator Attention augments softmax attention with learnable sequence-space operators for signed temporal mixing and uses stochastic regularization to enable practical training, yielding consistent gains on time series benchmarks.
-
Mela: Test-Time Memory Consolidation based on Transformation Hypothesis
Mela is a Transformer variant with a dual-frequency Hierarchical Memory Module and MemStack that performs test-time memory consolidation, outperforming baselines on long contexts.
-
Risk-Aware Safe Throughput Forecasting for Starlink Networks
BG-CFQS provides risk-aware quantile-based forecasting for Starlink throughput that meets overestimation budgets and reduces positive errors compared to other feasible methods.
-
TSNN: A Non-parametric and Interpretable Framework for Traffic Time Series Forecasting
TSNN matches time series entries to a training-derived memory bank to forecast traffic without any trainable parameters and achieves competitive accuracy on four real-world datasets.
-
Learning Fingerprints for Medical Time Series with Redundancy-Constrained Information Maximization
A self-supervised method learns a fixed set of disentangled fingerprint tokens from medical time series by combining reconstruction loss with a total coding rate diversity penalty, framed as a disentangled rate-distortion problem.
-
MedMamba: Recasting Mamba for Medical Time Series Classification
MedMamba introduces a principle-guided bidirectional multi-scale Mamba model that outperforms prior methods on EEG, ECG, and activity classification benchmarks while delivering 4.6x inference speedup.
-
Foundation Models Defining A New Era In Sensor-based Human Activity Recognition: A Survey And Outlook
The survey organizes foundation models for sensor-based HAR into a lifecycle taxonomy and identifies three trajectories: HAR-specific models from scratch, adaptation of general time-series models, and integration with large language models.
-
MSTN: A Lightweight and Fast Model for General TimeSeries Analysis
MSTN introduces a lightweight multi-scale temporal network using convolutional encoding, recurrent or attention-based modeling, and gated fusion to achieve claimed state-of-the-art results on 21 of 27 time series benchmarks while using under 1.1M parameters and fast inference.
-
Characteristic Root Analysis and Regularization for Linear Time Series Forecasting
Characteristic roots govern dynamics in linear forecasting models but noise induces spurious roots; rank reduction and Root Purge regularization mitigate this for more robust predictions.
-
RadarPLM: Adapting Pre-trained Language Models for Marine Radar Target Detection by Selective Fine-tuning
RadarPLM adapts PLMs for marine radar target detection with lightweight adaptation and selective fine-tuning based on online learning values, reporting at least 6.35% average detection gains in low SCR conditions.
-
Out-of-Distribution Generalization in Time Series: A Survey
This is the first comprehensive survey of OOD generalization methodologies for time series, organized across data distribution, representation learning, and OOD evaluation.
-
Parametric Prior Mapping Framework for Non-stationary Probabilistic Time Series Forecasting
PPM injects parametric structural priors into generative models via a learnable mapping to improve probabilistic forecasts on non-stationary MTS data.