Deep Kalman Filters

David Sontag; Rahul G. Krishnan; Uri Shalit

arxiv: 1511.05121 · v2 · pith:XNK7ORPPnew · submitted 2015-11-16 · stat.ML · cs.LG

classification stat.ML cs.LG

keywords models, counterfactual, filters, inference, kalman, dataset, deep, efficacy

abstract

Kalman Filters are one of the most influential models of time-varying phenomena. They admit an intuitive probabilistic interpretation, have a simple functional form, and enjoy widespread adoption in a variety of disciplines. Motivated by recent variational methods for learning deep generative models, we introduce a unified algorithm to efficiently learn a broad spectrum of Kalman filters. Of particular interest is the use of temporal generative models for counterfactual inference. We investigate the efficacy of such models for counterfactual inference, and to that end we introduce the "Healing MNIST" dataset where long-term structure, noise and actions are applied to sequences of digits. We show the efficacy of our method for modeling this dataset. We further show how our model can be used for counterfactual inference for patients, based on electronic health record data of 8,000 patients over 4.5 years.
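
For readers unfamiliar with the "simple functional form" the abstract refers to, the following is a minimal sketch of the classical scalar linear-Gaussian Kalman filter, not the learned model proposed in the paper. The names `Gaussian`, `kalman_step`, and `run_filter`, and the default values of `a`, `q`, `h`, and `r`, are illustrative assumptions rather than anything taken from the source.

```python
from dataclasses import dataclass


@dataclass
class Gaussian:
    """Belief over a scalar latent state: mean and variance."""
    mean: float
    var: float


def kalman_step(belief, z, a=1.0, q=0.01, h=1.0, r=0.25):
    """One predict/update cycle of a scalar linear-Gaussian Kalman filter.

    Assumed model (illustrative, not taken from the paper):
        x_t = a * x_{t-1} + process noise with variance q
        z_t = h * x_t     + observation noise with variance r
    """
    # Predict: push the previous belief through the linear dynamics.
    pred_mean = a * belief.mean
    pred_var = a * a * belief.var + q

    # Update: weight the observation residual by the Kalman gain.
    gain = pred_var * h / (h * h * pred_var + r)
    new_mean = pred_mean + gain * (z - h * pred_mean)
    new_var = (1.0 - gain * h) * pred_var
    return Gaussian(new_mean, new_var)


def run_filter(observations, prior=Gaussian(0.0, 1.0)):
    """Filter a sequence of scalar observations, returning every posterior."""
    beliefs = []
    belief = prior
    for z in observations:
        belief = kalman_step(belief, z)
        beliefs.append(belief)
    return beliefs
```

Calling `run_filter([1.0, 1.0, 1.0])` returns one posterior per observation. With repeated consistent observations the posterior mean moves toward the observed value and the posterior variance shrinks. In this sketch the dynamics and observation maps are fixed scalars; the abstract describes learning a broader family of such models with variational methods.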

Forward citations

Cited by 16 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Identify Then Project: Contrastive Learning of Latent Dynamics from Partial Observations with Port-Hamiltonian Structure
cs.LG 2026-05 unverdicted novelty 7.0

A two-stage contrastive teacher-student framework learns and then projects latent dynamics onto port-Hamiltonian submanifolds from partial observations.
Support-Safe Variational Hybrid Filtering for Contact-Mode and Sparse-Law Recovery
cs.RO 2026-05 unverdicted novelty 7.0

VHYDRO is a support-safe variational hybrid filter that jointly recovers continuous latent states, discrete contact modes, and sparse port-Hamiltonian laws per regime while preventing loss of feasible transitions.
Robust Filter Attention: Self-Attention as Precision-Weighted State Estimation
cs.LG 2025-09 unverdicted novelty 7.0

Robust Filter Attention models self-attention as consistency-based state estimation under a linear SDE for token trajectories, matching standard attention complexity while showing lower perplexity and better zero-shot...
Mastering Atari with Discrete World Models
cs.LG 2020-10 accept novelty 7.0

DreamerV2 reaches human-level performance on 55 Atari games by learning behaviors inside a separately trained discrete-latent world model.
Dream to Control: Learning Behaviors by Latent Imagination
cs.LG 2019-12 accept novelty 7.0

Dreamer learns to control from images by imagining and optimizing behaviors in a learned latent world model, outperforming prior methods on 20 visual tasks in data efficiency and final performance.
Efficient Learning of Deep State Space Models via Importance Smoothing
cs.LG 2026-05 unverdicted novelty 6.0

Introduces PVMC, a parallelizable training method for deep state space models that claims state-of-the-art results and 10x faster training than prior SMC approaches.
Generative Recursive Reasoning
cs.AI 2026-05 unverdicted novelty 6.0

GRAM is a latent-variable generative model that performs recursive reasoning via stochastic trajectories, trained with amortized variational inference to support multi-hypothesis reasoning and unconditional generation.
Generative Recursive Reasoning
cs.AI 2026-05 unverdicted novelty 6.0

GRAM turns recursive latent reasoning into a generative probabilistic model via stochastic trajectories and amortized variational inference, claiming better performance on structured reasoning tasks than deterministic...
Mechanism Learning: Prototype-Anchored Mechanism Inference for Scientific Forecasting
cs.LG 2026-05 unverdicted novelty 6.0

Mechanism learning infers active local evolution rules via prototype-anchored descriptors to achieve more robust forecasting than direct state prediction on benchmarks like Burgers, WeatherBench2, and Lorenz96.
RT-Transformer: The Transformer Block as a Spherical State Estimator
cs.LG 2026-05 unverdicted novelty 6.0

Transformer components arise as the natural solution to precision-weighted directional state estimation on the hypersphere.
Coupled-NeuralHP: Directional Temporal Coupling Between AI Innovation Exposure and Public Response
cs.CY 2026-05 unverdicted novelty 6.0

Coupled-NeuralHP finds that AI patent streams forecast public response trends better than baselines in one direction while the reverse link is unsupported, with no robust 2022 regime shift detected.
pDANSE: Particle-based Data-driven Nonlinear State Estimation from Nonlinear Measurements
eess.SP 2025-10 unverdicted novelty 6.0

pDANSE enables nonlinear state estimation for model-free processes by using RNN-parameterized Gaussian priors and reparameterization-based particle sampling to compute posterior second-order statistics from nonlinear ...
Semi-Supervised Model-Free Bayesian State Estimation from Compressed Measurements
eess.SP 2024-07 unverdicted novelty 6.0

SemiDANSE uses limited labeled measurement-state pairs plus abundant unlabeled data to achieve competitive state estimation from compressed measurements in model-free chaotic dynamical systems.
Cognitive Flexibility as a Latent Structural Operator for Bayesian State Estimation
eess.SY 2026-04 unverdicted novelty 5.0

Cognitive Flexibility is a new representation-level operator for Bayesian filters that dynamically selects latent structures via predictive scores to reduce inconsistency under mismatch while preserving the recursion ...
Adaptive Learned State Estimation based on KalmanNet
cs.RO 2026-04 unverdicted novelty 5.0

AM-KNet adds sensor-specific modules, hypernetwork conditioning on target type and pose, and Joseph-form covariance estimation to KalmanNet, yielding better accuracy and stability than base KalmanNet on nuScenes and V...
CognitiveTwin: Robust Multi-Modal Digital Twins for Predicting Cognitive Decline in Alzheimer's Disease
cs.AI 2026-04 unverdicted novelty 4.0

CognitiveTwin combines Transformer multi-modal fusion and Deep Markov Models on longitudinal AD data to deliver personalized cognitive decline predictions that are fair across demographics and robust to missing data.