Policy Choice in Time Series by Empirical Welfare Maximization
This paper develops a novel method for policy choice in a dynamic setting where the available data is a multivariate time series. To overcome challenges unique to the time-series setting, such as time-varying environments, history-dependent welfare, dynamic causal effects, and statistical dependence, we propose Time-series Empirical Welfare Maximization (T-EWM) methods. We characterize conditions under which T-EWM consistently learns optimal policies, conditional or unconditional on the time-series history, and derive nonasymptotic upper bounds on the welfare regret. We illustrate T-EWM with an application to optimal restriction rules against Covid-19.
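The general idea of empirical welfare maximization can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes a binary treatment, a known propensity score, an inverse-propensity-weighted welfare estimate, and a hypothetical class of threshold policies that treat whenever the lagged state exceeds a cutoff. The function names, the AR(1) simulation, and all parameter values are illustrative assumptions.

```python
import numpy as np


def empirical_welfare(policy, y, w, x_lag, propensity):
    """Inverse-propensity-weighted estimate of average welfare under `policy`.

    policy     : maps the lagged state to a 0/1 treatment decision (hypothetical)
    y          : observed outcomes y_t
    w          : observed binary treatments w_t
    x_lag      : lagged state x_{t-1} that the policy conditions on
    propensity : P(w_t = 1 | history), assumed known in this sketch
    """
    g = policy(x_lag)
    weights = g * w / propensity + (1 - g) * (1 - w) / (1 - propensity)
    return float(np.mean(y * weights))


def ewm_threshold(y, w, x_lag, propensity, thresholds):
    """Grid search for the threshold rule with the highest empirical welfare."""
    best_c, best_val = None, -np.inf
    for c in thresholds:
        val = empirical_welfare(
            lambda x, c=c: (x >= c).astype(float), y, w, x_lag, propensity
        )
        if val > best_val:
            best_c, best_val = float(c), val
    return best_c, best_val


# Simulated time series (assumption): AR(1) state, randomized treatment,
# and a treatment effect equal to the lagged state, so treating is
# beneficial exactly when x_{t-1} > 0.
rng = np.random.default_rng(0)
T = 2000
x = np.zeros(T + 1)
for t in range(1, T + 1):
    x[t] = 0.5 * x[t - 1] + rng.normal()
x_lag = x[:-1]
propensity = np.full(T, 0.5)
w = rng.binomial(1, propensity)
y = w * x_lag + rng.normal(scale=0.1, size=T)

thresholds = np.linspace(-2.0, 2.0, 41)
c_hat, welfare_hat = ewm_threshold(y, w, x_lag, propensity, thresholds)
print(f"estimated threshold: {c_hat:.2f}, empirical welfare: {welfare_hat:.3f}")
```

In this simulation the welfare-maximizing cutoff is zero by construction, so the estimated threshold should land near zero. The sketch ignores the issues the abstract highlights, such as time-varying environments and history-dependent welfare, which is where the paper's conditions and regret bounds come in.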
Forward citations
Cited by 1 Pith paper
- Dynamic Treatment on Networks
Q-Ising integrates Bayesian dynamic Ising modeling with offline RL to enable adaptive network treatment policies that outperform static centrality benchmarks under spillovers.