Discretizing Logged Interaction Data Biases Learning for Decision-Making

Peter Schulam, Suchi Saria

classification stat.ML cs.AI cs.LG cs.SY

keywords data, bias, decision-making, discretization, intervals, models, series, time

Time series data that are not measured at regular intervals are commonly discretized as a preprocessing step. For example, data about customer arrival times might be simplified by summing the number of arrivals within hourly intervals, which produces a discrete-time time series that is easier to model. In this abstract, we show that discretization introduces a bias that affects models trained for decision-making. We refer to this phenomenon as discretization bias, and show that we can avoid it by using continuous-time models instead.
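The hourly-count preprocessing step described in the abstract can be sketched as follows. This is a minimal illustration only, not the authors' code: the function name `hourly_counts`, the sample arrival times, and the choice of hours as the time unit are all assumptions made for the example.

```python
import math
from collections import Counter


def hourly_counts(arrival_times_hours, n_hours):
    """Count arrivals in each interval [k, k+1) for k = 0 .. n_hours - 1.

    Hypothetical helper: times are in hours since the start of the
    observation window; times outside [0, n_hours) are ignored.
    """
    counts = Counter(
        math.floor(t) for t in arrival_times_hours if 0 <= t < n_hours
    )
    return [counts.get(k, 0) for k in range(n_hours)]


# Irregularly spaced arrival times (made-up values, in hours).
arrivals = [0.10, 0.95, 1.05, 3.50, 3.99]

# Discretize into four hourly bins.
counts = hourly_counts(arrivals, 4)
print(counts)
```

Note what the binned series no longer records: the arrivals at 0.95 and 1.05 are only six minutes apart, yet they land in different bins, while the arrivals at 0.10 and 0.95 are counted together. The sketch shows only this loss of within-interval timing; it does not reproduce the bias analysis that the abstract refers to.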

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

The hidden risks of temporal resampling in clinical reinforcement learning
cs.LG 2026-02 conditional novelty 6.0

Resampling clinical time series into uniform bins for offline RL reduces performance by up to 60% and causes retrospective evaluations to overestimate returns by 1.5-3x versus unprocessed data.