arxiv: 1602.04292 · v1 · pith:M7UEVBMZnew · submitted 2016-02-13 · ⚛️ physics.data-an

Identifying Excessively Rounded or Truncated Data

keywords: data, analysis, digitization, optimal, statistical, structure, truncated, truncation

All data are digitized, and hence are essentially integers rather than true real numbers. Ordinarily this causes no difficulties since the truncation or rounding usually occurs below the noise level. However, in some instances, when the instruments or data delivery and storage systems are designed with less than optimal regard for the data or the subsequent data analysis, the effects of digitization may be comparable to important features contained within the data. In these cases, information has been irrevocably lost in the truncation process. While there exist techniques for dealing with truncated data, we propose a straightforward method that will allow us to detect this problem before the data analysis stage. It is based on an optimal histogram binning algorithm that can identify when the statistical structure of the digitization is on the order of the statistical structure of the data set itself.
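
The abstract does not spell out the binning algorithm, so the following is only a rough sketch of how such a check might look, not the authors' implementation. It assumes a Bayesian equal-width histogram rule of the kind published by Knuth, in which the relative log posterior for M bins is N log M + log Γ(M/2) − M log Γ(1/2) − log Γ(N + M/2) + Σ log Γ(n_k + 1/2). For well-resolved data this quantity peaks at a moderate M; for coarsely rounded data it tends to keep rising as M grows, because finer bins simply isolate the discrete values. The function names, the `max_bins` search limit, and the `edge_fraction` threshold are hypothetical choices made for illustration.

```python
import math
import random


def log_posterior(data, m):
    """Relative log posterior for m equal-width bins (assumed Knuth-style rule)."""
    n = len(data)
    lo, hi = min(data), max(data)
    if hi == lo:
        raise ValueError("data must span a nonzero range")
    counts = [0] * m
    for x in data:
        # Map x to a bin index; the maximum value is clamped into the last bin.
        k = min(int((x - lo) / (hi - lo) * m), m - 1)
        counts[k] += 1
    return (n * math.log(m)
            + math.lgamma(m / 2)
            - m * math.lgamma(0.5)
            - math.lgamma(n + m / 2)
            + sum(math.lgamma(c + 0.5) for c in counts))


def best_bin_count(data, max_bins=200):
    """Bin count in 1..max_bins with the highest relative log posterior."""
    return max(range(1, max_bins + 1), key=lambda m: log_posterior(data, m))


def looks_over_rounded(data, max_bins=200, edge_fraction=0.9):
    """Heuristic flag (an assumption of this sketch): the best bin count sits
    at or near the search limit, i.e. the posterior never settled on a peak."""
    return best_bin_count(data, max_bins) >= edge_fraction * max_bins


rng = random.Random(0)
smooth = [rng.gauss(0.0, 1.0) for _ in range(1000)]  # full float precision
coarse = [round(x * 2) / 2 for x in smooth]          # rounded to steps of 0.5

print("smooth flagged:", looks_over_rounded(smooth))
print("coarse flagged:", looks_over_rounded(coarse))
```

The search limit matters: if `max_bins` is too small to separate the discrete values, the rising tail is not visible, so in practice it would need to be chosen well above the number of distinct values expected in the data.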
