
arXiv: 2410.09296 · v4 · submitted 2024-10-11 · 💻 cs.CR · cs.DS · stat.AP · stat.ML

The 2020 US Decennial Census is more private than you (might) think

Pith reviewed 2026-05-23 18:55 UTC · model grok-4.3

classification 💻 cs.CR · cs.DS · stat.AP · stat.ML
keywords differential privacy · US Census · disclosure avoidance system · f-differential privacy · geographical levels · noise reduction · privacy budget · census tabulations

The pith

The 2020 Census delivers stronger privacy protections than its published guarantees at all eight geographical levels.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that the noise added to 2020 Census tabulations exceeded what was needed to meet the Bureau's announced privacy targets. By applying f-differential privacy to compose the privacy losses from queries run at the eight geographical levels, from the national level down to the block level, the authors obtain tighter bounds than the published ones. This excess privacy margin means the injected noise could have been smaller while still satisfying the nominal guarantees. The result matters because lower noise directly raises the accuracy of population counts and other statistics used for funding, redistricting, and research. The authors illustrate the practical gain by showing reduced distortion in an earnings-versus-education analysis when noise is lowered.

Core claim

The 2020 U.S. Census disclosure avoidance system achieves stronger privacy than its nominal differential privacy guarantees at each of the eight geographical levels because the composition of the private queries can be tracked more tightly with f-differential privacy than the published bounds assume. Consequently the noise variances injected into the tabulations could be reduced by 15.08 percent to 24.82 percent while maintaining nearly the same level of privacy protection for every level.
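
A toy calculation can make the link between tighter accounting and lower variance concrete. The sketch below is not the authors' accountant. It assumes a single continuous Gaussian mechanism with sensitivity 1 (the DAS uses discrete Gaussian noise over many queries), a hypothetical target (ε, δ) = (1, 10^-10), and the classic zCDP conversion ε = ρ + 2·sqrt(ρ·ln(1/δ)), which may differ from the conversion the Bureau actually used. It compares the variance each accounting route requires to certify the same target.

```python
import math

def phi(x):
    """Standard normal CDF, stable in the lower tail via erfc."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def gdp_delta(eps, mu):
    """Exact delta(eps) of a Gaussian mechanism with sensitivity/sigma = mu."""
    return phi(-eps / mu + mu / 2.0) - math.exp(eps) * phi(-eps / mu - mu / 2.0)

def sigma2_via_zcdp(eps, delta):
    """Variance needed when (eps, delta) is certified through rho-zCDP.

    Solves rho + 2*sqrt(rho*ln(1/delta)) = eps for rho, then sigma^2 = 1/(2*rho).
    """
    log_term = math.log(1.0 / delta)
    sqrt_rho = math.sqrt(log_term + eps) - math.sqrt(log_term)
    return 1.0 / (2.0 * sqrt_rho ** 2)

def sigma2_via_exact_curve(eps, delta):
    """Smallest variance whose exact Gaussian curve satisfies delta(eps) <= delta."""
    lo, hi = 1e-6, 1.0              # bracket for mu = 1/sigma
    while gdp_delta(eps, hi) <= delta:
        hi *= 2.0
    for _ in range(200):            # bisection: delta(eps) increases with mu
        mid = 0.5 * (lo + hi)
        if gdp_delta(eps, mid) <= delta:
            lo = mid
        else:
            hi = mid
    return 1.0 / lo ** 2

EPS, DELTA = 1.0, 1e-10             # hypothetical target, not a Bureau figure
s2_zcdp = sigma2_via_zcdp(EPS, DELTA)
s2_exact = sigma2_via_exact_curve(EPS, DELTA)
reduction = 1.0 - s2_exact / s2_zcdp
print(f"variance via zCDP accounting:  {s2_zcdp:.2f}")
print(f"variance via exact accounting: {s2_exact:.2f}")
print(f"relative variance reduction:   {100 * reduction:.1f}%")
```

The percentage printed here depends entirely on the toy setting and should not be compared with the 15.08 to 24.82 percent range reported for the actual Census configuration.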

What carries the argument

f-differential privacy composition of the private queries run across the eight geographical levels
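
As a rough illustration of what such composition buys, the sketch below composes a handful of Gaussian mechanisms two ways. It is a simplification under stated assumptions: the noise scales are hypothetical, each query has sensitivity 1, continuous Gaussian noise stands in for the discrete Gaussian used by the DAS, and the zCDP route uses the classic conversion ε = ρ + 2·sqrt(ρ·ln(1/δ)). In the Gaussian case, f-DP composition gives a single parameter μ = sqrt(Σ 1/σ_i²) with an exact (ε, δ) curve, while zCDP adds ρ_i = 1/(2σ_i²) and converts at the end.

```python
import math

def phi(x):
    """Standard normal CDF, stable in the lower tail via erfc."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def gdp_delta(eps, mu):
    """Exact delta(eps) for a mu-Gaussian-DP mechanism."""
    return phi(-eps / mu + mu / 2.0) - math.exp(eps) * phi(-eps / mu - mu / 2.0)

def gdp_epsilon(delta, mu):
    """Smallest eps with gdp_delta(eps, mu) <= delta, found by bisection."""
    lo, hi = 0.0, 1.0
    while gdp_delta(hi, mu) > delta:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gdp_delta(mid, mu) > delta:
            lo = mid
        else:
            hi = mid
    return hi

def zcdp_epsilon(delta, rho):
    """Classic rho-zCDP to (eps, delta) conversion."""
    return rho + 2.0 * math.sqrt(rho * math.log(1.0 / delta))

sigmas = [2.0, 3.0, 4.0, 5.0]       # hypothetical per-query noise scales
DELTA = 1e-10

mu_total = math.sqrt(sum(1.0 / s ** 2 for s in sigmas))   # f-DP composition
rho_total = sum(1.0 / (2.0 * s ** 2) for s in sigmas)     # zCDP composition

eps_fdp = gdp_epsilon(DELTA, mu_total)
eps_zcdp = zcdp_epsilon(DELTA, rho_total)
print(f"eps via f-DP (Gaussian) composition: {eps_fdp:.3f}")
print(f"eps via zCDP composition:            {eps_zcdp:.3f}")
```

Both routes describe the same mechanisms; only the bookkeeping differs, which is the sense in which the realized guarantee can be stronger than the nominal one.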

If this is right

  • Noise variances in the privatized census tabulations can be reduced by 15.08% to 24.82% at each geographical level while preserving the nominal privacy level.
  • The accuracy of published census statistics increases as a direct result of the lower noise.
  • Downstream applications that use the privatized data, such as regression studies of earnings and education, experience measurably less distortion from privacy noise.
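
The last point can be illustrated with a small simulation. This is a sketch on synthetic data, not a reproduction of the paper's earnings-and-education study: the data-generating process, the baseline noise variance, and the choice to add noise to the regressor are all assumptions made here for illustration. Only the 15.08% figure is taken from the text above. The same standard-normal draws are rescaled for both noise levels so that the comparison isolates the effect of the variance.

```python
import random

def ols_slope(xs, ys):
    """Ordinary least squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

rng = random.Random(0)
N = 5000
TRUE_SLOPE = 2.0                          # hypothetical ground truth
BASE_VAR = 25.0                           # hypothetical privacy-noise variance
REDUCED_VAR = BASE_VAR * (1.0 - 0.1508)   # smallest reduction quoted above

x = [rng.gauss(50.0, 10.0) for _ in range(N)]             # synthetic regressor
y = [TRUE_SLOPE * xi + rng.gauss(0.0, 1.0) for xi in x]   # synthetic response
z = [rng.gauss(0.0, 1.0) for _ in range(N)]               # shared noise draws

def slope_error(noise_var):
    """Absolute error of the slope when the regressor carries privacy noise."""
    sd = noise_var ** 0.5
    x_private = [xi + sd * zi for xi, zi in zip(x, z)]
    return abs(ols_slope(x_private, y) - TRUE_SLOPE)

err_base = slope_error(BASE_VAR)
err_reduced = slope_error(REDUCED_VAR)
print(f"slope error with baseline noise: {err_base:.4f}")
print(f"slope error with reduced noise:  {err_reduced:.4f}")
```

Noise in a regressor pulls the estimated slope toward zero, so a smaller variance leaves the estimate closer to the true coefficient in this setup.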

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Census designs for future decades could allocate their total privacy budget more efficiently by adopting the same tight composition analysis at the design stage.
  • Other statistical agencies that release multi-level geographic data under differential privacy may be over-injecting noise for the same reason.
  • Policymakers who rely on census counts for apportionment or funding formulas would receive higher-fidelity inputs if the identified noise reductions were applied.

Load-bearing premise

The composition of the private queries across the eight geographical levels can be tracked precisely with f-differential privacy without unaccounted dependencies or extra privacy costs arising from the specific structure of the Census Bureau's disclosure avoidance system.

What would settle it

A re-computation of the total privacy loss using the exact sequence of queries and the f-DP accountant that shows the realized privacy parameter is no stronger than the nominal published value would falsify the central claim.

Figures

Figures reproduced from arXiv: 2410.09296 by Buxin Su, Chendi Wang, Weijie J. Su.

Figure 1
Figure 1. Overview of the disclosure avoidance system for the 2020 Census Demographic and Housing Characteristics File (16, 17). The omitted geographical levels are tract subset group, tract subset, optimized block group, and population estimates primitive geography (PEPG).
Figure 2
Figure 2. Comparison of (ϵ, δ)-curves between our method (blue) and the Census Bureau’s accounting method (red) across eight geographical levels of the 2020 DHC. The noise configuration follows the privacy-loss budget allocation released by the Bureau on August 25, 2022 (33), as detailed in …
Figure 3
Figure 3. Comparison of (ϵ, δ)-curves from …
Figure 4
Figure 4. Comparison of trade-off functions (24) between our method (blue and black) and the Census Bureau’s accounting method (red) across eight geographical levels of the 2020 DHC. The blue (Ours with Same Noise) and red (Bureau’s) curves correspond to the same noise levels as in …
Figure 8
Figure 8 clearly indicates that our proposed allocation method consistently achieves lower MSE than the Bureau’s official implementation. Moreover, the MSE reduction is notably more substantial for blocks with larger populations, as demonstrated in the first three figures shown in …
Figure 7
Figure 7. (ϵ, δ(ϵ))-curves under composition of all eight geographical levels of the 2020 U.S. Census. The black curve uses a variance proxy that is reduced by 8.59%. The comparison in terms of trade-off functions is shown in …
Figure 9
Figure 9. Reduced distortion in estimating the slope coefficient due to privacy constraints for downstream analysis of private census data, measured in terms of MAE (Eq. (3.1)). Noise variances follow the configuration specified in …
Figure 10
Figure 10. Comparisons of pmf and approximation with σ² = 5 (left) and σ² = 25 (right).
Figure 11
Figure 11. Comparisons with zCDP using American Community Survey 5-year data. A smaller (ϵ, δ)-curve means the privatized dataset is more private.
Figure 12
Figure 12. Trade-off functions for all geographical levels of the 2020 U.S. Census, under the same noise level or after reducing the variance proxy by 8.59% in our method. Su, Su, and Wang et al. PNAS | September 25, 2025 | vol. XXX | no. XX | 29
Figure 13
Figure 13. Zoomed-in trade-off functions for County of the 2020 U.S. Census, under the same noise level or after reducing the variance proxy by 8.59% in our method.
Figure 14
Figure 14. Improvement in ϵ (x-axis) for each geographical level of the 2020 U.S. Census, using the f-DP-based accounting method under the same setting as …
Figure 15
Figure 15. Percentage of improvement in ϵ by using the f-DP-based accounting method for each geographical level of the 2020 U.S. Census, under the same setting as …
Original abstract

The U.S. Decennial Census serves as the foundation for many high-profile policy decision-making processes, including federal funding allocation and redistricting. In 2020, the Census Bureau adopted differential privacy to protect the confidentiality of individual responses through a disclosure avoidance system that injects noise into census data tabulations. The Bureau subsequently posed an open question: Could stronger privacy guarantees be obtained for the 2020 U.S. Census compared to their published guarantees, or equivalently, had the privacy budgets been fully utilized? In this paper, we address this question affirmatively by demonstrating that the 2020 U.S. Census provides significantly stronger privacy protections than its nominal guarantees suggest at each of the eight geographical levels, from the national level down to the block level. This finding is enabled by our precise tracking of privacy losses using $f$-differential privacy, applied to the composition of private queries across these geographical levels. Our analysis reveals that the Census Bureau introduced unnecessarily high levels of noise to meet the specified privacy guarantees for the 2020 Census. Consequently, we show that noise variances could be reduced by $15.08\%$ to $24.82\%$ while maintaining nearly the same level of privacy protection for each geographical level, thereby improving the accuracy of privatized census statistics. We empirically demonstrate that reducing noise injection into census statistics mitigates distortion caused by privacy constraints in downstream applications of private census data, illustrated through a study examining the relationship between earnings and education.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims that tracking the composition of private queries in the 2020 Census Disclosure Avoidance System with f-differential privacy, across eight geographical levels (national to block), shows that the actual privacy loss is lower than the nominal guarantees. According to the authors, this allows noise variance reductions of 15.08% to 24.82% at each level while preserving nearly equivalent privacy. An empirical demonstration indicates that lower noise improves a downstream earnings-education regression.

Significance. If the f-DP accounting is shown to be tight for the deployed top-down DAS, the result would indicate that the Census Bureau could have released more accurate tabulations without exceeding its privacy budget, with measurable utility gains in policy-relevant analyses. The work directly addresses an open question posed by the Census Bureau and supplies concrete percentage reductions and a downstream case study.

major comments (2)
  1. [f-DP composition analysis (likely §4 or §5)] The central claim rests on the assertion that standard f-DP composition across the eight levels exactly bounds the privacy loss of the deployed DAS. However, the manuscript does not provide a formal argument or theorem showing that the top-down algorithm's consistency constraints, shared randomness, and post-processing steps introduce no additional privacy loss beyond the independent composition of the per-level mechanisms. If such terms exist, the reported 15-24% noise reduction would exceed the nominal budget.
  2. [Empirical evaluation section] The empirical demonstration in the earnings-education study uses the reduced-noise data but does not report the exact privacy parameters (f-DP curves or effective ε) achieved after the proposed variance reduction at each geographic level, making it impossible to verify that the privacy guarantee remains 'nearly the same' as the nominal one.
minor comments (2)
  1. Notation for the eight geographical levels and the mapping from queries to levels should be introduced earlier and used consistently.
  2. The abstract states precise tracking was performed but the main text should include a table or appendix listing the per-level query sets and their individual f-DP parameters before composition.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our f-DP analysis of the 2020 Census DAS. We address each major point below and will revise the manuscript accordingly to strengthen the formal justification and empirical reporting.

Point-by-point responses
  1. Referee: [f-DP composition analysis (likely §4 or §5)] The central claim rests on the assertion that standard f-DP composition across the eight levels exactly bounds the privacy loss of the deployed DAS. However, the manuscript does not provide a formal argument or theorem showing that the top-down algorithm's consistency constraints, shared randomness, and post-processing steps introduce no additional privacy loss beyond the independent composition of the per-level mechanisms. If such terms exist, the reported 15-24% noise reduction would exceed the nominal budget.

    Authors: The DAS applies independent Gaussian mechanisms at each geographic level whose privacy losses are tracked via f-DP composition; consistency constraints are enforced by post-processing of the noisy counts, which cannot increase privacy loss by the post-processing property. Shared randomness across levels is already folded into the per-level f-DP parameters before composition. We will add a short lemma (with proof) in §4 establishing that these implementation details introduce no extra loss beyond the composed mechanisms, thereby confirming the reported variance reductions remain within the nominal budget. revision: yes

  2. Referee: [Empirical evaluation section] The empirical demonstration in the earnings-education study uses the reduced-noise data but does not report the exact privacy parameters (f-DP curves or effective ε) achieved after the proposed variance reduction at each geographic level, making it impossible to verify that the privacy guarantee remains 'nearly the same' as the nominal one.

    Authors: We agree that explicit verification of the post-reduction privacy parameters is necessary. In the revised manuscript we will add a table in the empirical section listing the f-DP curves (or equivalent effective ε at δ=10^{-10}) at each of the eight levels both before and after the 15.08–24.82% variance reductions, confirming that the new parameters remain at or below the nominal guarantees. revision: yes

Circularity Check

0 steps flagged

No circularity: f-DP composition applied to external query structure

Full rationale

The paper computes composed f-DP privacy loss from the Census Bureau's published query structure and noise scales across eight geographic levels, then compares the resulting effective privacy to the nominal published budgets. This uses standard external composition theorems for f-DP; no parameter is fitted to the target privacy loss, no self-citation supplies a uniqueness result, and no equation defines the output privacy loss in terms of itself. The 15-24% noise reduction claim follows directly from the computed gap between nominal and tracked loss. The derivation is therefore self-contained against external benchmarks and receives score 0.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The analysis rests on standard f-differential privacy composition rules applied to the Census Bureau's existing query structure; no new free parameters, ad-hoc axioms, or invented entities are introduced in the abstract.

axioms (1)
  • domain assumption: f-differential privacy composition rules apply directly to the sequence of queries released by the 2020 Census disclosure avoidance system
    Invoked to obtain the precise privacy-loss tracking across geographic levels
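
For readers unfamiliar with the formalism, the Gaussian special case shows what a composition rule looks like in f-DP. The sketch below uses the Gaussian trade-off function G_μ(α) = Φ(Φ⁻¹(1 − α) − μ), which maps a type I error α to the smallest achievable type II error, together with the rule that composing G_μ1, …, G_μk gives G_μ with μ = sqrt(Σ μ_i²). The eight per-level μ values are hypothetical placeholders, and the continuous Gaussian is only a stand-in for the discrete Gaussian noise used in the DAS.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def gaussian_tradeoff(alpha, mu):
    """Type II error of the optimal test at type I error alpha under mu-GDP."""
    return _STD_NORMAL.cdf(_STD_NORMAL.inv_cdf(1.0 - alpha) - mu)

def compose_mus(mus):
    """Gaussian f-DP composition: the mu parameters add in quadrature."""
    return math.sqrt(sum(m * m for m in mus))

# Hypothetical per-level parameters for eight geographical levels.
level_mus = [0.30, 0.25, 0.20, 0.20, 0.15, 0.15, 0.10, 0.10]
mu_all = compose_mus(level_mus)

for alpha in (0.01, 0.05, 0.10):
    beta = gaussian_tradeoff(alpha, mu_all)
    print(f"type I error {alpha:.2f} -> minimal type II error {beta:.4f}")
```

A curve closer to the line β = 1 − α means an attacker gains less from the release, which is how the trade-off figures compare the two accounting methods.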

pith-pipeline@v0.9.0 · 5813 in / 1219 out tokens · 28594 ms · 2026-05-23T18:55:22.825863+00:00 · methodology


Reference graph

Works this paper leans on

69 extracted references · 69 canonical work pages · 3 internal anchors

  1. [1]

    MSE between the non-privatized 2010 Census Summary Files and the simulated privacy-protected Summary Files after non-negative postprocessing, across nine racial query; 2) the relationship between earnings and education level using the 2020 ACS 5-year estimates (27, 45). D.1. Impact on racial group counts. We analyze the MSE between the non-privatized 2010 ...

  2. [2]

    Discussion. In this paper, we have analyzed the privacy guarantees of the 2020 U.S. Census in comparison with the privacy levels published by the Census Bureau. Our analysis demonstrates that the actual privacy guarantee is significantly stronger than that provided by the Bureau’s existing approach, as evidenced by our uniformly smaller ϵ value for any δ. Thi...

  3. [3]

    Hotchkiss M, Phelan J (2017) Uses of Census Bureau data in federal funds distribution: A new design for the 21st century. (United States Census Bureau)

  4. [4]

    US Census Bureau (2023) Census bureau data guide more than $2.8 trillion in federal funding in fiscal year 2021

  5. [5]

    Kenny CT, et al. (2023) Comment: The Essential Role of Policy Evaluation for the 2020 Census Disclosure Avoidance System. Harvard Data Science Review (Special Issue 2)

  6. [6]

    Cohen A, Duchin M, Matthews J, Suwal B (2021) Census TopDown: The impacts of differential privacy on redistricting in 2nd Symposium on Foundations of Responsible Computing, FORC 2021, June 9-11, 2021, Virtual Conference, LIPIcs. (Schloss Dagstuhl - Leibniz-Zentrum für Informatik), Vol. 192, pp. 5:1–5:22

  7. [7]

    Autor DH, Duggan MG (2003) The rise in the disability rolls and the decline in unemployment. The Quarterly Journal of Economics 118(1):157–206

  8. [8]

    US Census Bureau (2021) Guidance for labor force statistics data users (https://www.census.gov/topics/employment/labor-force/guidance.html)

  9. [9]

    Eckman SJ (2021) Apportionment and redistricting following the 2020 census (https://sgp.fas.org/crs/misc/IN11360.pdf)

  10. [10]

    US Census Bureau (2021) 2020 census apportionment results (https://www.census.gov/data/tables/2020/dec/2020-apportionment-data.html)

  11. [11]

    Duncan G, Lambert D (1989) The risk of disclosure for microdata. Journal of Business & Economic Statistics 7(2):207–217

  12. [12]

    Dick T, et al. (2023) Confidence-ranked reconstruction of census microdata from published statistics. Proceedings of the National Academy of Sciences 120(8):e2218605120

  13. [13]

    Hawes M (2022) Reconstruction and re-identification of the demographic and housing characteristics file (dhc)

  14. [14]

    Abowd JM (2019) Staring down the database reconstruction theorem

  15. [15]

    Abowd J (2021) Declaration of John Abowd in case no. 3: 21-cv-00211-rah-ecm-kcn, the state of alabama v. united states department of commerce

  16. [16]

    Dwork C, McSherry F, Nissim K, Smith A (2006) Calibrating noise to sensitivity in private data analysis. Theory Of Cryptography, Proceedings 3876:265–284

  17. [17]

    Dwork C, Kenthapadi K, McSherry F, Mironov I, Naor M (2006) Our data, ourselves: Privacy via distributed noise generation in Advances in Cryptology-EUROCRYPT 2006: 24th Annual International Conference on the Theory and Applications of Cryptographic Techniques, St. Petersburg, Russia, May 28-June 1, 2006. Proceedings 25. (Springer), pp. 486–503

  18. [18]

    Abowd JM, et al. (2022) The 2020 census disclosure avoidance system TopDown algorithm. Harvard Data Science Review (Special Issue 2)

  19. [19]

    Abowd JM, et al. (2022) Invited lecture: The u.s. census bureau adopts differential privacy in KDD ’18: Proceedings of the 24th ACM SIGKDD international conference on knowledge discovery & data mining

  20. [20]

    Phillips v. US Census Bureau (2023) Phillips v. U.S. Census Bureau (https://thearp.org/litigation/phillips-v-us-census-bureau/)

  21. [21]

    Kenny CT, et al. (2021) The use of differential privacy for census data and its impact on redistricting: The case of the 2020 U.S. census. Science Advances 7(41):eabk3283

  22. [22]

    Kenny CT, McCartan C, Simko T, Imai K (2024) Census officials must constructively engage with independent evaluations. Proceedings of the National Academy of Sciences 121(11):e2321196121

  23. [23]

    Anderson MJ (2015) The American census: A social history. (Yale University Press)

  24. [24]

    Boyd D, Sarathy J (2022) Differential Perspectives: Epistemic Disconnects Surrounding the U.S. Census Bureau’s Use of Differential Privacy. Harvard Data Science Review (Special Issue 2)

  25. [25]

    Kifer D, et al. (2022) Bayesian and frequentist semantics for common variations of differential privacy: Applications to the 2020 census. arXiv preprint arXiv:2209.03310

  26. [26]

    Dong J, Roth A, Su WJ (2022) Gaussian differential privacy. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 84(1):3–37

  27. [27]

    Balle B, Barthe G, Gaboardi M (2020) Privacy profiles and amplification by subsampling. J. Priv. Confidentiality 10(1)

  28. [28]

    US Census Bureau (2022) Privacy-protected 2010 census demonstration data | ipums nhgis (https://www.nhgis.org/privacy-protected-2010-census-demonstration-data#v20220825-files)

  29. [29]

    US Census Bureau (2020) Educational attainment (U.S. Census Bureau)

  30. [30]

    Canonne CL, Kamath G, Steinke T (2020) The discrete Gaussian for differential privacy in Advances in Neural Information Processing Systems. (Curran Associates, Inc.), Vol. 33, pp. 15676–15688

  31. [31]

    Dwork C, Rothblum GN (2016) Concentrated differential privacy. arXiv preprint arXiv:1603.01887

  32. [32]

    Bun M, Steinke T (2016) Concentrated differential privacy: Simplifications, extensions, and lower bounds in Theory of Cryptography Conference. (Springer), pp. 635–658

  33. [33]

    Bun M, Dwork C, Rothblum GN, Steinke T (2018) Composable and versatile privacy via truncated CDP in Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing. pp. 74–86

  34. [34]

    Micciancio D, Regev O (2007) Worst-case to average-case reductions based on Gaussian measures. SIAM Journal on Computing 37(1):267–302

  35. [35]

    US Census Bureau (2022) Privacy-loss budget allocation 2022-08-25 (https://www2.census.gov/programs-surveys/decennial/2020/program-management/data-product-planning/2010-demonstration-data-products/02-Demographic_and_Housing_Characteristics/2022-08-25_Summary_File/2022-08-25_Privacy-Loss_Budget_Allocations.pdf)

  36. [36]

    McSherry F (2010) Privacy integrated queries: an extensible platform for privacy-preserving data analysis. Commun. ACM 53(9):89–97

  37. [37]

    Smith J, et al. (2022) Making the most of parallel composition in differential privacy. Proc. Priv. Enhancing Technol. 2022(1):253–273

  38. [38]

    Kairouz P, Oh S, Viswanath P (2017) The composition theorem for differential privacy. IEEE Trans. Inf. Theory 63(6):4037–4049

  39. [39]

    Mironov I (2017) Rényi differential privacy in 2017 IEEE 30th computer security foundations symposium (CSF). (IEEE), pp. 263–275

  40. [40]

    Bu Z, Dong J, Long Q, Su WJ (2020) Deep learning with Gaussian differential privacy. Harvard Data Science Review 2020(23):10–1162

  41. [41]

    Wang C, Su B, Ye J, Shokri R, Su WJ (2024) Unified enhancement of privacy bounds for mixture mechanisms via f-differential privacy. Advances in Neural Information Processing Systems 36

  42. [42]

    Su WJ (2024) A statistical viewpoint on differential privacy: Hypothesis testing, representation and Blackwell’s theorem. arXiv preprint arXiv:2409.09558

  43. [43]

    US Census Bureau (2020) Selected social characteristics in the united states (U.S. Census Bureau). Accessed on 4 October 2024

  44. [44]

    US Census Bureau (2020) Selected economic characteristics (U.S. Census Bureau). Accessed on 4 October 2024

  45. [45]

    US Census Bureau (2020) Selected housing characteristics (U.S. Census Bureau). Accessed on 4 October 2024

  46. [46]

    US Census Bureau (2020) Acs demographic and housing estimates (U.S. Census Bureau). Accessed on 4 October 2024

  47. [47]

    Muller A (2002) Education, income inequality, and mortality: a multiple regression analysis. BMJ 324(7328):23

  48. [48]

    Kenny CT, McCartan C, Kuriwaki S, Simko T, Imai K (2024) Evaluating bias and noise induced by the u.s. census bureau’s privacy protection methods. Science Advances 10(18):eadl2524

  49. [49]

    Cumings-Menon R (2024) Full-information estimation for hierarchical data. arXiv preprint arXiv:2404.13164

  50. [50]

    Drechsler J, Globus-Harris I, Mcmillan A, Sarathy J, Smith A (2022) Nonparametric differentially private confidence intervals for the median. Journal of Survey Statistics and Methodology 10(3):804–829

  51. [51]

    Awan J, Edwards A, Bartholomew P, Sillers A (2024) Best linear unbiased estimate from privatized histograms. arXiv preprint arXiv:2409.04387

  52. [52]

    Sullivan TA (2020) Coming to Our Census: How Social Statistics Underpin Our Democracy (and Republic). Harvard Data Science Review 2(1)

  53. [53]

    Ansolabehere S, Snyder J (2008) The End of Inequality: One Person, One Vote and the Transformation of American Politics, Issues in American democracy. (Norton)

  54. [54]

    Cumings-Menon R, et al. (2024) Geographic spines in the 2020 census disclosure avoidance system. Journal of Privacy and Confidentiality 14(3)

  55. [55]

    US Census Bureau (2023) DAS-implementation-details (U.S. Census Bureau)

  56. [56]

    Kairouz P, Liu Z, Steinke T (2021) The distributed discrete gaussian mechanism for federated learning with secure aggregation in Proceedings of the 38th International Conference on Machine Learning, ICML 2021, 18-24 July 2021, Virtual Event, Proceedings of Machine Learning Research. (PMLR), Vol. 139, pp. 5201–5212

  57. [57]

    Zhu Y, Dong J, Wang YX (2022) Optimal accounting of differential privacy via characteristic function in International Conference on Artificial Intelligence and Statistics. (PMLR), pp. 4782–4817

  58. [58]

    Koskela A, Jälkö J, Honkela A (2020) Computing tight differential privacy guarantees using FFT in The 23rd International Conference on Artificial Intelligence and Statistics, AISTATS 2020, 26-28 August 2020, Online [Palermo, Sicily, Italy], Proceedings of Machine Learning Research. (PMLR), Vol. 108, pp. 2560–2569

  59. [59]

    Gopi S, Lee YT, Wutschitz L (2021) Numerical composition of differential privacy in Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, NeurIPS 2021, December 6-14, 2021, virtual. pp. 11631–11642

  60. [60]

    Balle B, Barthe G, Gaboardi M, Hsu J, Sato T (2020) Hypothesis testing interpretations and renyi differential privacy in The 23rd International Conference on Artificial Intelligence and Statistics, AISTATS 2020, 26-28 August 2020, Online [Palermo, Sicily, Italy], Proceedings of Machine Learning Research, eds. Chiappa S, Calandra R. (PMLR), Vol. 108, pp. 2496–2506

  61. [61]

    Lehmann EL, Romano JP (2005) Testing statistical hypotheses, Springer Texts in Statistics. (Springer, New York), Third edition, pp. xiv+784

  62. [62]

    Wang H, Gao S, Zhang H, Shen M, Su WJ (2022) Analytical composition of differential privacy via the Edgeworth accountant. arXiv preprint arXiv:2206.04236

  63. [63]

    Genise N, Micciancio D, Peikert C, Walter M (2020) Improved discrete gaussian and subgaussian analysis for lattice cryptography in Public-Key Cryptography - PKC 2020 - 23rd IACR International Conference on Practice and Theory of Public-Key Cryptography, Edinburgh, UK, May 4-7, 2020, Proceedings, Part I, Lecture Notes in Computer Science, eds. Kiayias A, ...

  64. [64]

    Durrett R (2019) Probability: theory and examples. (Cambridge university press) Vol. 49

  65. [65]

    Sablonnière P, Sbibih D, Tahrichi M (2010) Error estimate and extrapolation of a quadrature formula derived from a quartic spline quasi-interpolant. BIT 50(4):843–862. A. Technical proofs and details. This section presents our main methodology and key tools for deriving the privacy profil...

  66. [66]

    by treating the binary categories as the 2-fold composition of two counting queries. According to (28), the discrete Gaussian mechanism is ρ-zCDP if we take $\sigma^2 = \Delta_M^2/(2\rho)$. zCDP is currently adopted by the Bureau to count the privacy budget of the 2020 Census. A better zCDP guarantee for the discrete Gaussian is also investigated by (54). The Bureau obtained ...

  67. [67]

    The results presented here are used to derive the trade-off functions shown in Figure 4 and Figure 12. The technical details of this section are similar to those of Section F. The trade-off function is uniquely determined by the following parametric equation: $\alpha(\zeta) = \mathbb{P}_{X_i \sim \mathcal{N}_{\mathbb{Z}}(0,\sigma_i^2)}\Big(\sum_{i=1}^{k} \frac{1}{\sigma_i^2} \sum_{j=1}^{n_i} X_{ij} > \zeta\Big) + c \cdot \mathbb{P}_{X_i \sim \mathcal{N}_{\mathbb{Z}}(0,\sigma_i^2)}\Big(\sum_{i=1}^{k} \frac{1}{\sigma_i^2} \sum_{j=1}^{n_i} X_{ij} = \ldots$

  68. [68]

    $+\,\nu(\Lambda_2^c) + \Big|\mathbb{P}_{X_{ij} \sim \mathcal{N}_{\mathbb{Z}}(0,\sigma_i^2)}\Big[\sum_{i=1}^{k} a_i \sum_{j=1}^{n_i} X_{ij} > t_\epsilon, \Lambda_1\Big] - \nu\Big(\sum_{i=1}^{k} a_i \bar{X}_i \ge t_\epsilon, \Lambda_2\Big)\Big| = \Omega_6 + \Omega_7 + \Omega_8$. Upper bound on $\Omega_6$. We have $\mathbb{P}_{X_{ij} \sim \mathcal{N}_{\mathbb{Z}}(0,\sigma_i^2)}[\Lambda_1^c] \le \sum_{i=1}^{k} \mathbb{P}_{X_{ij} \sim \mathcal{N}_{\mathbb{Z}}(0,\sigma_i^2)}\Big[\Big|\frac{1}{\sqrt{n_i}} \sum_{j=1}^{n_i} X_{ij}\Big| > 12 \cdot \sigma_i\Big]$. According to Eq. (A.2), $\sum_{j=1}^{n_i} X_{ij}$ is sub-Gaussian with variance proxy $\sqrt{n_i}\,\sigma_i^2$. As a result, it holds $\mathbb{P}_{X_{ij} \sim \mathcal{N}_{\mathbb{Z}}(0,\sigma_i^2)}\Big[\Big|\ldots$

  69. [69]

    Therefore, the error bound is given by $\Big|\int_0^{1/100} F(t)\,dt - \sum_{l=1}^{(N-1)/4} \frac{2h}{45} \times \big(7F(x_{4l-4}) + 32F(x_{4l-3}) + 12F(x_{4l-2}) + 32F(x_{4l-1}) + 7F(x_{4l})\big)\Big| \le \frac{2}{945} \times 1.2 \times 10^{35} \times \frac{1}{100} \times \Big(\frac{1}{100 \times 10^{7}}\Big)^{6} < 2.54 \times 10^{-24}$. E. Supplementary figures. [Plot: probability density against i/Bn; legend: Compositions of DGM, Our Appr...]