pith. sign in

arxiv: 2604.24975 · v1 · submitted 2026-04-27 · 💻 cs.CR · cs.DB

Poisoning Learned Index Structures: Static and Dynamic Adversarial Attacks on ALEX

Pith reviewed 2026-05-08 02:28 UTC · model grok-4.3

classification 💻 cs.CR cs.DB
keywords learned indexesadversarial attacksdata poisoningalgorithmic complexity attacksindex robustnessdynamic attacksALEX
0
0 comments X

The pith

Dynamic attacks slow ALEX learned index lookups by up to 2.8 times while static poisoning causes little change.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper compares two ways an adversary might harm a learned index like ALEX, which models key distributions for fast lookups. Static poisoning inserts bad data all at once, while dynamic attacks insert keys gradually to force the index into inefficient states. Experiments on multiple datasets show that only the dynamic attacks produce large, lasting slowdowns in lookup speed. This distinction matters because learned indexes are promoted for high performance in databases, and knowing which attacks actually work helps decide where to add protections.

Core claim

Under bulk-loaded settings and realistic workloads up to 200K inserts, static poisoning leaves ALEX lookup performance nearly unchanged, but dynamic algorithmic complexity attacks reduce lookup throughput by a factor of 2 to 2.8. The damage is highly sensitive to key distribution: dense sets limit attack success because of duplicate insertions and ALEX's localized node structure. The study also notes that global performance numbers can hide localized structural harm, so control workloads are required to isolate real effects.

What carries the argument

The explicit comparison of static data poisoning against dynamic algorithmic complexity attacks (ACA) on the ALEX index structure, measured across SOSD datasets and a real-key baseline.

If this is right

  • Learned index performance under attack depends on the interaction of threat model, data distribution, and how performance is measured.
  • Localized node damage inside the index does not always translate into measurable global query slowdown.
  • Control workloads without adversarial keys are necessary to separate attack effects from normal variance.
  • Dense key distributions reduce an adversary's leverage because of duplicate handling and ALEX's node-local updates.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Defenses for learned indexes should focus first on resisting gradual, online insertions rather than one-time data poisoning.
  • Other learned indexes with similar node-splitting rules may show the same static-versus-dynamic gap.
  • Evaluating attacks on a wider range of real-world key streams, beyond the SOSD collection, would test whether the 2-2.8x slowdown bound generalizes.

Load-bearing premise

The selected SOSD datasets, real-key baseline, attack code, and bulk-loaded insertion model are representative enough of real workloads that the observed difference between threat models will hold more generally.

What would settle it

A replication that inserts the same adversarial keys dynamically rather than in bulk and finds either no slowdown from dynamic ACA or comparable slowdown from static poisoning.

Figures

Figures reproduced from arXiv: 2604.24975 by Allen Jue.

Figure 1
Figure 1. Figure 1: Lookup slowdown under static poisoning as a function of poisoning budget. Across all datasets, degradation remains minimal and non-monotonic, with worst-case slowdown below 1.03× view at source ↗
Figure 2
Figure 2. Figure 2: Absolute lookup performance under static poisoning. Variations across poisoning levels are small and largely attributable to measurement noise rather than systematic degradation. (a) Black-box ACA (b) White-box ACA view at source ↗
Figure 3
Figure 3. Figure 3: Fine-grained slowdown over insertion rounds for black-box and white-box ACA. Both attacks induce substantial degradation, with peak slowdowns up to 2.7×–2.8×. Jagged trajectories reflect discrete structural events such as node splits and retraining, as well as per-batch variability in insertion locality. 4.2 Dynamic Adversarial Workloads We next evaluate algorithmic complexity attacks (ACA), where adversar… view at source ↗
Figure 4
Figure 4. Figure 4: Empirical CDF of inserted-key positions in clean￾key bins. Datasets where inserted keys occupy a larger frac￾tion of clean-rank bins exhibit higher degradation. insertions accumulate and often stabilizes at an elevated plateau. 4.3 Black-box vs. White-box Attacks We compare black-box and white-box variants of ACA to evaluate the role of structural knowledge. Peak degradation is slightly higher for white-bo… view at source ↗
Figure 5
Figure 5. Figure 5: Static poisoning key distributions across budgets 𝛼 ∈ {0.05, 0.10, 0.15, 0.20}. Adversarial keys concentrate near distribution tails but do not induce meaningful lookup degradation view at source ↗
Figure 6
Figure 6. Figure 6: Black-box ACA slowdown across insertion rounds. Degradation increases over time, peaking on Lognormal and Wiki TS, while Facebook remains comparatively stable. 7 view at source ↗
Figure 7
Figure 7. Figure 7: White-box ACA slowdown across insertion rounds. Peak degradation is comparable to black-box but not consistently higher across datasets. (a) Black-box ACA adversarial inserts in value space. Despite apparent spread, many keys occupy a narrow band of clean-rank bins (see view at source ↗
Figure 8
Figure 8. Figure 8: Adversarial key distributions in value space. 8 view at source ↗
Figure 9
Figure 9. Figure 9: Real-key control distributions in value space. 9 view at source ↗
read the original abstract

Learned index structures achieve high performance by modeling the cumulative distribution function (CDF) of keys, but this reliance on data distributions introduces potential vulnerability to adversarial manipulation. Prior work has explored both static data poisoning and dynamic algorithmic complexity attacks (ACA), though evaluations are typically limited in scale or consider only one threat model. We present a systematic study of both attack paradigms on ALEX, a state-of-the-art dynamic learned index, under a unified and reproducible framework. Our evaluation scales to realistic workloads with up to 200K adversarial inserts and includes multiple SOSD datasets with diverse key distributions, as well as a real-key baseline to isolate adversarial effects. Our results show a clear separation between threat models. Static poisoning has minimal impact on lookup performance in ALEX under bulk-loaded settings, while dynamic ACA induces substantial degradation, with up to 2--2.8x slowdown in lookup throughput. However, attack effectiveness is highly dataset-dependent: dense key distributions limit adversarial leverage due to duplicate-heavy insertions and ALEX's localized structure. We highlight key evaluation considerations, including the need for control workloads and the mismatch between localized structural damage and global query metrics. These results show that robustness in learned indexes depends critically on the interaction between threat model, data distribution, and evaluation methodology.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper presents a systematic empirical study of static data poisoning and dynamic algorithmic complexity attacks (ACA) on the ALEX learned index structure. Using up to 200K adversarial inserts across multiple SOSD datasets with diverse key distributions plus a real-key baseline, it reports that static poisoning has minimal impact on lookup performance under bulk-loaded settings, while dynamic ACA induces up to 2--2.8x slowdown in lookup throughput; effectiveness is highly dataset-dependent due to dense distributions and ALEX's localized structure. The work also emphasizes evaluation considerations such as control workloads and the mismatch between localized damage and global metrics.

Significance. If the quantitative separation between threat models holds under the reported controls, this work is significant for the security of learned indexes: it provides concrete evidence that dynamic ACA poses a more serious threat than static poisoning in bulk-loaded scenarios, while underscoring the role of data distribution and evaluation methodology. Strengths include the reproducible framework, scale of experiments, multiple datasets, and explicit qualifications on generalizability.

major comments (2)
  1. [Evaluation] Evaluation section: the central claim of a 'clear separation' between static (minimal impact) and dynamic (2--2.8x slowdown) attacks rests on bulk-loaded settings and the chosen SOSD datasets; the manuscript should add an explicit ablation or discussion of how results change under incremental (non-bulk) insertions, as this directly tests whether the separation generalizes to fully dynamic workloads.
  2. [Results] Results: while dataset dependence is noted, the paper should include a per-dataset breakdown (e.g., table or figure) of the exact slowdown factors and the fraction of queries hitting poisoned regions for the dynamic ACA case; without this, the 'up to 2--2.8x' aggregate risks overstating uniformity across distributions.
minor comments (2)
  1. [Abstract] Abstract: the notation '2--2.8x' mixes en-dash and hyphen; standardize numerical ranges throughout the text and figures for consistency.
  2. [Experimental Setup] The description of control workloads is mentioned but could be expanded with a short paragraph or pseudocode in the experimental setup to clarify how adversarial effects are isolated from baseline performance.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed comments on our manuscript. We address each major comment point-by-point below, indicating the revisions we will make to strengthen the paper.

read point-by-point responses
  1. Referee: [Evaluation] Evaluation section: the central claim of a 'clear separation' between static (minimal impact) and dynamic (2--2.8x slowdown) attacks rests on bulk-loaded settings and the chosen SOSD datasets; the manuscript should add an explicit ablation or discussion of how results change under incremental (non-bulk) insertions, as this directly tests whether the separation generalizes to fully dynamic workloads.

    Authors: We agree that an explicit discussion of incremental insertions would help clarify the scope of our claims. Our study deliberately focuses on bulk-loaded settings because this is the standard initialization procedure for learned indexes such as ALEX, and it is the regime in which static poisoning attacks are most directly applicable. Dynamic ACA attacks are evaluated through adversarial inserts, but the overall performance comparison is anchored to the bulk-loaded baseline to isolate the effects of each threat model. We will add a paragraph in the Evaluation section explaining this design choice and noting that fully incremental workloads (with continuous model retraining) constitute an important avenue for future work, as they could change the attack surface. We will not introduce new experiments in this revision, as the current results already demonstrate the separation under the common bulk-loaded case. revision: partial

  2. Referee: [Results] Results: while dataset dependence is noted, the paper should include a per-dataset breakdown (e.g., table or figure) of the exact slowdown factors and the fraction of queries hitting poisoned regions for the dynamic ACA case; without this, the 'up to 2--2.8x' aggregate risks overstating uniformity across distributions.

    Authors: We accept this suggestion and will improve transparency. We will add a new table (or extended figure caption) in the Results section that provides a per-dataset breakdown of the exact lookup slowdown factors observed under dynamic ACA, together with the fraction of queries that hit poisoned regions (computed from our existing experimental logs). This will make the dataset dependence explicit and prevent any misinterpretation of the aggregate 'up to 2--2.8x' figure as uniform across all distributions. revision: yes

Circularity Check

0 steps flagged

No significant circularity identified

full rationale

The paper is a purely empirical study reporting experimental measurements of lookup throughput under static poisoning and dynamic ACA attacks on ALEX, using SOSD datasets and a real-key baseline. No derivations, equations, fitted parameters, or predictions are present that could reduce to inputs by construction. Central claims (separation between threat models, up to 2-2.8x slowdown, dataset dependence) are direct experimental outputs with explicit qualifications on representativeness and methodology. No self-citation load-bearing steps, ansatz smuggling, or renaming of known results occur. The evaluation is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claims rest on empirical observations under standard security threat models rather than new theoretical constructs, fitted parameters, or invented entities.

axioms (2)
  • domain assumption Adversaries can perform up to 200K targeted inserts under the defined static and dynamic threat models.
    Standard assumption in adversarial database security research invoked to bound the attack scale.
  • domain assumption The SOSD datasets and real-key baseline adequately represent practical key distributions for measuring attack impact.
    Invoked when interpreting dataset-dependent results and generalizability.

pith-pipeline@v0.9.0 · 5518 in / 1402 out tokens · 39387 ms · 2026-05-08T02:28:54.138389+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

15 extracted references · 9 canonical work pages

  1. [1]

    Abdullah Al-Mamun, Hao Wu, Qiyang He, Jianguo Wang, and Walid G. Aref. 2025. A Survey of Learned Indexes for the Multi-dimensional Space.ACM Comput. Surv.58, 4, Article 96 (Oct. 2025), 37 pages. doi:10.1145/3768575

  2. [2]

    Matthias Bachfischer, Renata Borovica-Gajic, and Benjamin I. P. Ru- binstein. 2022. Testing the Robustness of Learned Index Structures. arXiv:2207.11575 [cs.DB]https://arxiv.org/abs/2207.11575

  3. [3]

    Jialin Ding, Umar Farooq Minhas, Jia Yu, Chi Wang, Jaeyoung Do, Yinan Li, Hantian Zhang, Badrish Chandramouli, Johannes Gehrke, Donald Kossmann, David Lomet, and Tim Kraska. 2020. ALEX: An Updatable Adaptive Learned Index. InProceedings of the 2020 ACM SIG- MOD International Conference on Management of Data (SIGMOD/PODS ’20). ACM, 969–984. doi:10.1145/331...

  4. [4]

    Paolo Ferragina and Giorgio Vinciguerra. 2020. The PGM-index: a fully- dynamic compressed learned index with provable worst-case bounds. Proceedings of the VLDB Endowment13, 8 (April 2020), 1162–1175. doi:10.14778/3389133.3389135

  5. [5]

    Google. 2024. Abseil: An Open-Source Collection of C++ Libraries. https://abseil.io. Accessed: 2026-03-13

  6. [6]

    Google. 2024. Google Benchmark: A Microbenchmark Support Library. https://github.com/google/benchmark. Accessed: 2026-03-13

  7. [7]

    Andreas Kipf, Ryan Marcus, Alexander van Renen, Mihail Stoian, Alfons Kemper, Tim Kraska, and Thomas Neumann. 2019. SOSD: A Benchmark for Learned Indexes.NeurIPS Workshop on Machine Learning for Systems(2019)

  8. [8]

    Kornaropoulos, Silei Ren, and Roberto Tamassia

    Evgenios M. Kornaropoulos, Silei Ren, and Roberto Tamassia. 2022. The Price of Tailoring the Index to Your Data: Poisoning Attacks on Learned Index Structures. InProceedings of the 2022 International Con- ference on Management of Data(Philadelphia, PA, USA)(SIGMOD ’22). Association for Computing Machinery, New York, NY, USA, 1331–1344. doi:10.1145/3514221.3517867

  9. [9]

    Chi, Jeff Dean, and Neoklis Polyzotis

    Tim Kraska, Alex Beutel, Ed H. Chi, Jeff Dean, and Neoklis Polyzotis

  10. [10]

    The Case for Learned Index Structures.https://arxiv.org/abs/ 1712.01208

  11. [11]

    Pengfei Li, Yu Hua, Jingnan Jia, and Pengfei Zuo. 2021. FINEdex: a fine- grained learned index scheme for scalable and concurrent memory systems.Proc. VLDB Endow.15, 2 (Oct. 2021), 321–334. doi:10.14778/ 3489496.3489512

  12. [12]

    Ryan Marcus, Andreas Kipf, Alexander van Renen, Mihail Stoian, Sanchit Misra, Alfons Kemper, Thomas Neumann, and Tim Kraska

  13. [13]

    VLDB Endow.14, 1 (2020), 1–13

    Benchmarking Learned Indexes.Proc. VLDB Endow.14, 1 (2020), 1–13

  14. [14]

    Atsuki Sato, Martin Aumüller, and Yusuke Matsui. 2026. Math- ematical Foundations of Poisoning Attacks on Linear Regression over Cumulative Distribution Functions. arXiv:2603.00537 [cs.LG] https://arxiv.org/abs/2603.00537

  15. [15]

    Kornaropoulos, and Yue Cheng

    Rui Yang, Evgenios M. Kornaropoulos, and Yue Cheng. 2024. Algorithmic Complexity Attacks on Dynamic Learned Indexes. arXiv:2403.12433 [cs.DB]https://arxiv.org/abs/2403.12433 A Additional Visualizations Figures below provide value-space CDF plots for both at- tack variants and their real-key controls, complementing the clean-rank bin coverage analysis in F...