
arxiv: 2605.21797 · v1 · pith:C73LAEDBnew · submitted 2026-05-20 · 💻 cs.CR · cs.DB

Polars inside Intel SGX2 Enclaves: An Empirical Study of Confidential Analytical Query Processing

Pith reviewed 2026-05-22 08:33 UTC · model grok-4.3

classification 💻 cs.CR cs.DB
keywords SGX2 · Polars · Trusted Execution Environments · confidential analytics · TPC-H · Arrow DataFrames · lazy evaluation · Gramine

The pith

Polars DataFrames run inside SGX2 enclaves with roughly 1.5x overhead on TPC-H workloads, but data loading and lazy API choices dominate the total cost.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper measures Polars, an Arrow-native DataFrame engine, running inside Intel SGX2 enclaves via Gramine. It separates query execution time from table-loading time on TPC-H SF30 datasets stored on Azure Blob Storage. End-to-end slowdown stays near 1.5x across dataset sizes, but that figure hides two opposing trends: query-only slowdown drops slightly (from about 1.51x to about 1.43x) while table-loading overhead rises sharply (from 2.27x to 4.07x) as tables get wider. Lazy execution is 2.25-2.27x faster overall than eager execution, which runs out of memory at 41 GB and above. The work concludes that SGX2 can host such analytical processing at overhead levels comparable to earlier SQL studies, provided the data path and execution API are chosen carefully.

Core claim

Running Polars inside SGX2 via Gramine on TPC-H SF30 yields an end-to-end overhead of 1.49-1.56x that remains nearly constant across four dataset-width configurations. Query-only overhead declines from 1.51-1.52x to 1.43-1.44x, while table-loading overhead increases from 2.27x to 4.07x. Median per-query slowdown is 1.45x with a maximum of 2.57x, and some queries exhibit run-to-run spikes attributed to EPC pressure. Lazy API execution is 2.25-2.27x faster overall than eager execution, which fails with out-of-memory errors at 41 GB and above.
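
As a rough illustration of how per-query figures such as "median 1.45x, maximum 2.57x" can be derived, the sketch below divides enclave time by native time for each query and summarizes the ratios. The query labels and timings are invented for the example and are not taken from the paper; the paper's exact aggregation procedure is not described here, so this is only one plausible reading.

```python
from statistics import median

# Hypothetical wall-clock times in seconds (NOT measurements from the paper).
native_s = {"q_a": 4.0, "q_b": 2.0, "q_c": 1.0, "q_d": 5.0, "q_e": 8.0}
enclave_s = {"q_a": 6.0, "q_b": 5.0, "q_c": 1.4, "q_d": 7.0, "q_e": 12.0}


def slowdowns(native, enclave):
    """Per-query slowdown: enclave time divided by native time."""
    return {name: enclave[name] / native[name] for name in native}


ratios = slowdowns(native_s, enclave_s)
median_slowdown = median(ratios.values())
worst_query = max(ratios, key=ratios.get)

print(f"median slowdown: {median_slowdown:.2f}x")
print(f"worst query: {worst_query} at {ratios[worst_query]:.2f}x")
```

Reporting the median alongside the maximum keeps a single slow query from dominating the summary while still making it visible.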

What carries the argument

Separation of query-only runtime from full end-to-end runtime (including table loading from Azure Blob Storage) inside a Gramine-protected SGX2 enclave.
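
A minimal sketch of that separation, assuming a harness that times the loading step and each query independently. The names `timed`, `run_benchmark`, `load_tables`, and `queries` are hypothetical, the stand-in loader and queries use plain Python lists rather than Polars or Azure Blob Storage, and the totals are simple sums rather than the TPC-H power score the paper reports.

```python
import time


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def run_benchmark(load_tables, queries):
    """Time table loading once, then each query against the loaded tables."""
    tables, load_s = timed(load_tables)
    per_query_s = {}
    for name, query in queries.items():
        _, per_query_s[name] = timed(query, tables)
    query_only_s = sum(per_query_s.values())
    return {
        "load_s": load_s,
        "per_query_s": per_query_s,
        "query_only_s": query_only_s,
        "end_to_end_s": load_s + query_only_s,
    }


# Stand-ins for illustration only: a real run would read the TPC-H tables
# from storage and execute the benchmark queries.
def load_tables():
    return {"lineitem": list(range(100_000))}


queries = {
    "sum": lambda tables: sum(tables["lineitem"]),
    "max": lambda tables: max(tables["lineitem"]),
}

report = run_benchmark(load_tables, queries)
print(report)
```

Running the same harness once natively and once inside the enclave gives separate load, query-only, and end-to-end ratios instead of a single composite number.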

Load-bearing premise

The overhead numbers measured on the chosen Gramine SGX2 configuration with TPC-H SF30 data on Azure Blob Storage will hold for other storage setups, dataset shapes, and enclave implementations without unmeasured artifacts.

What would settle it

Re-running the identical Polars queries and loading steps on the same hardware outside any enclave and observing whether the reported 1.43-1.56x ratios reappear or shift substantially.

Figures

Figures reproduced from arXiv: 2605.21797 by Burns Smith, Kenny Leftin, Wei Wang.

Figure 1: TPC-H power scores for the Polars lazy API across four dataset widths. The aggregate overhead is …
Figure 2: Table-loading time versus dataset size. The …
Figure 3: Per-query SGX overhead for len130. Most queries cluster around moderate slowdown, while Q04 is a …
Figure 4: Per-query eager/lazy slowdown for len80 on a …
Original abstract

Trusted Execution Environments (TEEs) have renewed interest in confidential analytics, but most prior evaluations focus on SQL database engines or earlier SGX generations. This paper studies an Arrow-native DataFrame engine, Polars, running inside Intel SGX2 enclaves via Gramine on TPC-H SF30 with Azure Blob Storage. We report both the standard TPC-H power score and a query-only variant that removes table-loading time in order to separate compute overhead from data-ingestion overhead. Across four dataset-width configurations (approximately 22-73 GB), end-to-end overhead remains nearly constant at 1.49-1.56×, but this composite metric obscures two distinct behaviors: query-only overhead declines from 1.51-1.52× to 1.43-1.44×, whereas table-loading overhead rises from 2.27× to 4.07×. We further show that overhead is not uniform across queries: for the len130 configuration, the median per-query SGX slowdown is 1.45× with a maximum of 2.57×, and a small set of queries exhibits pronounced run-to-run spikes consistent with stateful EPC pressure. Finally, we compare Polars' lazy and eager APIs under the same TEE setting. Lazy execution is 2.25-2.27× faster overall, while eager execution fails with out-of-memory errors at 41 GB and above. Relative to the recent DuckDB-SGX2 study, our results suggest that SGX2 can support Arrow-native analytical processing with a similar order of security overhead, but that load-path amplification and API-level optimization are first-order determinants of end-to-end performance.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The manuscript empirically evaluates the performance of the Polars DataFrame library when executed inside Intel SGX2 trusted execution environments using the Gramine library. Experiments are conducted on the TPC-H benchmark at scale factor 30, with data stored in Azure Blob Storage, across four dataset width configurations ranging from approximately 22 to 73 GB. The authors report end-to-end overhead factors of 1.49-1.56× compared to native execution, with query-only overheads of 1.43-1.52× and loading overheads increasing from 2.27× to 4.07×. They also present per-query slowdown statistics, noting a median of 1.45× and max of 2.57× for one configuration, and compare lazy and eager execution APIs, finding lazy to be substantially faster and more memory-efficient.

Significance. If the measurements are robust, the work supplies concrete data on the viability of Arrow-native analytical processing inside SGX2 enclaves, extending prior SQL-centric TEE studies. By separating loading and query times and contrasting APIs, it identifies load-path amplification and execution strategy as practically significant factors, offering guidance for confidential analytics deployments.

major comments (2)
  1. The experimental description (abstract and results sections) provides limited detail on the number of runs per query, variance handling, outlier exclusion criteria, or statistical tests. Given the reported run-to-run spikes attributed to EPC pressure, this reduces confidence in the precision of the median 1.45× slowdown, the 1.49-1.56× end-to-end range, and the claim of nearly constant overhead across widths.
  2. The central claim that SGX2 supports Arrow-native processing with overheads of similar order to prior SQL studies, and that load-path amplification is a first-order determinant, rests on measurements taken exclusively under the Gramine runtime with Azure Blob Storage. While query-only timing and EPC-pressure observations are reported, the design does not isolate enclave-specific costs (EPC paging, attestation) from runtime syscall emulation or network I/O contributions, weakening the attribution and the comparison to DuckDB-SGX2.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which help clarify the experimental methodology and scope of our claims. We address each major comment below.

Point-by-point responses
  1. Referee: The experimental description (abstract and results sections) provides limited detail on the number of runs per query, variance handling, outlier exclusion criteria, or statistical tests. Given the reported run-to-run spikes attributed to EPC pressure, this reduces confidence in the precision of the median 1.45× slowdown, the 1.49-1.56× end-to-end range, and the claim of nearly constant overhead across widths.

    Authors: We agree that additional methodological detail is warranted. In the revised manuscript we will expand the experimental setup to state that each query was run five times and that medians are reported; we will explicitly note the observed run-to-run spikes and their attribution to EPC pressure; and we will clarify that no formal statistical hypothesis tests were performed because the study is descriptive rather than inferential. These additions will directly address the concern about precision of the reported ranges and median. revision: yes

  2. Referee: The central claim that SGX2 supports Arrow-native processing with overheads of similar order to prior SQL studies, and that load-path amplification is a first-order determinant, rests on measurements taken exclusively under the Gramine runtime with Azure Blob Storage. While query-only timing and EPC-pressure observations are reported, the design does not isolate enclave-specific costs (EPC paging, attestation) from runtime syscall emulation or network I/O contributions, weakening the attribution and the comparison to DuckDB-SGX2.

    Authors: We acknowledge that the evaluation is performed under Gramine with Azure Blob Storage and that we do not provide a fine-grained decomposition separating pure enclave costs from runtime emulation and network effects. The query-only versus loading separation is intended to surface load-path amplification as a practically relevant factor rather than to claim complete isolation. We will add an explicit limitations paragraph noting this scope and will qualify the DuckDB-SGX2 comparison as one of overall observed overhead magnitude under comparable TEE runtimes rather than a controlled component-wise attribution. The core empirical observation of roughly 1.5× end-to-end overhead remains supported by the data. revision: partial
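
The first response above mentions five runs per query with medians reported, plus run-to-run spikes. A minimal sketch of such a summary follows; the timings are invented, and the 1.5x-of-median spike threshold is an assumption made for this example, not a criterion stated by the authors.

```python
from statistics import median


def summarize_runs(runs_s, spike_factor=1.5):
    """Return the median run time and the indices of runs that exceed
    spike_factor times that median (an illustrative threshold)."""
    med = median(runs_s)
    spikes = [i for i, t in enumerate(runs_s) if t > spike_factor * med]
    return med, spikes


# Hypothetical timings in seconds for five runs of one query (not paper data).
runs = [2.0, 2.1, 6.3, 2.0, 2.2]
med, spikes = summarize_runs(runs)

print(f"median: {med:.2f} s, spiking runs: {spikes}")
```

The median is insensitive to a single spiking run, which is why reporting it alone can hide the spikes; listing the flagged runs next to it keeps both pieces of information.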

Circularity Check

0 steps flagged

No circularity: pure empirical benchmarking with direct measurements

Full rationale

This is an empirical study reporting observed timing measurements from running Polars inside SGX2 enclaves via Gramine on TPC-H SF30 datasets under Azure Blob Storage. All overhead figures (1.49-1.56× end-to-end, query-only variants, loading overheads, per-query medians) are obtained from direct instrumentation on the stated hardware/software stack across four dataset-width configurations. No derivations, equations, fitted parameters, predictions, or first-principles results exist that could reduce to inputs by construction. The comparison to the DuckDB-SGX2 study is external benchmarking, not a self-citation chain or ansatz. The central claims about load-path amplification and API choice follow from the measured data rather than being presupposed. The work is self-contained against external benchmarks with no load-bearing self-referential steps.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

This is an empirical performance study with no mathematical derivations. The central claims rest on the representativeness of TPC-H queries and the fidelity of the Gramine SGX2 environment to production TEE usage.

axioms (1)
  • domain assumption: TPC-H benchmark queries and the chosen scale factor 30 with four dataset widths are representative of real analytical workloads for measuring TEE overhead.
    Invoked when generalizing the reported 1.49-1.56x overhead and per-query variations to broader confidential analytics use cases.




Reference graph

Works this paper leans on

19 extracted references · 19 canonical work pages

  1. Apache Arrow contributors. Apache Arrow documentation. https://arrow.apache.org/, accessed 2026.

  2. Microsoft Azure. DCsv3 size series - Azure virtual machines. https://learn.microsoft.com/en-us/azure/virtual-machines/sizes/general-purpose/dcsv3-series, accessed 2026.

  3. Ilaria Battiston, Lotte Felius, Sam Ansmink, Laurens Kuiper, and Peter Boncz. DuckDB-SGX2: The good, the bad and the ugly within confidential analytical query processing. In Proceedings of the 20th International Workshop on Data Management on New Hardware (DaMoN), 2024.

  4. Peter Boncz, Thomas Neumann, and Orri Erling. TPC-H analyzed: Hidden messages and lessons learned from an influential benchmark. In Technology Conference on Performance Evaluation and Benchmarking (TPCTC), 2013.

  5. Victor Costan and Srinivas Devadas. Intel SGX explained. IACR Cryptology ePrint Archive, Report 2016/086, 2016.

  6. Muhammad El-Hindi, Tobias Ziegler, Matthias Heinrich, Adrian Lutsch, Zheguang Zhao, and Carsten Binnig. Benchmarking the second generation of Intel SGX hardware. In Proceedings of the 18th International Workshop on Data Management on New Hardware (DaMoN), 2022.

  7. Benny Fuhry, Jayanth Jain H. A., and Florian Kerschbaum. EncDBDB: Searchable encrypted, fast, compressed, in-memory database using enclaves. In Proceedings of the 51st Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), 2021.

  8. Gramine contributors. Gramine documentation. https://gramine.readthedocs.io/, accessed 2026.

  9. Alexey Gribov, Dhinakaran Vinayagamurthy, and Sergey Gorbunov. StealthDB: A scalable encrypted database with full SQL query support. Proceedings on Privacy Enhancing Technologies, 2019(3), 2019.

  10. Andrew Lamb, Yijie Shen, Daniël Heres, Jayjeet Chakraborty, Mehmet Ozan Kabak, Liang-Chi Hsieh, and Chao Sun. Apache Arrow DataFusion: A fast, embeddable, modular analytic query engine. In Companion of the 2024 International Conference on Management of Data (SIGMOD Companion), 2024.

  11. Apache Parquet contributors. Parquet modular encryption. https://parquet.apache.org/docs/file-format/data-pages/encryption/, accessed 2026.

  12. Polars contributors. Polars user guide. https://docs.pola.rs/, accessed 2026.

  13. Polars contributors. Lazy API. https://docs.pola.rs/user-guide/concepts/lazy-api/, accessed 2026.

  14. Polars contributors. Optimizations. https://docs.pola.rs/user-guide/lazy/optimizations/, accessed 2026.

  15. Christian Priebe, Kapil Vaswani, and Manuel Costa. EnclaveDB: A secure database using SGX. In IEEE Symposium on Security and Privacy (S&P), 2018.

  16. Intel Corporation. Intel Software Guard Extensions (Intel SGX) support for dynamic memory management inside an enclave. https://www.intel.com/content/www/us/en/content-details/671178/, 2016.

  17. Transaction Processing Performance Council. TPC-H benchmark specification, revision 3.0.1. https://www.tpc.org/tpc_documents_current_versions/pdf/tpc-h_v3.0.1.pdf, 2021.

  18. Chia-Che Tsai, Donald E. Porter, and Mona Vij. Graphene-SGX: A practical library OS for unmodified applications on SGX. In USENIX Annual Technical Conference (ATC), 2017.

  19. Wenting Zheng, Ankur Dave, Jethro G. Beekman, Raluca Ada Popa, Joseph E. Gonzalez, and Ion Stoica. Opaque: An oblivious and encrypted distributed analytics platform. In USENIX Symposium on Networked Systems Design and Implementation (NSDI), 2017.