Polars inside Intel SGX2 Enclaves: An Empirical Study of Confidential Analytical Query Processing
Pith reviewed 2026-05-22 08:33 UTC · model grok-4.3
The pith
Polars DataFrames run inside SGX2 enclaves with roughly 1.5× overhead on TPC-H workloads, but data loading and lazy API choices dominate the total cost.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Running Polars inside SGX2 via Gramine on TPC-H SF30 yields an end-to-end overhead of 1.49-1.56× that remains nearly constant across four dataset-width configurations. Query-only overhead declines from 1.51-1.52× to 1.43-1.44×, while table-loading overhead increases from 2.27× to 4.07×. Median per-query slowdown is 1.45× with a maximum of 2.57×, and some queries exhibit run-to-run spikes attributed to EPC pressure. Lazy API execution is 2.25-2.27× faster overall than eager execution, which fails with out-of-memory errors at 41 GB and above.
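A near-constant end-to-end ratio can coexist with diverging component ratios. The sketch below assumes a simplified additive model in which end-to-end time is load time plus query time; the paper's composite is a TPC-H power score, so this model is an illustration rather than the authors' method. Under that assumption the end-to-end overhead is a weighted average of the loading and query overheads, weighted by the native time shares. The function names and all timings are invented placeholders, not measurements from the paper.

```python
def overhead(native_s: float, sgx_s: float) -> float:
    """Slowdown factor: SGX time divided by native time."""
    return sgx_s / native_s


def decompose(load_native, load_sgx, query_native, query_sgx):
    """Return load, query, and end-to-end overheads under an additive model."""
    total_native = load_native + query_native
    total_sgx = load_sgx + query_sgx
    load_share = load_native / total_native  # share of native time spent loading
    o_load = overhead(load_native, load_sgx)
    o_query = overhead(query_native, query_sgx)
    o_total = overhead(total_native, total_sgx)
    # Identity: o_total == load_share * o_load + (1 - load_share) * o_query
    weighted = load_share * o_load + (1 - load_share) * o_query
    return {"load": o_load, "query": o_query,
            "end_to_end": o_total, "weighted": weighted}


# Hypothetical timings in seconds (placeholders, NOT from the paper).
narrow = decompose(load_native=10.0, load_sgx=20.0,
                   query_native=190.0, query_sgx=285.0)
wide = decompose(load_native=5.0, load_sgx=20.0,
                 query_native=195.0, query_sgx=280.0)
for name, result in (("narrow", narrow), ("wide", wide)):
    print(name, {k: round(v, 3) for k, v in result.items()})
```

In these made-up numbers the loading overhead doubles and the query overhead falls, yet the end-to-end ratio barely moves, because loading accounts for only a small share of native time.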
What carries the argument
Separation of query-only runtime from full end-to-end runtime (including table loading from Azure Blob Storage) inside a Gramine-protected SGX2 enclave.
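The separation described above can be sketched as a small timing harness. This is a minimal illustration, not the authors' instrumentation: `time_phases`, `fake_load`, and `fake_queries` are hypothetical names, and the stand-in workload replaces the real steps of loading tables from object storage and running the TPC-H queries in Polars.

```python
import time
from typing import Callable, Dict


def time_phases(load_tables: Callable[[], object],
                run_queries: Callable[[object], object]) -> Dict[str, float]:
    """Time the loading phase and the query phase separately."""
    t0 = time.perf_counter()
    tables = load_tables()   # e.g. fetch and decode tables from storage
    t1 = time.perf_counter()
    run_queries(tables)      # e.g. execute the benchmark queries
    t2 = time.perf_counter()
    return {"load_s": t1 - t0, "query_s": t2 - t1, "end_to_end_s": t2 - t0}


# Stand-in workload (hypothetical); a real run would load data and run queries.
def fake_load():
    return list(range(100_000))


def fake_queries(tables):
    return sum(x * x for x in tables)


timings = time_phases(fake_load, fake_queries)
print(timings)
```

Running the same harness once natively and once inside the enclave gives the per-phase times from which query-only and table-loading overheads can be computed as ratios.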
Load-bearing premise
The overhead numbers measured on the chosen Gramine SGX2 configuration with TPC-H SF30 data on Azure Blob Storage will hold for other storage setups, dataset shapes, and enclave implementations without unmeasured artifacts.
What would settle it
Re-running the identical Polars queries and loading steps on the same hardware outside any enclave and observing whether the reported 1.43-1.56× ratios reappear or shift substantially.
Original abstract
Trusted Execution Environments (TEEs) have renewed interest in confidential analytics, but most prior evaluations focus on SQL database engines or earlier SGX generations. This paper studies an Arrow-native DataFrame engine, Polars, running inside Intel SGX2 enclaves via Gramine on TPC-H SF30 with Azure Blob Storage. We report both the standard TPC-H power score and a query-only variant that removes table-loading time in order to separate compute overhead from data-ingestion overhead. Across four dataset-width configurations (approximately 22-73 GB), end-to-end overhead remains nearly constant at 1.49-1.56×, but this composite metric obscures two distinct behaviors: query-only overhead declines from 1.51-1.52× to 1.43-1.44×, whereas table-loading overhead rises from 2.27× to 4.07×. We further show that overhead is not uniform across queries: for the len130 configuration, the median per-query SGX slowdown is 1.45× with a maximum of 2.57×, and a small set of queries exhibits pronounced run-to-run spikes consistent with stateful EPC pressure. Finally, we compare Polars' lazy and eager APIs under the same TEE setting. Lazy execution is 2.25-2.27× faster overall, while eager execution fails with out-of-memory errors at 41 GB and above. Relative to the recent DuckDB-SGX2 study, our results suggest that SGX2 can support Arrow-native analytical processing with a similar order of security overhead, but that load-path amplification and API-level optimization are first-order determinants of end-to-end performance.
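The per-query statistics quoted in the abstract (a median and a maximum slowdown) can be computed from paired timings as in the following sketch. The helper name `per_query_slowdown` and every timing value are hypothetical placeholders chosen for easy arithmetic; they are not results from the paper.

```python
from statistics import median


def per_query_slowdown(native_s: dict, sgx_s: dict) -> dict:
    """Map each query name to its SGX/native slowdown factor."""
    return {q: sgx_s[q] / native_s[q] for q in native_s}


# Hypothetical per-query times in seconds (placeholders, NOT from the paper).
native = {"Q1": 4.0, "Q2": 1.0, "Q3": 2.0, "Q4": 5.0, "Q5": 8.0}
sgx = {"Q1": 6.0, "Q2": 1.25, "Q3": 5.0, "Q4": 7.0, "Q5": 12.0}

slowdowns = per_query_slowdown(native, sgx)
median_slowdown = median(slowdowns.values())
worst_query = max(slowdowns, key=slowdowns.get)
print(f"median {median_slowdown:.2f}x, "
      f"max {slowdowns[worst_query]:.2f}x on {worst_query}")
# prints "median 1.50x, max 2.50x on Q3"
```

Reporting the median alongside the maximum shows whether a composite figure is driven by a few outlier queries or by a uniform slowdown.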
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript empirically evaluates the performance of the Polars DataFrame library when executed inside Intel SGX2 trusted execution environments using the Gramine library. Experiments are conducted on the TPC-H benchmark at scale factor 30, with data stored in Azure Blob Storage, across four dataset width configurations ranging from approximately 22 to 73 GB. The authors report end-to-end overhead factors of 1.49-1.56× compared to native execution, with query-only overheads of 1.43-1.52× and loading overheads increasing from 2.27× to 4.07×. They also present per-query slowdown statistics, noting a median of 1.45× and max of 2.57× for one configuration, and compare lazy and eager execution APIs, finding lazy to be substantially faster and more memory-efficient.
Significance. If the measurements are robust, the work supplies concrete data on the viability of Arrow-native analytical processing inside SGX2 enclaves, extending prior SQL-centric TEE studies. By separating loading and query times and contrasting APIs, it identifies load-path amplification and execution strategy as practically significant factors, offering guidance for confidential analytics deployments.
major comments (2)
- The experimental description (abstract and results sections) provides limited detail on the number of runs per query, variance handling, outlier exclusion criteria, or statistical tests. Given the reported run-to-run spikes attributed to EPC pressure, this reduces confidence in the precision of the median 1.45× slowdown, the 1.49-1.56× end-to-end range, and the claim of nearly constant overhead across widths.
- The central claim that SGX2 supports Arrow-native processing with overheads of similar order to prior SQL studies, and that load-path amplification is a first-order determinant, rests on measurements taken exclusively under the Gramine runtime with Azure Blob Storage. While query-only timing and EPC-pressure observations are reported, the design does not isolate enclave-specific costs (EPC paging, attestation) from runtime syscall emulation or network I/O contributions, weakening the attribution and the comparison to DuckDB-SGX2.
Simulated Author's Rebuttal
We thank the referee for the constructive comments, which help clarify the experimental methodology and scope of our claims. We address each major comment below.
-
Referee: The experimental description (abstract and results sections) provides limited detail on the number of runs per query, variance handling, outlier exclusion criteria, or statistical tests. Given the reported run-to-run spikes attributed to EPC pressure, this reduces confidence in the precision of the median 1.45× slowdown, the 1.49-1.56× end-to-end range, and the claim of nearly constant overhead across widths.
Authors: We agree that additional methodological detail is warranted. In the revised manuscript we will expand the experimental setup to state that each query was run five times and that medians are reported; we will explicitly note the observed run-to-run spikes and their attribution to EPC pressure; and we will clarify that no formal statistical hypothesis tests were performed because the study is descriptive rather than inferential. These additions will directly address the concern about precision of the reported ranges and median. revision: yes
-
Referee: The central claim that SGX2 supports Arrow-native processing with overheads of similar order to prior SQL studies, and that load-path amplification is a first-order determinant, rests on measurements taken exclusively under the Gramine runtime with Azure Blob Storage. While query-only timing and EPC-pressure observations are reported, the design does not isolate enclave-specific costs (EPC paging, attestation) from runtime syscall emulation or network I/O contributions, weakening the attribution and the comparison to DuckDB-SGX2.
Authors: We acknowledge that the evaluation is performed under Gramine with Azure Blob Storage and that we do not provide a fine-grained decomposition separating pure enclave costs from runtime emulation and network effects. The query-only versus loading separation is intended to surface load-path amplification as a practically relevant factor rather than to claim complete isolation. We will add an explicit limitations paragraph noting this scope and will qualify the DuckDB-SGX2 comparison as one of overall observed overhead magnitude under comparable TEE runtimes rather than a controlled component-wise attribution. The core empirical observation of roughly 1.5× end-to-end overhead remains supported by the data. revision: partial
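The reporting scheme described in the first response (five runs per query, medians reported, spikes noted) can be sketched as follows. The function name `summarize_runs`, the spike threshold of 1.5 times the median, and the timings are all assumptions made for illustration; the rebuttal does not specify how spikes are identified.

```python
from statistics import median


def summarize_runs(runs_s, spike_factor=1.5):
    """Return the median of repeated runs and any runs flagged as spikes.

    A run counts as a spike when it exceeds spike_factor times the median;
    the 1.5 default is an arbitrary illustrative choice.
    """
    mid = median(runs_s)
    spikes = [t for t in runs_s if t > spike_factor * mid]
    return {"median_s": mid, "spikes_s": spikes,
            "max_over_median": max(runs_s) / mid}


# Hypothetical timings for one query over five runs (NOT from the paper).
runs = [2.0, 2.1, 1.9, 4.4, 2.0]
summary = summarize_runs(runs)
print(summary)
```

The median is insensitive to the single slow run, which is why listing the flagged runs or the max-to-median ratio alongside it keeps run-to-run spikes visible.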
Circularity Check
No circularity: pure empirical benchmarking with direct measurements
This is an empirical study reporting observed timing measurements from running Polars inside SGX2 enclaves via Gramine on TPC-H SF30 datasets under Azure Blob Storage. All overhead figures (1.49-1.56× end-to-end, query-only variants, loading overheads, per-query medians) are obtained from direct instrumentation on the stated hardware/software stack across four dataset-width configurations. No derivations, equations, fitted parameters, predictions, or first-principles results exist that could reduce to inputs by construction. The comparison to the DuckDB-SGX2 study is external benchmarking, not a self-citation chain or ansatz. The central claims about load-path amplification and API choice follow from the measured data rather than being presupposed. The work is self-contained against external benchmarks with no load-bearing self-referential steps.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption TPC-H benchmark queries and the chosen scale factor 30 with four dataset widths are representative of real analytical workloads for measuring TEE overhead.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean: reality_from_one_distinction (tag: unclear)
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: "We report both the standard TPC-H power score and a query-only variant... end-to-end overhead remains nearly constant at 1.49–1.56×"
- IndisputableMonolith/Foundation/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel (tag: unclear)
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: "DuckDB-SGX2... identifies EPC paging as the dominant source"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [1] Apache Arrow contributors. Apache Arrow documentation. https://arrow.apache.org/, accessed 2026
- [2] Microsoft Azure. DCsv3 size series - Azure virtual machines. https://learn.microsoft.com/en-us/azure/virtual-machines/sizes/general-purpose/dcsv3-series, accessed 2026
- [3] Ilaria Battiston, Lotte Felius, Sam Ansmink, Laurens Kuiper, and Peter Boncz. DuckDB-SGX2: The good, the bad and the ugly within confidential analytical query processing. In Proceedings of the 20th International Workshop on Data Management on New Hardware (DaMoN), 2024
- [4] Peter Boncz, Thomas Neumann, and Orri Erling. TPC-H analyzed: Hidden messages and lessons learned from an influential benchmark. In Technology Conference on Performance Evaluation and Benchmarking (TPCTC), 2013
- [5] Victor Costan and Srinivas Devadas. Intel SGX explained. IACR Cryptology ePrint Archive, Report 2016/086, 2016
- [6] Muhammad El-Hindi, Tobias Ziegler, Matthias Heinrich, Adrian Lutsch, Zheguang Zhao, and Carsten Binnig. Benchmarking the second generation of Intel SGX hardware. In Proceedings of the 18th International Workshop on Data Management on New Hardware (DaMoN), 2022
- [7] Benny Fuhry, Jayanth Jain H. A., and Florian Kerschbaum. EncDBDB: Searchable encrypted, fast, compressed, in-memory database using enclaves. In Proceedings of the 51st Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), 2021
- [8] Gramine contributors. Gramine documentation. https://gramine.readthedocs.io/, accessed 2026
- [9] Alexey Gribov, Dhinakaran Vinayagamurthy, and Sergey Gorbunov. StealthDB: A scalable encrypted database with full SQL query support. Proceedings on Privacy Enhancing Technologies, 2019(3), 2019
- [10] Andrew Lamb, Yijie Shen, Daniël Heres, Jayjeet Chakraborty, Mehmet Ozan Kabak, Liang-Chi Hsieh, and Chao Sun. Apache Arrow DataFusion: A fast, embeddable, modular analytic query engine. In Companion of the 2024 International Conference on Management of Data (SIGMOD Companion), 2024
- [11] Apache Parquet contributors. Parquet modular encryption. https://parquet.apache.org/docs/file-format/data-pages/encryption/, accessed 2026
- [12] Polars contributors. Polars user guide. https://docs.pola.rs/, accessed 2026
- [13] Polars contributors. Lazy API. https://docs.pola.rs/user-guide/concepts/lazy-api/, accessed 2026
- [14] Polars contributors. Optimizations. https://docs.pola.rs/user-guide/lazy/optimizations/, accessed 2026
- [15] Christian Priebe, Kapil Vaswani, and Manuel Costa. EnclaveDB: A secure database using SGX. In IEEE Symposium on Security and Privacy (S&P), 2018
- [16] Intel Corporation. Intel Software Guard Extensions (Intel SGX) support for dynamic memory management inside an enclave. https://www.intel.com/content/www/us/en/content-details/671178/, 2016
- [17] Transaction Processing Performance Council. TPC-H benchmark specification, revision 3.0.1. https://www.tpc.org/tpc_documents_current_versions/pdf/tpc-h_v3.0.1.pdf, 2021
- [18] Chia-Che Tsai, Donald E. Porter, and Mona Vij. Graphene-SGX: A practical library OS for unmodified applications on SGX. In USENIX Annual Technical Conference (ATC), 2017
- [19] Wenting Zheng, Ankur Dave, Jethro G. Beekman, Raluca Ada Popa, Joseph E. Gonzalez, and Ion Stoica. Opaque: An oblivious and encrypted distributed analytics platform. In USENIX Symposium on Networked Systems Design and Implementation (NSDI), 2017