
arxiv: 2606.17533 · v1 · pith:KN4LUG3Pnew · submitted 2026-06-16 · 💻 cs.CR

SNAS: A Multi-Layer Defense-in-Depth Architecture for Secure Egress in Sandboxed Workloads

Pith reviewed 2026-06-27 00:38 UTC · model grok-4.3

classification 💻 cs.CR
keywords: secure egress · sandboxed workloads · eBPF packet filtering · GENEVE overlay · egress proxies · multi-tenant isolation · bandwidth limiting · cloud network security

The pith

SNAS layers eBPF filtering, GENEVE overlays and egress proxies to control outbound traffic from sandboxed workloads with low overhead.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents SNAS as a production system that lets sandboxed user functions reach external APIs, databases and feature stores without violating isolation between tenants. It achieves this through three combined layers: kernel packet filtering, virtual network encapsulation, and separate proxy servers that apply policies. The authors claim the approach adds low overhead while supporting petabyte-scale data transfer. A reader would care because many cloud data and AI tasks need outside connectivity yet must remain strictly separated from one another.

Core claim

SNAS integrates eBPF packet filtering, GENEVE overlay networks, and distributed egress proxies to achieve policy-driven egress control from sandboxed workloads. It includes an eBPF-based bandwidth limiter that uses the Earliest Departure Time (EDT) algorithm, dual-tier policy enforcement, and safeguards for connection limiting and port-exhaustion prevention.
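As a rough illustration of the Earliest Departure Time idea named in the core claim, the sketch below is a user-space Python simulation, not the paper's eBPF program. The class name `EdtLimiter`, the `horizon_ns` drop threshold, and the rate used in the usage note are assumptions made for illustration. Each packet is stamped with a departure time, and consecutive departures are spaced by `packet length / rate`, so the long-run throughput cannot exceed the configured rate.

```python
from dataclasses import dataclass
from typing import Optional

NSEC_PER_SEC = 1_000_000_000


@dataclass
class EdtLimiter:
    """Hypothetical per-sandbox rate limiter based on departure timestamps."""

    rate_bytes_per_sec: int
    # Assumed drop horizon: packets that would wait longer than this are dropped.
    horizon_ns: int = 2 * NSEC_PER_SEC
    # Earliest time at which the next packet is allowed to leave.
    next_departure_ns: int = 0

    def schedule(self, now_ns: int, pkt_len: int) -> Optional[int]:
        """Return the departure timestamp for a packet, or None to drop it."""
        # An idle flow does not bank credit: departure is never in the past.
        departure = max(now_ns, self.next_departure_ns)
        if departure - now_ns > self.horizon_ns:
            return None  # queue would grow past the horizon, so drop
        # Time this packet occupies the link at the configured rate.
        delay_ns = pkt_len * NSEC_PER_SEC // self.rate_bytes_per_sec
        self.next_departure_ns = departure + delay_ns
        return departure
```

With a limit of 1,000,000 bytes per second, three back-to-back 1,000-byte packets arriving at time 0 are stamped 0 ns, 1,000,000 ns, and 2,000,000 ns. In Linux deployments of this pattern, a Traffic Control eBPF program typically writes the timestamp onto the packet and a pacing queue such as `fq` releases it at that time; whether SNAS is wired exactly this way is not stated in the material reproduced here.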

What carries the argument

The three-layer stack of eBPF filtering with bandwidth limiting, GENEVE encapsulation, and distributed egress proxies for policy enforcement.
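To make the encapsulation layer concrete, the following sketch packs the 8-byte fixed GENEVE header defined in RFC 8926. The field layout and UDP port 6081 come from the RFC; the function names and the idea of using the Virtual Network Identifier (VNI) to tell sandboxes apart are assumptions for illustration, since the summary does not say how SNAS assigns identifiers.

```python
import struct

GENEVE_UDP_PORT = 6081  # IANA-assigned destination port (RFC 8926)
ETH_P_TEB = 0x6558      # EtherType for an encapsulated Ethernet frame


def pack_geneve_header(vni: int, protocol_type: int = ETH_P_TEB,
                       opt_len_words: int = 0, oam: bool = False,
                       critical: bool = False) -> bytes:
    """Build the fixed GENEVE header (no options are appended here)."""
    if not 0 <= vni < (1 << 24):
        raise ValueError("VNI must fit in 24 bits")
    if not 0 <= opt_len_words < (1 << 6):
        raise ValueError("option length must fit in 6 bits")
    # Byte 0: 2-bit version (always 0) followed by 6-bit option length,
    # counted in 4-byte words.
    first = (0 << 6) | opt_len_words
    # Byte 1: O (control packet) and C (critical options) flags, then 6 reserved bits.
    flags = (int(oam) << 7) | (int(critical) << 6)
    # Bytes 4-7: 24-bit VNI followed by one reserved byte.
    return struct.pack("!BBHI", first, flags, protocol_type, vni << 8)


def unpack_vni(header: bytes) -> int:
    """Extract the 24-bit VNI from a GENEVE header."""
    return struct.unpack("!I", header[4:8])[0] >> 8
```

Because every tunnel shares the same outer UDP destination port, the number of sandboxes does not change how many outer ports the overlay consumes, which matches the port-exhaustion argument in the figure captions reproduced further down.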

If this is right

  • Sandboxed workloads gain transparent access to external services while policies remain enforced at multiple points.
  • Bandwidth limiting and connection safeguards prevent any single workload from exhausting shared resources.
  • The same architecture supports production workloads across all Snowflake regions, including petabyte-scale data transfer.
  • Dual-tier enforcement separates coarse kernel rules from finer proxy-level decisions.
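The dual-tier split can be pictured with a small Python model. The figure captions reproduced further down state that hostnames are resolved into allowed IP:port tuples and that an eBPF program checks each packet's destination against them; the proxy-side hostname check shown here is an assumption about what a "finer" decision might look like, and all function names and addresses are hypothetical.

```python
import ipaddress


def compile_policy(resolved_rules):
    """Split resolved rules into a kernel-tier set and a proxy-tier set.

    resolved_rules: iterable of (hostname, ip, port) tuples, standing in for
    the output of hostname resolution on the control plane.
    """
    # Tier 1 mimics an eBPF hash map keyed by (destination IP, port).
    kernel = {(int(ipaddress.ip_address(ip)), port) for _, ip, port in resolved_rules}
    # Tier 2 (assumed) keeps the hostname so the proxy can decide per host.
    proxy = {(host.lower(), port) for host, _, port in resolved_rules}
    return kernel, proxy


def kernel_tier_allows(kernel, dst_ip, dst_port):
    """Coarse check: is this exact IP:port tuple in the allow set?"""
    return (int(ipaddress.ip_address(dst_ip)), dst_port) in kernel


def proxy_tier_allows(proxy, hostname, dst_port):
    """Finer check: is the requested hostname allowed on this port?"""
    return (hostname.lower(), dst_port) in proxy


def egress_allowed(kernel, proxy, dst_ip, dst_port, hostname):
    """A connection must pass both tiers to leave the platform."""
    return (kernel_tier_allows(kernel, dst_ip, dst_port)
            and proxy_tier_allows(proxy, hostname, dst_port))
```

The second tier matters when several hostnames share one IP address: a packet to an allowed IP:port passes the kernel check, yet the proxy can still refuse a hostname that no rule names.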

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same layering pattern could be examined for use in other multi-tenant sandbox platforms that face similar egress requirements.
  • The choice of the Earliest Departure Time algorithm for bandwidth control invites direct comparison with alternative fairness schedulers under real cloud traffic.
  • Additional public measurements of isolation strength under adversarial conditions would clarify how far the current safeguards extend.

Load-bearing premise

The layered combination of eBPF, GENEVE, and proxies preserves strict multi-tenant isolation and resource fairness while adding only low overhead.

What would settle it

A documented case of one tenant affecting another's traffic or a measured overhead that visibly slows petabyte-scale transfers would show the claims do not hold.

Figures

Figures reproduced from arXiv: 2606.17533 by Andong Zhan, Bhanu Prakash, Brandon Baker, Choden Konigsmark, Derek Denny-Brown, Elliott Brossard, Haowei Yu, Mike Halcrow, Niranjan Kumar Sharma, Nitya Kumar Sharma, Samy Boshra-Riad, Shawn Xia, S Muralidhar, Yuxiong He.

Figure 1: Snowflake architecture. Snowflake’s architecture [1], as a software-as-a-service (SaaS) solution, employs a multi-cluster shared data model that fundamentally decouples the data storage and compute layers to facilitate independent, elastic resource scaling, as illustrated in …
Figure 3: SNAS Overlay network. … the sandbox, while the other end remains in the compute node namespace serving as the enforcement point. An eBPF program attached to the compute node side veth Traffic Control (TC) egress hook intercepts every packet, validating destination IP:port against the egress policy loaded in eBPF maps. Only authorized packets proceed; violations are logged and blocked immediately. This ensur…
Figure 2: SNAS architecture overview. SNAS is controlled through Network Rules and External Access Integrations [20]. Rules define allowed FQDNs, hostnames, or IP ranges; integrations group rules and secrets into policies referenced by UDFs. For each job, Cloud Services authorizes access, resolves hostnames, compiles allowed IP:port tuples, signs the policy, and distributes it to warehouse nodes and egress proxies…
Figure 5: EDT algorithm. 2) Port Exhaustion Prevention. Our overlay architecture inherently mitigates port exhaustion. Each sandbox operates in an isolated network namespace with independent and limited port space, and GENEVE encapsulation uses a single UDP port regardless of sandbox count. Additional safeguards include port range limitation at egress proxy, aggressive connection timeouts to free resources promptly…
Figure 6: SNAS Policy Agent Logical Architecture.
Figure 7: EDT bandwidth limiter accuracy and multi-tenant fairness. Left: …
Original abstract

Snowpark enables data engineering and AI/ML workloads in Snowflake by executing user-defined functions in secure sandboxes. Many of these workloads require external connectivity to access cloud APIs, external databases, or feature stores, creating a dependability challenge: how to provide transparent network access while preserving strict multi-tenant isolation and resource fairness. This paper presents Secure Network Access in Snowpark (SNAS), a production architecture for secure external communication from sandboxed workloads. SNAS combines Extended Berkeley Packet Filter (eBPF) packet filtering, Generic Network Virtualization Encapsulation (GENEVE) overlay networks, and distributed egress proxies for policy-driven egress control with low overhead. We describe the design, deployment, and measured production behavior of SNAS, including an eBPF-based bandwidth limiter using the Earliest Departure Time (EDT) algorithm, dual-tier policy enforcement, and safeguards for connection limiting and port exhaustion. SNAS is deployed across all Snowflake regions and supports large-scale production workloads including petabyte-scale data transfer and latency-sensitive external integrations.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The paper presents SNAS, a multi-layer defense-in-depth architecture for secure egress from sandboxed Snowpark workloads in Snowflake. It integrates eBPF-based packet filtering and EDT bandwidth limiting, GENEVE overlay networks, and distributed egress proxies with dual-tier policy enforcement and connection safeguards. The system is deployed across all Snowflake regions and supports production workloads including petabyte-scale data transfers while aiming to preserve multi-tenant isolation and resource fairness with low overhead.

Significance. If the production measurements and isolation properties hold, the work provides a concrete, deployed example of combining kernel-level mechanisms (eBPF) with overlay networking for policy-driven egress control in large-scale multi-tenant sandboxes. The explicit use of the EDT algorithm for bandwidth limiting and the dual-tier enforcement are practical contributions that could inform similar systems in cloud environments.

minor comments (2)
  1. [§3.2] The interaction between the eBPF EDT limiter and the GENEVE encapsulation layer is described only at a high level; a block diagram or timing diagram would clarify whether encapsulation occurs before or after the departure-time scheduling.
  2. [Table 2] The reported overhead percentages lack error bars and the number of trials; adding these would strengthen the low-overhead claim.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive review and recommendation of minor revision. The assessment correctly identifies the practical contributions of combining eBPF mechanisms with overlay networking for egress control in multi-tenant sandboxes. No major comments were provided in the report.

Circularity Check

0 steps flagged

No significant circularity


The manuscript is a high-level system architecture and deployment paper. It describes the combination of eBPF, GENEVE, and proxies, reports production measurements, and states deployment facts. No equations, derivations, fitted parameters presented as predictions, or load-bearing self-citations appear in the text. All claims rest on direct operational evidence rather than any reduction to prior inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only; no free parameters, axioms, or invented entities are described or can be extracted.

pith-pipeline@v0.9.1-grok · 5770 in / 1009 out tokens · 29942 ms · 2026-06-27T00:38:35.532516+00:00 · methodology


Reference graph

Works this paper leans on

29 extracted references

  1. [1] B. Dageville, T. Cruanes, M. Zukowski, V. Antonov, A. Avanes, J. Bock, J. Claybaugh, D. Engovatov, M. Hentschel, J. Huang, A. W. Lee, A. Motivala, A. Q. Munir, S. Pelley, P. Povinec, G. Rahn, S. Triantafyllis, and P. Unterbrunner, “The Snowflake elastic data warehouse,” in Proceedings of the ACM SIGMOD International Conference on Management of Data, 201...
  2. [2] B. Baker, E. Brossard, C. Xie, Z. Ye, D. Liu, Y. Xie, A. Zwiegincew, N. K. Sharma, G. Jain, E. Retunsky, M. Halcrow, D. Denny-Brown, I. Cseri, T. Akidau, and Y. He, “Snowpark: Performant, secure, user-friendly data engineering and AI/ML next to your data,” in Proceedings of the IEEE International Conference on Distributed Computing Systems (ICDCS), 2025.
  3. [3] Snowflake Inc., “Snowflake external functions,” https://docs.snowflake.com/en/sql-reference/external-functions-introduction, 2024.
  4. [4] E. Dumazet, V. Jacobson, and Y. Cheng, “TCP small queues and earliest departure time,” in Linux Plumbers Conference, 2018.
  5. [5] B. Alhijawi, S. Almajali, H. Elgala, H. A. Bany Salameh, and M. Ayyash, “A survey on DoS/DDoS mitigation techniques in SDNs,” Computers & Electrical Engineering, vol. 99, p. 107706, 2022.
  6. [6] Isovalent, “Cilium: eBPF-based networking, observability, security,” https://cilium.io/, 2024.
  7. [7] E. G. Young, P. Zhu, T. Caraza-Harter, A. C. Arpaci-Dusseau, and R. H. Arpaci-Dusseau, “The true cost of containing: A gVisor case study,” in USENIX HotCloud, 2019.
  8. [8] A. Wool, “A quantitative study of firewall configuration errors,” IEEE Computer, vol. 37, no. 6, pp. 62–67, 2004.
  9. [9] L. Yuan, J. Mai, Z. Su, H. Chen, C.-N. Chuah, and P. Mohapatra, “Fireman: A toolkit for firewall modeling and analysis,” in IEEE Symposium on Security and Privacy, 2006.
  10. [10] S. Jain et al., “B4: Experience with a globally-deployed software defined WAN,” in ACM SIGCOMM, 2013.
  11. [11] A. Demers, S. Keshav, and S. Shenker, “Analysis and simulation of a fair queueing algorithm,” in Proceedings of the ACM SIGCOMM Conference, 1989.
  12. [12] K. Nichols and V. Jacobson, “Controlling queue delay,” ACM Queue, vol. 10, no. 5, 2012.
  13. [13] M. Mahalingam, D. Dutt, K. Duda, P. Agarwal, L. Kreeger, T. Sridhar, M. Bursell, and C. Wright, “VXLAN: A framework for overlaying virtualized layer 2 networks over layer 3 networks,” RFC 7348, 2014.
  14. [14] J. Gross, I. Ganga, and T. Sridhar, “Geneve: Generic network virtualization encapsulation,” IETF, RFC 8926, 2020.
  15. [15] A. Agache, M. Brooker, A. Florescu, A. Iordache, A. Liguori, R. Neugebauer, P. Piwonka, and D.-M. Popa, “Firecracker: Lightweight virtualization for serverless applications,” in USENIX NSDI, 2020.
  16. [16] I. E. Akkus, R. Chen, I. R. Wang, S. D. Gribble, and D. Kostic, “Sand: Towards high-performance serverless computing,” in Proceedings of the USENIX Annual Technical Conference (ATC), 2018.
  17. [17] Snowflake Inc., “Python and Java support for serverless tasks,” https://docs.snowflake.com/en/user-guide/tasks-python-jvm, 2024.
  18. [18] G. Jain, B. Baker, J. Yin, C. Xie, Z. Ye, S. Kulkarni, S. Abdelrahman, N. Qi, U. Shrestha, M. Halcrow, D. Bailey, and Y. He, “See++: Evolving Snowpark execution environment for modern workloads,” in IEEE BigData, 2025.
  19. [19] Linux Kernel Community, “The Linux kernel documentation – BPF,” https://www.kernel.org/doc/html/latest/bpf/, 2024.
  20. [20] Snowflake Inc., “Snowflake external network access overview,” https://docs.snowflake.com/en/developer-guide/external-network-access/external-network-access-overview, 2024.
  21. [21] Snowflake Inc., “Snowflake architecture and key concepts,” https://docs.snowflake.com/en/user-guide/intro-key-concepts, 2024.
  22. [22] S. Miano et al., “Creating complex network services with eBPF,” in IEEE HPSR, 2018.
  23. [23] T. Høiland-Jørgensen et al., “The express data path: Fast programmable packet processing in the OS kernel,” in ACM CoNEXT, 2018.
  24. [24] Facebook/Meta, “Katran: A high-performance layer 4 load balancer,” https://github.com/facebookincubator/katran, 2018.
  25. [25] D. J. Bernstein, N. Duif, T. Lange, P. Schwabe, and B.-Y. Yang, “High-speed high-security signatures,” Journal of Cryptographic Engineering, vol. 2, no. 2, pp. 77–89, 2012.
  26. [26] B. Gregg, BPF Performance Tools: Linux System and Application Observability. Addison-Wesley Professional, 2019.
  27. [27] J. M. Hellerstein, J. M. Faleiro, J. Gonzalez, J. Schleier-Smith, V. Sreekanti, A. Tumanov, and C. Wu, “Serverless computing: One step forward, two steps back,” in Proceedings of the 9th Biennial Conference on Innovative Data Systems Research (CIDR), 2019.
  28. [28] P. Lewis, E. Perez, A. Piktus, F. Petroni, V. Karpukhin, N. Goyal, H. Küttler, M. Lewis, W.-t. Yih, T. Rocktäschel, S. Riedel, and D. Kiela, “Retrieval-augmented generation for knowledge-intensive NLP tasks,” in Advances in Neural Information Processing Systems (NeurIPS), 2020.
  29. [29] A. Arjona, P. García-López, J. Sampé, A. Slominski, and L. Villard, “Triggerflow: Trigger-based orchestration of serverless workflows,” Future Generation Computer Systems, vol. 124, pp. 215–229, 2021.