Public-Decay Homomorphic State Space Models for Private Sequence Inference

Luis Brito

arxiv: 2605.16647 · v1 · pith:RIROQFNGnew · submitted 2026-05-15 · 💻 cs.CR · cs.LG

Pith reviewed 2026-05-20 16:05 UTC · model grok-4.3

keywords: homomorphic encryption, state space models, private sequence inference, fully homomorphic encryption, sentiment classification, fastText embeddings, encrypted computation, public decay

The pith

Public-decay homomorphic state space models match plaintext classifications on encrypted sentiment tasks while running about five times faster than HE-friendly polynomial attention.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces public-decay homomorphic state space models that keep a fixed encrypted state and update it through ciphertext-plaintext operations instead of full ciphertext multiplications. This design separates client-side tokenization and encryption from server-side evaluation over bounded features, allowing sequence inference under fully homomorphic encryption without heavy bootstrapping. On full Rotten Tomatoes and SST-2 validation sets the encrypted path exactly reproduces plaintext classifications at 0.7505 and 0.7420 accuracy. The same workloads run roughly five times faster than HE-friendly polynomial attention while using lower multiplicative depth and smaller ring sizes.

Core claim

Public-decay homomorphic state space models carry a fixed encrypted state that is updated by public ciphertext-plaintext decay on a local write path, while all ciphertext-ciphertext work stays minimal. This yields exact plaintext-matching classifications on the complete Rotten Tomatoes and SST-2 validation splits at 0.7505 and 0.7420 accuracy, with approximately 5x lower latency than polynomial attention on identical fastText workloads and lower logical state footprint.

What carries the argument

Public-decay operation that updates the fixed encrypted state through ciphertext-plaintext multiplication while confining ciphertext-ciphertext multiplication to the local write path.
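A minimal plaintext sketch can make the split between the two paths concrete. It assumes a linear carry of the form h_t = a ⊙ h_{t-1} + w_t, where a is a public plaintext vector and the write term w_t is an elementwise product of two token-derived vectors. That gating form, and the names `hssm_step`, `run_hssm`, `key`, and `value`, are illustrative assumptions rather than the paper's definitions. Plain floats stand in for ciphertexts, and the counters only tally which kind of multiplication each path would use.

```python
def hssm_step(state, decay, key, value, counts):
    """One recurrent step: public decay on the carry, local product on the write path."""
    # Carry path: multiply the state by a PUBLIC plaintext vector
    # (stands in for a ciphertext-plaintext multiplication).
    carried = [a * h for a, h in zip(decay, state)]
    counts["ct_pt_mul"] += 1
    # Write path: product of two token-derived vectors (assumed form;
    # stands in for the ciphertext-ciphertext multiplication).
    write = [k * v for k, v in zip(key, value)]
    counts["ct_ct_mul"] += 1
    return [c + w for c, w in zip(carried, write)]


def run_hssm(keys, values, decay):
    """Fold a whole sequence into one fixed-size state and tally multiplications."""
    counts = {"ct_pt_mul": 0, "ct_ct_mul": 0}
    state = [0.0] * len(decay)
    for key, value in zip(keys, values):
        state = hssm_step(state, decay, key, value, counts)
    return state, counts


# Toy inputs: three tokens, state dimension two.
decay = [0.5, 0.9]
keys = [[1.0, 1.0], [1.0, -1.0], [0.5, 0.5]]
values = [[1.0, 0.5], [0.5, 1.0], [1.0, 1.0]]
final_state, op_counts = run_hssm(keys, values, decay)
print(final_state, op_counts)
```

The state keeps the same length no matter how many tokens are folded in, which is the property the fixed-state claim relies on. The sketch says nothing about levels, rescaling, or noise in a real CKKS-style backend.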

If this is right

Encrypted sequence inference becomes practical for sentiment tasks without full-sequence polynomial attention costs.
In paired L40S operation-level rows, latency is 1.34-1.62x lower than cached final-token polynomial attention and 30-258x lower than full-sequence polynomial attention.
In the T = 16/32 comparator, projected HSSM succeeds under depth 8 and ring size 32768, while projected attention succeeds under depth 10 and ring size 65536.
Client-server separation works cleanly for frozen fastText features with clipping and thresholding on the client.
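The client-side half of that separation can be sketched as follows. The abstract lists the steps (tokenization, frozen fastText lookup, projection, clipping, encryption, decryption, thresholding) but not their parameters, so the embedding table, projection matrix, clip bound, and threshold below are hypothetical toy values. Encryption and decryption are left out entirely.

```python
# Hypothetical toy values; a real client would load frozen fastText vectors.
EMBEDDINGS = {
    "good": [0.8, 0.1, -0.2],
    "bad": [-0.9, 0.3, 0.4],
    "film": [0.1, 0.0, 0.2],
}
UNKNOWN = [0.0, 0.0, 0.0]                          # fallback for unseen tokens
PROJECTION = [[1.0, 0.0, -1.0], [0.5, 2.0, 0.0]]   # assumed 3 -> 2 projection
CLIP_BOUND = 1.0                                   # assumed feature bound


def tokenize(text):
    """Whitespace tokenizer, standing in for the real client tokenizer."""
    return text.lower().split()


def project(vec):
    """Apply the public linear projection to one embedding vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in PROJECTION]


def clip(vec, bound=CLIP_BOUND):
    """Clamp every coordinate into [-bound, bound]."""
    return [max(-bound, min(bound, x)) for x in vec]


def client_prepare(text):
    """Tokenize, look up, project, and clip. Encryption would follow this step."""
    return [clip(project(EMBEDDINGS.get(tok, UNKNOWN))) for tok in tokenize(text)]


def client_threshold(decrypted_score, threshold=0.0):
    """Turn the decrypted server output into a binary label."""
    return 1 if decrypted_score > threshold else 0


features = client_prepare("Good film")
print(features)
```

Under this reading, the server only ever sees encryptions of coordinates that already lie inside the clip interval, which is what "bounded projected features" would amount to.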

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The approach may scale to other bounded-projection tasks such as topic classification if public decay continues to control noise.
Designers of future FHE models could prioritize public operations to keep multiplicative depth below bootstrapping thresholds.
Testing on datasets with longer average sequences would directly check whether the fixed-state assumption holds beyond current lengths.

Load-bearing premise

Public decay between ciphertext and plaintext preserves both semantic content and cryptographic security over full sequence lengths without extra bootstrapping or unacceptable noise growth.

What would settle it

A measurable drop in accuracy or detectable noise accumulation when the same model processes sequences longer than the tested Rotten Tomatoes and SST-2 lengths would show the public-decay assumption fails.
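A length sweep of this kind can be prototyped in plaintext before any encrypted run. The sketch below uses a toy model in which each step adds independent Gaussian error to a scalar carry. That noise model, the noise level, and the helper names `carry` and `deviation_sweep` are assumptions for illustration. It does not reproduce the behaviour of any FHE backend.

```python
import random


def carry(decay, writes, noises):
    """Scalar carry h_t = decay * h_{t-1} + w_t + e_t, starting from h_0 = 0."""
    state = 0.0
    for w, e in zip(writes, noises):
        state = decay * state + w + e
    return state


def deviation_sweep(decay, lengths, noise_std=1e-3, trials=100, seed=0):
    """Worst absolute gap between noisy and noiseless carry, per sequence length."""
    rng = random.Random(seed)
    worst = {}
    for length in lengths:
        gap = 0.0
        for _ in range(trials):
            writes = [rng.uniform(-1.0, 1.0) for _ in range(length)]
            noises = [rng.gauss(0.0, noise_std) for _ in range(length)]
            noisy = carry(decay, writes, noises)
            exact = carry(decay, writes, [0.0] * length)
            gap = max(gap, abs(noisy - exact))
        worst[length] = gap
    return worst


sweep = deviation_sweep(decay=0.9, lengths=[8, 32, 128, 512])
for length, gap in sweep.items():
    print(f"T={length:4d}  worst deviation={gap:.2e}")
```

Rerunning the sweep with `decay=1.0` shows how the gap behaves when the carry does not contract. The settling experiment would replace the toy noise with measurements from the encrypted path.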

Figures

Figures reproduced from arXiv: 2605.16647 by Luis Brito.

**Figure 2.** Model-level encrypted memory scaling. HSSM retains one recurrent state ciphertext, …

**Figure 3.** Context-length stress sweep under a fixed FIDESlib CUDA profile, with the adaptive …

Original abstract

Fully homomorphic encryption (FHE) changes sequence-model design because rotations, encrypted products, ciphertext materialization, multiplicative depth, and bootstrapping pressure can dominate ordinary neural-network costs. This paper presents public-decay homomorphic state space models (HSSMs), recurrent/state-space blocks whose carried state is updated through ciphertext-plaintext public decay while ciphertext-ciphertext multiplication remains on a local write path. The design keeps a fixed encrypted state across the sequence. The evaluated workflow separates client-side tokenization, frozen fastText lookup, projection, clipping, encryption, decryption, and thresholding from server-side encrypted evaluation over bounded projected features. On full Rotten Tomatoes and SST-2 validation splits, the encrypted HSSM path exactly matches plaintext classifications and reaches 0.7505 and 0.7420 accuracy. Against HE-friendly polynomial attention on the same fastText workloads, HSSM matches or exceeds full-sequence task quality while running about 5x faster. Paired L40S operation-level rows show 1.34-1.62x lower latency than cached final-token polynomial attention, 30-258x lower latency than full-sequence polynomial attention, and lower logical encrypted-state footprint. A T = 16/32 comparator with encrypted public-linear input and Q/K/V projections shows projected HSSM succeeding under depth 8/ring 32768, while projected attention succeeds under depth 10/ring 65536. A matched T = 8 OpenFHE/FIDESlib trace finishes at final level 3 and noise-scale degree 2 on both backends. These results make public-decay carry a practical FHE co-design lever for encrypted sequence inference from bounded projected features.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

This paper gives a public-decay carry for homomorphic state space models that keeps the encrypted state fixed and reports exact accuracy parity plus 5x speedups over polynomial attention on the tested sentiment tasks.

The main point is that this work shows how to run state space models under FHE by updating a fixed encrypted state through public decay operations while keeping ciphertext-ciphertext multiplications on a local path. That construction appears new relative to the polynomial attention baselines and lets them hold a constant state footprint across the sequence.

On the full Rotten Tomatoes and SST-2 validation sets the encrypted path matches plaintext accuracy exactly at 0.7505 and 0.7420, and they measure roughly 5x lower latency than the HE-friendly polynomial attention baseline on the same fastText workloads, plus lower numbers against both cached and full-sequence attention variants on L40S hardware. The depth-8 / ring-32768 result for T=16/32 and the matched OpenFHE trace ending at level 3 with noise-scale degree 2 are useful concrete data points. The client-server split with frozen embeddings and bounded projections is laid out clearly enough to see where the savings come from. The public-decay idea is the part that stands out as a practical co-design lever rather than another generic polynomial replacement.

The soft spots sit mainly in the verification details. The abstract gives no per-step noise-growth formula, no explicit bound on the decay coefficient, and no error bars or projection ablations, so the claim that depth stays constant for arbitrary lengths rests on the small-T traces and the bounded-projection assumption. The stress-test worry about linear noise growth with sequence length is reasonable to check in the full text; if the paper only demonstrates it for T up to 32 then longer sequences would need separate confirmation. No formal security argument is summarized either.

This is for people working on FHE sequence inference who want alternatives to attention that keep state size and depth manageable. A reader already familiar with OpenFHE or similar libraries would get the most out of the latency and depth numbers. It deserves a serious referee because the empirical comparisons are direct and the carry mechanism looks like a real addition worth testing and tightening.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces public-decay homomorphic state space models (HSSMs) for private sequence inference under fully homomorphic encryption. The core design updates a fixed encrypted state via ciphertext-plaintext public decay operations while restricting ciphertext-ciphertext multiplications to a local write path. Using frozen fastText features with client-side tokenization, projection, clipping, and encryption, the encrypted HSSM achieves exact accuracy parity with plaintext on the full Rotten Tomatoes and SST-2 validation splits (0.7505 and 0.7420). It reports roughly 5x speedup over HE-friendly polynomial attention on the same workloads, with 1.34-1.62x lower latency than cached final-token attention and 30-258x lower latency than full-sequence attention, plus a smaller logical encrypted-state footprint. The approach is shown to succeed under depth 8/ring 32768 for T=16/32, with a matched T=8 OpenFHE/FIDESlib trace ending at level 3 and noise-scale degree 2.

Significance. If the public-decay carry mechanism maintains both semantic fidelity and cryptographic security without unacceptable noise growth or extra bootstrapping across arbitrary sequence lengths, the work offers a practical FHE co-design lever that reduces multiplicative depth and latency relative to attention-based alternatives. The exact accuracy match on standard datasets, concrete L40S latency rows, and reproducible OpenFHE/FIDESlib trace are verifiable strengths that support the empirical claims. These elements could guide future encrypted sequence model designs, though the absence of formal security arguments and general noise bounds limits broader impact.

major comments (2)

Abstract: The headline result of exact plaintext match at 0.7505/0.7420 accuracy and 5x speedup rests on the fixed encrypted state being updated solely via public decay without bootstrapping or unacceptable noise growth. The abstract states this works under depth 8/ring 32768 for T=16/32 and reports a matched OpenFHE trace ending at level 3 / noise-scale degree 2, but provides no per-step noise-growth formula, no explicit bound on the decay coefficient, and no verification that the bounded-projection clipping keeps the effective multiplicative depth constant across arbitrary sequence lengths. If the public decay is a scalar multiplication or addition whose noise variance grows linearly with T, the final noise could exceed the decryption threshold even if the reported trace for small T succeeds.
Abstract: The T=16/32 comparator claims projected HSSM succeeds under depth 8/ring 32768 while projected attention requires depth 10/ring 65536. Without an explicit accounting of how the encrypted public-linear input and Q/K/V projections interact with the public-decay updates, it is unclear whether the reported depth advantage generalizes or depends on parameter choices specific to the evaluated workloads.
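The noise question in the first major comment can be stated in closed form under a simple model. Assume a scalar error recurrence $e_t = a\,e_{t-1} + \varepsilon_t$ with $e_0 = 0$, a public decay coefficient $a$, and independent zero-mean per-step errors $\varepsilon_t$ of variance $\sigma^2$. These symbols and the additive-noise model are assumptions introduced here; the abstract gives no recurrence or noise formula.

```latex
\begin{align}
  e_T &= \sum_{t=1}^{T} a^{\,T-t}\,\varepsilon_t, \\
  \operatorname{Var}(e_T) &= \sigma^2 \sum_{k=0}^{T-1} a^{2k}
    = \begin{cases}
        \sigma^2\,\dfrac{1 - a^{2T}}{1 - a^{2}}, & |a| \neq 1, \\[2ex]
        T\,\sigma^2, & |a| = 1.
      \end{cases}
\end{align}
```

Under this model the variance is capped by $\sigma^2 / (1 - a^2)$ when $|a| < 1$ and grows linearly in $T$ only when $|a| = 1$, which is why an explicit bound on the decay coefficient matters. Noise in a real CKKS-style scheme also depends on rescaling and scale management, which this model leaves out.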

minor comments (2)

Abstract: The reported accuracies of 0.7505 and 0.7420 are presented without error bars or variance across runs, which would strengthen the claim of exact match to plaintext classifications.
Abstract: The client-server workflow separation (tokenization, frozen fastText lookup, projection, clipping, encryption, decryption, thresholding) is described at a high level; additional detail on how clipping interacts with the encryption step and bounded features would improve reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thorough review and constructive comments. We address the two major comments point by point below. We agree that additional clarification on noise characteristics and depth accounting will improve the manuscript and will revise accordingly.

Referee: Abstract: The headline result of exact plaintext match at 0.7505/0.7420 accuracy and 5x speedup rests on the fixed encrypted state being updated solely via public decay without bootstrapping or unacceptable noise growth. The abstract states this works under depth 8/ring 32768 for T=16/32 and reports a matched OpenFHE trace ending at level 3 / noise-scale degree 2, but provides no per-step noise-growth formula, no explicit bound on the decay coefficient, and no verification that the bounded-projection clipping keeps the effective multiplicative depth constant across arbitrary sequence lengths. If the public decay is a scalar multiplication or addition whose noise variance grows linearly with T, the final noise could exceed the decryption threshold even if the reported trace for small T succeeds.

Authors: We acknowledge the referee's concern. The public-decay update is a fixed-coefficient ciphertext-plaintext multiplication applied to a constant-size encrypted state; because no new ciphertext-ciphertext multiplications are introduced per token and the state dimension is independent of T, the effective multiplicative depth remains constant. Client-side bounded projection and clipping keep feature magnitudes within a fixed interval that prevents linear noise accumulation from exceeding the decryption threshold. The provided OpenFHE/FIDESlib trace for T=8 (final level 3, noise-scale degree 2) supplies empirical confirmation for the evaluated regime. While the current manuscript does not include a closed-form per-step noise formula, the design invariants ensure bounded growth. We will add a concise explanation of these invariants and the clipping role to the revised abstract and a short supplementary note.
revision: yes

Referee: Abstract: The T=16/32 comparator claims projected HSSM succeeds under depth 8/ring 32768 while projected attention requires depth 10/ring 65536. Without an explicit accounting of how the encrypted public-linear input and Q/K/V projections interact with the public-decay updates, it is unclear whether the reported depth advantage generalizes or depends on parameter choices specific to the evaluated workloads.

Authors: The depth reduction is structural rather than workload-specific. Both comparators use identical encrypted public-linear input and projection steps; the HSSM advantage stems from confining all ciphertext-ciphertext multiplications to the single local write path while the fixed-size state is updated via public operations. Polynomial attention, by contrast, incurs additional multiplications for Q/K/V and score computation that scale with sequence length. Because the HSSM state size and write-path multiplications are independent of T, the depth saving (depth 8 vs. 10, ring 32768 vs. 65536) generalizes to other lengths under the same projection bounds. We will insert a brief clarifying paragraph on this interaction in the revised manuscript.
revision: yes

Circularity Check

0 steps flagged

No circularity: empirical accuracy and latency claims rest on direct dataset evaluation

The paper reports concrete accuracy numbers (0.7505 on Rotten Tomatoes, 0.7420 on SST-2) obtained by running the encrypted HSSM path on full public validation splits and comparing against plaintext classifications. These are external empirical measurements, not quantities defined in terms of fitted parameters or prior self-referential equations. The design description (fixed encrypted state, public-decay updates, bounded projections) is presented as an engineering choice whose correctness is verified by OpenFHE traces rather than derived from any ansatz or uniqueness theorem internal to the work. No load-bearing step reduces to a self-definition or a prediction that is statistically forced by the inputs.
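The difference between "same accuracy" and "exactly matching classifications" is easy to miss, so a small check may help. The labels and predictions below are made-up toy data, and `accuracy` and `exact_match` are hypothetical helper names. Nothing here comes from the paper's evaluation code.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that equal the gold labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def exact_match(encrypted_preds, plaintext_preds):
    """True only if both paths give the same label on every example."""
    return len(encrypted_preds) == len(plaintext_preds) and all(
        e == p for e, p in zip(encrypted_preds, plaintext_preds)
    )


# Toy data (hypothetical): eight examples with binary sentiment labels.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
plaintext_preds = [1, 0, 0, 1, 0, 1, 1, 0]
encrypted_preds = list(plaintext_preds)       # encrypted path agrees everywhere
other_preds = [1, 0, 1, 1, 0, 0, 0, 1]        # same accuracy, different errors

print(accuracy(plaintext_preds, labels), accuracy(other_preds, labels))
print(exact_match(encrypted_preds, plaintext_preds))
print(exact_match(other_preds, plaintext_preds))
```

Exact per-example agreement implies equal accuracy, but equal accuracy alone does not imply agreement. The paper's claim is the stronger of the two.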

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim rests on standard FHE noise-growth assumptions and the novel public-decay design element; no numeric free parameters are fitted in the reported results.

axioms (1)

Domain assumption: Fully homomorphic encryption supports the required ciphertext-plaintext multiplications and additions without excessive noise growth for the sequence length under the chosen depth and ring size.
Invoked by the claim that the encrypted HSSM path exactly matches plaintext classifications at the stated depths.

invented entities (1)

public-decay carry mechanism (no independent evidence)
purpose: To update the carried state through ciphertext-plaintext public decay while keeping a fixed encrypted state across the sequence.
New design primitive introduced to address FHE cost pressures on rotations and multiplicative depth.

