Title resolution pending

Joel Coburn, Chunqiang Tang, Sameer Abu Asal, Neeraj Agrawal, Raviteja Chinta, Harish Dixit, Brian Dodds, Saritha Dwarakapuram, Amin Firoozshahian, Cao Gao, Kaustubh Gondkar, Tyler Graf, Junhan Hu, Jian Huang, Sterling Hughes, Adam Hutchin · 2025 · arXiv 5053.373140

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

read on arXiv browse 3 citing papers

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

representative citing papers

Bifrost: Hybrid TEE-FHE Inference for Privacy-Preserving Transformer and LLM Serving

cs.CR · 2026-06-16 · unverdicted · novelty 7.0 · 2 refs

Bifrost achieves significant latency reductions in privacy-preserving transformer inference through a hybrid CPU TEE and accelerator FHE design, with Bifrost+ further optimizing via prefill/decode split.

WHET: Welding Homomorphic Encryption to Accelerator Architectures

cs.CR · 2026-06-10 · unverdicted · novelty 6.0 · 2 refs

WHET applies fine-grained coefficient-to-slot transforms, plaintext compression, and modulus raising plus lightweight hardware tweaks to FHE accelerators, delivering 1.38-8.74x per-area gains and sub-millisecond CKKS bootstrapping.

SpenseGPT: Practical One-shot Pruning Enabling Sparse and Dense GEMMs for LLM Inference

cs.LG · 2026-06-09 · unverdicted · novelty 6.0

SpenseGPT introduces a hybrid sparse-dense weight format and one-shot pruning that delivers 1.2x end-to-end LLM decoding speedup on B200 GPUs with FP8 while preserving accuracy on Qwen3-32B and Seed-OSS-36B.

citing papers explorer

Showing 2 of 2 citing papers after filters.

Bifrost: Hybrid TEE-FHE Inference for Privacy-Preserving Transformer and LLM Serving cs.CR · 2026-06-16 · unverdicted · none · ref 15 · 2 links
Bifrost achieves significant latency reductions in privacy-preserving transformer inference through a hybrid CPU TEE and accelerator FHE design, with Bifrost+ further optimizing via prefill/decode split.
WHET: Welding Homomorphic Encryption to Accelerator Architectures cs.CR · 2026-06-10 · unverdicted · none · ref 39 · 2 links
WHET applies fine-grained coefficient-to-slot transforms, plaintext compression, and modulus raising plus lightweight hardware tweaks to FHE accelerators, delivering 1.38-8.74x per-area gains and sub-millisecond CKKS bootstrapping.

Title resolution pending

fields

years

verdicts

representative citing papers

citing papers explorer