FlexServe achieves up to 10x faster time-to-first-token for secure LLM inference on mobile devices by using flexible resource isolation in TrustZone compared to standard approaches.
The ai companion who cares always here to listen and talk.https://replika.ai/, 2026
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.CR 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
FlexServe: A Fast and Secure LLM Serving System for Mobile Devices with Flexible Resource Isolation
FlexServe achieves up to 10x faster time-to-first-token for secure LLM inference on mobile devices by using flexible resource isolation in TrustZone compared to standard approaches.