archive

Every paper Pith has read.

89 papers in cs.OS · page 1

cs.OS 2026-05-21 reviewed

DeltaBox cuts AI agent checkpoint and rollback to 14 ms and 5 ms
DeltaBox: Scaling Stateful AI Agents with Millisecond-Level Sandbox Checkpoint/Rollback

Yunpeng Dong +9
cs.OS 2026-05-20 reviewed

ParaCell cuts secure container latency up to 88%
ParaCell: Paravirtualized Secure Containers with Lightweight Intra-Container Isolation and Intent-Driven Memory Management

Yiyang Wu +3
cs.OS 2026-05-19 reviewed

Managed runtime extension cuts CXL slowdown by 22-84%
Clove: Object-Level CXL Memory Management in Managed Runtimes

Sam Son +4
cs.OS 2026-05-19 reviewed

SSV lifts LLM inference speed up to 3.49x
SSV: Sparse Speculative Verification for Efficient LLM Inference

Zhibin Wang +5
cs.OS 2026-05-19 reviewed

SpecSA resolves mismatch to speed long-context LLM inference
SSV: Sparse Speculative Verification for Efficient LLM Inference

Zhibin Wang +5
cs.OS 2026-05-19 reviewed

C2CServe cuts LLM cold-start latency up to 7.1x on GH200
C2CServe: Leveraging NVLink-C2C for Elastic Serverless LLM Serving on MIG

Shutian Luo +6
cs.OS 2026-05-18 reviewed

LLM semantics from names predict load phases and cut storage overloads 79%
TIDAL: Recovering Temporal Phase for Cloud Block Storage Placement from LLM-Derived Semantics

Difan Tan +4
cs.OS 2026-05-18 reviewed

Vector search cuts SSD reads by verifying attributes after retrieval
PipeANN-Filter: An Efficient Filtered Vector Search System on SSD

Hao Guo +2
cs.OS 2026-05-17 reviewed

TClone forks live GUI workspaces at low latency for agents
TClone: Low-Latency Forking of Live GUI Environments for Computer-Use Agents

Yutong Huang +4
cs.AI 2026-05-15 reviewed

Skim cuts web-agent cost 1.9x and latency 33% with no accuracy loss
Skim: Speculative Execution for Fast and Efficient Web Agents

Mike Wong +3
cs.OS 2026-05-14 reviewed

LLM tunes Linux knobs for 72 percent stable gain over defaults
SemaTune: Semantic-Aware Online OS Tuning with Large Language Models

Georgios Liargkovas +3
cs.SE 2026-05-12 reviewed

Harness design stabilizes small language models at 95 percent success
It's Not the Size: Harness Design Determines Operational Stability in Small Language Models

Yong-eun Cho
cs.AR 2026-05-10 reviewed

KV-cache movement regularization cuts static-graph LLM latency spikes
KV-RM: Regularizing KV-Cache Movement for Static-Graph LLM Serving

Zhiqing Zhong +5
cs.CR 2026-05-07 reviewed

Virtualization hardware isolates Linux kernel parts with no code changes
Pomegranate: A Lightweight Compartmentalization Architecture using Virtualization Extensions

Shriram Raja +2
cs.SE 2026-05-06 reviewed

Case study maps SIL rules and memory limits in real car software
Shedding Light onto Safety Integrity Level and Basic Software Constraints in a Real-World Automotive Application: Case Study with Driverator Framework

Tobias Denzinger (CARIAD SE) +2
cs.OS 2026-05-05 reviewed

Pub/sub smart pointer limits reference updates to 0-1 per subscriber
ipc_shared_ptr: A Publish/Subscribe-Aware Smart Pointer for Cross-Process Object Lifetime Management

Takahiro Ishikawa-Aso +4
cs.OS 2026-05-05 reviewed

GPU-centric store makes SSD KV cache match DRAM speed
Tutti: Making SSD-Backed KV Cache Practical for Long-Context LLM Serving

Shi Qiu +8
cs.OS 2026-05-04 reviewed

Three-tier API governs urban sensor data with privacy tiers
CityOS: Privacy Architecture for Urban Sensing

Giorgio Cavicchioli +7
cs.DC 2026-05-02 reviewed

CvxCluster uses a two-stage convex optimization approach to allocate resources across…
CvxCluster: Solving Large, Complex, Granular Resource Allocation Problems 100-1000x Faster

Obi Nnorom Jr +2
cs.OS 2026-05-02 reviewed

VUDA delivers 85% higher throughput via CUDA-Vulkan spatial sharing
VUDA: Breaking CUDA-Vulkan Isolation for Spatial Sharing of Compute and Graphics on the Same GPU

Bin Xu +4
cs.DC 2026-05-01 reviewed

Workflow scheduling cuts AI agent task time by 1.64x
SAGA: Workflow-Atomic Scheduling for AI Agent Inference on GPU Clusters

Dongxin Guo +2
cs.OS 2026-04-30 reviewed

Agent sandboxes hit 100% recovery correctness at 87% less traffic
Crab: A Semantics-Aware Checkpoint/Restore Runtime for Agent Sandboxes

Tianyuan Wu +4
cs.OS 2026-04-30 reviewed

Affinity hints give 12% throughput boost on chiplet servers
Affinity Tailor: Dynamic Locality-Aware Scheduling at Scale

Jin Xin Ng +9
cs.OS 2026-04-30 reviewed

WebAssembly capsules run updatable code on tiny microcontrollers
treVM: Tiny Rust Embedded Virtual Machines with WASM on Variable Resource-Constrained Hardware

Antoine Lavandier +3
cs.OS 2026-04-28 reviewed

Rust matches C performance for industrial microcontroller firmware
Embedded Rust or C Firmware? Lessons from an Industrial Microcontroller Use Case with Ariel OS

Bipin Thapa +6
cs.OS 2026-04-28 reviewed

Rust matches C on microcontroller firmware size and speed
Embedded Rust or C Firmware? Lessons from an Industrial Microcontroller Use Case with Ariel OS

Bipin Thapa +6
cs.NI 2026-04-24 reviewed

Tenant protocols match fixed-stack speed with isolation
Chamelio: A Fast Shared Cloud Network Stack for Isolated Tenant-Defined Protocols

Matheus Stolet +2
cs.DC 2026-04-21 reviewed

Local cost signal lifts satellite goodput 20% and throughput 31%
Equinox: Decentralized Scheduling for Hardware-Aware Orbital Intelligence

Ansel Kaplan Erol +1
cs.CR 2026-04-21 reviewed

GAAP enforces user data permissions for AI agents deterministically
An AI Agent Execution Environment to Safeguard User Data

Robert Stanley +4
cs.DC 2026-04-21 reviewed

CXL single-copy cache yields 5.6X geo-mean speedup
DPC: A Distributed Page Cache over CXL

Shai Bergman +6
eess.SY 2026-04-21 reviewed

PREEMPT_RT cuts UAV control latency by 88 percent on Raspberry Pi 5
Scheduling Analysis of UAV Flight Control Workloads on PREEMPT_RT Linux Using a Raspberry Pi 5

Luiz Giacomossi +4
cs.CR 2026-04-20 reviewed

Confidential VMs run LLM agents securely on edge devices
AgenTEE: Confidential LLM Agent Execution on Edge Devices

Sina Abdollahi +7
cs.OS 2026-04-20 reviewed

Processes and pipes made lightweight for far memory accelerators
Proxics: an efficient programming model for far memory accelerators

Zikai Liu +5
cs.DC 2026-04-20 reviewed

Persistent GPU kernel yields 15x speedup for tiny tensor operations
GPUOS: A GPU Operating System Primitive for Transparent Operation Fusion

Yiwei Yang +5
cs.CR 2026-04-18 reviewed

Kernel gateway blocks AI tool-call bypasses
Governed MCP: Kernel-Level Tool Governance for AI Agents via Logit-Based Safety Primitives

Daeyeon Son
cs.OS 2026-04-15 reviewed

Filesystem lets AI agents self-correct file mistakes
Don't Let AI Agents YOLO Your Files: Shifting Information and Control to Filesystems for Agent Safety and Autonomy

Shawn Wanxiang Zhong +5
cs.OS 2026-04-14 reviewed

eBPF hooks decide page moves in tiered memory for up to 17% higher throughput
TierBPF: Page Migration Admission Control for Tiered Memory via eBPF

Xi Wang +5
cs.OS 2026-04-14 reviewed

MARS cuts agentic latency by 5.94x via co-scheduling
MARS: Efficient, Adaptive Co-Scheduling for Heterogeneous Agentic Systems

Yifei Wang +10
cs.DC 2026-04-14 reviewed

Periodic framework organizes distributed computing
A Periodic Space of Distributed Computing: Vision & Framework

Mohsen Amini Salehi +7
cs.LG 2026-04-14 reviewed

Physics-informed DLinear forecasts AI data center power more accurately
A Physics-Aware Framework for Short-Term GPU Power Forecasting of AI Data Centers

Mohammad AlShaikh Saleh +4
cs.OS 2026-04-14 reviewed

Hybrid tuning raises tiered memory performance up to 30%
Hybrid Adaptive Tuning for Tiered Memory Systems

Xi Wang +5
cs.OS 2026-04-13 reviewed

Kernel reads one logit to classify AI agent actions
ProbeLogits: Kernel-Level LLM Inference Primitives for AI-Native Operating Systems

Daeyeon Son
cs.OS 2026-04-13 reviewed

Nanvix cuts serverless server needs by 20-100x
Nanvix: A Multikernel OS Design for High-Density Serverless Deployments

Carlos Segarra +6
cs.AI 2026-04-11 reviewed

ClawVM makes LLM agent state residency deterministic
ClawVM: Harness-Managed Virtual Memory for Stateful Tool-Using LLM Agents

Mofasshara Rafique +1
cs.DB 2026-04-10 reviewed

Decoupling vectors from indexes cuts storage by up to 59%
Decoupling Vector Data and Index Storage for Space Efficiency

Yuanming Ren +5
cs.DB 2026-04-10 reviewed

Decoupling vectors from indexes cuts storage by up to 58%
Decoupling Vector Data and Index Storage for Space Efficiency

Yuanming Ren +5
cs.OS 2026-04-10 reviewed

Adaptive quantization cuts mobile LLM cold starts by 4x
EdgeFlow: Fast Cold Starts for LLMs on Mobile Devices

Yongsheng Yan +3
cs.GT 2026-04-09 reviewed

Game orchestrator finds 2.7x more kernel vulnerabilities per budget
VCAO: Verifier-Centered Agentic Orchestration for Strategic OS Vulnerability Discovery

Suyash Mishra
cs.OS 2026-04-09 reviewed

Valve saves 2,170 GPUs by colocating online and offline inference
Valve: Production Online-Offline Inference Colocation with Jointly-Bounded Preemption Latency and Rate

Fangyue Liu +10
cs.CR 2026-04-09 reviewed

Hardware middleware cuts device onboarding latency by 65%
A Hardware-Anchored Privacy Middleware for PII Sharing Across Heterogeneous Embedded Consumer Devices

Aditya Sabbineni +4