{"paper":{"title":"SemaTune: Semantic-Aware Online OS Tuning with Large Language Models","license":"http://creativecommons.org/licenses/by/4.0/","headline":"SemaTune uses language models to reason over OS knob meanings and history, delivering 72.5 percent better stable performance than defaults across 13 workloads.","cross_cats":["cs.AI","cs.PF"],"primary_cat":"cs.OS","authors_text":"Georgios Liargkovas, Hubertus Franke, Kostis Kaffes, Mihir Nitin Joshi","submitted_at":"2026-05-14T16:25:32Z","abstract_excerpt":"Online OS tuning can improve long-running services, but existing controllers are poorly matched to live hosts. They treat scheduler, power, memory, and I/O controls as black-box variables and optimize a scalar reward. This view ignores cross-knob policy structure, breaks down when application metrics are unavailable, and can send a running service into degraded regions that persist after the bad setting is removed. We present SemaTune, a host-side framework for steady-state OS tuning with bounded language-model guidance. 
SemaTune turns knob schemas, telemetry, current configuration, recent act"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"Across the suite, SemaTune improves stable-phase performance by 72.5% over default settings and by 153.3% relative to the strongest non-LLM baseline.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"That the LLM-proposed changes, after passing typed validation, will reliably improve or maintain performance without entering persistent degraded regions, particularly when relying only on host-level metrics.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"SemaTune uses LLM guidance with semantic context to tune up to 41 Linux OS parameters, delivering 72.5% performance gains over defaults and 153.3% over non-LLM baselines on 13 workloads while avoiding degraded states.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"SemaTune uses language models to reason over OS knob meanings and history, delivering 72.5 percent better stable performance than defaults across 13 workloads.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"61162f2d009bafb74525788b6229ec2f9f3730fb0f2aa939c49f0f5c4917a95b"},"source":{"id":"2605.15026","kind":"arxiv","version":1},"verdict":{"id":"bc9eca8f-9aa8-46b8-94e9-e55de716bc27","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-15T02:33:25.577520Z","strongest_claim":"Across the suite, SemaTune improves stable-phase performance by 72.5% over default settings and by 153.3% relative to the strongest non-LLM baseline.","one_line_summary":"SemaTune uses LLM guidance with semantic context to tune up to 41 Linux OS parameters, delivering 72.5% 
performance gains over defaults and 153.3% over non-LLM baselines on 13 workloads while avoiding degraded states.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"That the LLM-proposed changes, after passing typed validation, will reliably improve or maintain performance without entering persistent degraded regions, particularly when relying only on host-level metrics.","pith_extraction_headline":"SemaTune uses language models to reason over OS knob meanings and history, delivering 72.5 percent better stable performance than defaults across 13 workloads."},"references":{"count":93,"sample":[{"doi":"","year":2019,"title":"PhD thesis, Inria Rennes-Bretagne Atlantique, 2019","work_id":"922ffe2a-a89c-4de4-af29-4b04c41367be","ref_index":1,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2023,"title":"Improving storage systems using machine learning. ACM Transactions on Storage, 19(1):1–30, 2023","work_id":"3eb67aeb-e3f3-4f10-ac81-da7bec571b07","ref_index":2,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2021,"title":"A machine learning framework to improve storage system performance","work_id":"3c4e0815-cfa9-4a19-a87b-6be1c2d1e5a3","ref_index":3,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2020,"title":"Cose: Configuring serverless functions using statistical learning","work_id":"fbb09902-b56f-4d18-8710-0575b6511079","ref_index":4,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2017,"title":"{CherryPick}: Adaptively unearthing the best cloud configurations for big data 
analytics","work_id":"7dba20d7-db86-4d8b-9095-fb8285427e9d","ref_index":5,"cited_arxiv_id":"","is_internal_anchor":false}],"resolved_work":93,"snapshot_sha256":"2cfe460f075268b452cfe7c11c8fea1506109a1a9c87485bf394283a32f4bbde","internal_anchors":4},"formal_canon":{"evidence_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}