SafeTune: Mitigating Data Poisoning in LLM Fine-Tuning for RTL Code Generation

2026 · cs.CR · arXiv 2604.27238

1 Pith paper cites this work. Polarity classification is still being indexed.

abstract

As large language models (LLMs) are increasingly fine-tuned for hardware tasks like RTL code generation, the scarcity of high-quality datasets often leads to the use of rapidly assembled or generated training data. These datasets frequently lack security verification and are highly susceptible to data poisoning attacks. Such poisoning can cause models to generate syntactically valid but insecure hardware modules that bypass standard functionality checks. To address this, we present SafeTune, a framework designed to harden LLM-based RTL generation against poisoning, specifically focusing on hardware Trojan (HT) insertion. SafeTune integrates two core components: (i) a Graph Neural Network (GNN) that models structural properties to identify anomalous circuitry patterns during fine-tuning, and (ii) a semantic verification module using text embeddings and an XGBoost classifier to assess prompt security. By coupling structural and semantic knowledge, SafeTune effectively filters poisoned inputs without sacrificing legitimate data. Experimental results demonstrate that SafeTune significantly enhances the robustness and reliability of LLM fine-tuning without requiring modifications to the underlying model architecture.
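
The two-component filtering described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not say how the structural (GNN) and semantic (embedding plus XGBoost) signals are combined, so this sketch drops a sample when either score crosses a threshold. The names `Sample`, `filter_poisoned`, `toy_structural_score`, and `toy_semantic_score` are hypothetical, and the two toy scorers are trivial string checks standing in for trained models, not real detectors.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Sample:
    prompt: str  # natural-language instruction
    rtl: str     # RTL completion paired with the prompt


def filter_poisoned(
    samples: List[Sample],
    structural_score: Callable[[str], float],  # stand-in for a GNN over the circuit structure
    semantic_score: Callable[[str], float],    # stand-in for embeddings + a classifier on the prompt
    structural_threshold: float = 0.5,         # assumed cut-off, not from the paper
    semantic_threshold: float = 0.5,           # assumed cut-off, not from the paper
) -> List[Sample]:
    """Keep only samples that neither scorer flags as suspicious."""
    kept = []
    for sample in samples:
        flagged = (
            structural_score(sample.rtl) >= structural_threshold
            or semantic_score(sample.prompt) >= semantic_threshold
        )
        if not flagged:
            kept.append(sample)
    return kept


# Toy placeholder scorers (hypothetical): simple string checks so the
# sketch runs without any trained model.
def toy_structural_score(rtl: str) -> float:
    return 1.0 if "32'hDEADBEEF" in rtl else 0.0


def toy_semantic_score(prompt: str) -> float:
    return 1.0 if "backdoor" in prompt.lower() else 0.0
```

In practice the two callables would wrap trained models; passing them in as functions keeps the filtering step independent of the model being fine-tuned, which matches the abstract's claim that no architectural changes are needed.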

representative citing papers

CASS-RTL: Correctness-Aware Subspace Steering for RTL Generation with LLMs

cs.PL · 2026-06-04 · unverdicted · novelty 6.0

CASS-RTL identifies correctness-linked attention heads, builds a steering subspace from them, and applies a geometry-aware intervention that raises pass@1/5/10 accuracy 10-20% on VerilogEval and 5% on CVDP across multiple LLMs without retraining or extra labels.
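
For readers unfamiliar with the pass@1/5/10 figures quoted above, the sketch below shows the unbiased pass@k estimator commonly used in code-generation benchmarks. Whether CASS-RTL computes its numbers exactly this way is an assumption; the function name `pass_at_k` and its parameters are illustrative.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the probability that at least one of k samples is correct.

    n: total samples generated for a problem
    c: number of those samples that pass the tests
    k: sample budget being evaluated (k <= n)
    """
    if n - c < k:
        # Fewer than k incorrect samples exist, so any draw of k
        # must include at least one correct sample.
        return 1.0
    # 1 minus the probability that all k drawn samples are incorrect.
    return 1.0 - comb(n - c, k) / comb(n, k)
```

A benchmark score is then the mean of `pass_at_k` over all problems.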

citing papers explorer

Showing 1 of 1 citing paper.

CASS-RTL: Correctness-Aware Subspace Steering for RTL Generation with LLMs

cs.PL · 2026-06-04 · unverdicted · none · ref 18 · internal anchor
