Precise attribute intensity control in large language models via targeted representation editing
2 Pith papers cite this work. Polarity classification is still indexing.
years: 2026 (2)
verdicts: UNVERDICTED (2)
roles: background (1)
polarities: background (1)
representative citing papers
Malleable Prompting reifies subjective preferences from natural language into GUI widgets and modulates LLM token probabilities during decoding to enable controllable generation, with a user study showing improved precision and perceived controllability over standard prompting.
citing papers explorer
- Enabling Performant and Flexible Model-Internal Observability for LLM Inference
DMI-Lib delivers 0.4-6.8% overhead for offline batch LLM inference and ~6% for moderate online serving while exposing rich internal signals across backends, cutting latency overhead 2-15x versus prior observability baselines.
- From Words to Widgets for Controllable LLM Generation
Malleable Prompting reifies subjective preferences from natural language into GUI widgets and modulates LLM token probabilities during decoding to enable controllable generation, with a user study showing improved precision and perceived controllability over standard prompting.