Investigating the impact of quantization methods on the safety and reliability of large language models

Artyom Kharinaev, Viktor Moskvoretskii, Egor Shvetsov, Kseniia Studenikina, Mikhail Bykov, Evgeny Burnaev · 2025 · arXiv 2502.15799

7 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

support 1

representative citing papers

FLIPS: Instance-Fingerprinting for LLMs via Pseudo-random Sequences

cs.LG · 2026-06-02 · unverdicted · novelty 8.0

FLIPS identifies LLM instances with 96% closed-set and 90% open-set accuracy by exploiting biases in generated binary random sequences across 237 instances.

The Defense Trilemma: Why Prompt Injection Defense Wrappers Fail?

cs.CR · 2026-04-07 · unverdicted · novelty 8.0

No continuous utility-preserving input wrapper can eliminate all prompt injection risks in connected prompt spaces for language models.

Quality Is Not a Safety Proxy Under Quantization

cs.LG · 2026-06-08 · conditional · novelty 6.0

Across 51 quantized checkpoints, quality metrics fail to predict safety drops in 36 pairings and 10 hidden-danger cases, while a new RTSI screen routes all 10 dangerous rows to testing at matched bucket size.

Quantization Undoes Alignment: Bias Emergence in Compressed LLMs Across Models and Precision Levels

cs.LG · 2026-05-02 · conditional · novelty 6.0

3-bit quantization induces new stereotypical biases in 6-21% of previously unbiased BBQ items across three LLMs; perplexity, which rises by less than 3%, does not reveal the change, and the models' 'unknown' responses decline by 17.4%.

Are Large Language Models Economically Viable for Industry Deployment?

cs.CL · 2026-04-21 · unverdicted · novelty 6.0

Small LLMs under 2B parameters achieve better economic break-even, energy efficiency, and hardware density than larger models on legacy GPUs for industrial tasks.

Weight Pruning Amplifies Bias: A Multi-Method Study of Compressed LLMs for Edge AI

cs.LG · 2026-05-02 · conditional · novelty 5.0

Activation-aware pruning preserves perplexity but amplifies bias in LLMs, with 47-59% of previously neutral items developing new stereotypical responses at 70% sparsity.

From 2:4 to 8:16 sparsity patterns in LLMs for Outliers and Weights with Variance Correction

cs.LG · 2025-07-03 · unverdicted · novelty 5.0

8:16 sparsity with variance correction and outlier handling lets compressed LLMs match or exceed dense-model accuracy under fixed memory limits, outperforming the common 2:4 pattern in flexibility.

citing papers explorer

Quality Is Not a Safety Proxy Under Quantization · cs.LG · 2026-06-08 · conditional · none · ref 19
Quantization Undoes Alignment: Bias Emergence in Compressed LLMs Across Models and Precision Levels · cs.LG · 2026-05-02 · conditional · none · ref 18
Weight Pruning Amplifies Bias: A Multi-Method Study of Compressed LLMs for Edge AI · cs.LG · 2026-05-02 · conditional · none · ref 31
