{"paper":{"title":"A Simple and Effective Pruning Approach for Large Language Models","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"Wanda prunes large language models by removing weights whose magnitudes times input activations are smallest, with no retraining required.","cross_cats":["cs.AI","cs.LG"],"primary_cat":"cs.CL","authors_text":"Anna Bair, J. Zico Kolter, Mingjie Sun, Zhuang Liu","submitted_at":"2023-06-20T17:18:20Z","abstract_excerpt":"As their size increases, Large Languages Models (LLMs) are natural candidates for network pruning methods: approaches that drop a subset of network weights while striving to preserve performance. Existing methods, however, require either retraining, which is rarely affordable for billion-scale LLMs, or solving a weight reconstruction problem reliant on second-order information, which may also be computationally expensive. In this paper, we introduce a novel, straightforward yet effective pruning method, termed Wanda (Pruning by Weights and activations), designed to induce sparsity in pretraine"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"Wanda significantly outperforms the established baseline of magnitude pruning and performs competitively against recent methods involving intensive weight update on LLaMA and LLaMA-2 across various language benchmarks, with no retraining or weight update required.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"The central assumption is that the product of weight magnitude and input activation reliably identifies weights whose removal will least affect model performance, motivated by emergent large-magnitude features but without a formal proof that this criterion is optimal or generalizes beyond the tested models.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"Wanda prunes pretrained LLMs by dropping weights with smallest magnitude times activation values per output channel, with no retraining needed and better results than magnitude pruning on language benchmarks.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"Wanda prunes large language models by removing weights whose magnitudes times input activations are smallest, with no retraining required.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"71016ab615440db7754d7869f62659141955fc8c09bbc5ce2165b455abfe444f"},"source":{"id":"2306.11695","kind":"arxiv","version":3},"verdict":{"id":"7397a1c0-a27e-43f8-ac96-b7dd12a0aad6","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-16T16:08:25.620116Z","strongest_claim":"Wanda significantly outperforms the established baseline of magnitude pruning and performs competitively against recent methods involving intensive weight update on LLaMA and LLaMA-2 across various language benchmarks, with no retraining or weight update required.","one_line_summary":"Wanda prunes pretrained LLMs by dropping weights with smallest magnitude times activation values per output channel, with no retraining needed and better results than magnitude pruning on language benchmarks.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"The central assumption is that the product of weight magnitude and input activation reliably identifies weights whose removal will least affect model performance, motivated by emergent large-magnitude features but without a formal proof that this criterion is optimal or generalizes beyond the tested models.","pith_extraction_headline":"Wanda prunes large language models by removing weights whose magnitudes times input activations are smallest, with no retraining required."},"references":{"count":0,"sample":[],"resolved_work":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57","internal_anchors":0},"formal_canon":{"evidence_count":2,"snapshot_sha256":"068cdd3ac91ece8e10f3c596f5b95a706ee3ae5f90a5009f3368649bfbbe5656"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}