International conference on machine learning , pages=

Smoothquant: Accurate, efficient post-training quantization for large language models , author= · 2023

5 Pith papers cite this work. Polarity classification is still indexing.

5 Pith papers citing it

browse 5 citing papers

representative citing papers

Knowledge Beyond Language: Bridging the Gap in Multilingual Machine Unlearning Evaluation

cs.CL · 2026-05-14 · unverdicted · novelty 7.0

New metrics KSS and KPS are introduced to evaluate multilingual machine unlearning quality and cross-language consistency in LLMs, addressing limitations of single-language evaluation protocols.

Search Your Block Floating Point Scales!

cs.LG · 2026-05-12 · unverdicted · novelty 6.0

ScaleSearch optimizes block floating point scales via fine-grained search to cut quantization error by 27% for NVFP4, improving PTQ by up to 15 points on MATH500 for Qwen3-8B and attention PPL by 0.77 on Llama 3.1 70B.

Sharpness-Aware Pretraining Mitigates Catastrophic Forgetting

cs.LG · 2026-05-04 · unverdicted · novelty 6.0

Sharpness-aware pretraining and related flat-minima interventions reduce catastrophic forgetting by up to 80% after post-training across 20M-150M models and by 31-40% at 1B scale.

DASH-KV: Accelerating Long-Context LLM Inference via Asymmetric KV Cache Hashing

cs.CL · 2026-04-21 · unverdicted · novelty 6.0

DASH-KV accelerates long-context LLM inference to linear complexity via asymmetric KV cache hashing and mixed-precision retention, matching full attention performance on LongBench.

GAMMA: Global Bit Allocation for Mixed-Precision Models under Arbitrary Budgets

cs.LG · 2026-05-18 · unverdicted · novelty 5.0

GAMMA is a post-training framework that learns stable module sensitivity rankings for mixed-precision LLM quantization and projects them to exact bit budgets via integer programming, enabling reuse across arbitrary memory targets.

citing papers explorer

Showing 5 of 5 citing papers.

Knowledge Beyond Language: Bridging the Gap in Multilingual Machine Unlearning Evaluation cs.CL · 2026-05-14 · unverdicted · none · ref 36
New metrics KSS and KPS are introduced to evaluate multilingual machine unlearning quality and cross-language consistency in LLMs, addressing limitations of single-language evaluation protocols.
Search Your Block Floating Point Scales! cs.LG · 2026-05-12 · unverdicted · none · ref 52
ScaleSearch optimizes block floating point scales via fine-grained search to cut quantization error by 27% for NVFP4, improving PTQ by up to 15 points on MATH500 for Qwen3-8B and attention PPL by 0.77 on Llama 3.1 70B.
Sharpness-Aware Pretraining Mitigates Catastrophic Forgetting cs.LG · 2026-05-04 · unverdicted · none · ref 81
Sharpness-aware pretraining and related flat-minima interventions reduce catastrophic forgetting by up to 80% after post-training across 20M-150M models and by 31-40% at 1B scale.
DASH-KV: Accelerating Long-Context LLM Inference via Asymmetric KV Cache Hashing cs.CL · 2026-04-21 · unverdicted · none · ref 44
DASH-KV accelerates long-context LLM inference to linear complexity via asymmetric KV cache hashing and mixed-precision retention, matching full attention performance on LongBench.
GAMMA: Global Bit Allocation for Mixed-Precision Models under Arbitrary Budgets cs.LG · 2026-05-18 · unverdicted · none · ref 27
GAMMA is a post-training framework that learns stable module sensitivity rankings for mixed-precision LLM quantization and projects them to exact bit budgets via integer programming, enabling reuse across arbitrary memory targets.

International conference on machine learning , pages=

fields

years

verdicts

representative citing papers

citing papers explorer