Li et al. Efficiency Meets Fidelity: A Novel Quantization Framework for Stable Diffusion

· 2025 · arXiv 2412.06661

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Holding the FP8 Quality Ceiling at 8-Bit Weights and Activations: INT8 and GGUF Post-Training Quantization of Ideogram 4.0 for Consumer GPUs

cs.LG · 2026-06-10 · unverdicted · novelty 4.0

INT8 W8A8 post-training quantization of Ideogram 4.0 preserves FP8 quality on a 200-prompt benchmark while outperforming NF4 on CLIP score and offering a favorable quality-memory trade-off via GGUF Q4_K.

Smaller Models, Unexpected Costs: Trade-offs in LLM Quantization for Automated Program Repair

cs.SE · 2026-06-25 · unverdicted · novelty 3.0

Empirical evaluation of 13 quantization configurations on 6 LLMs for APR shows reduced memory (up to 85%) but increased inference time/energy, different repaired problem sets with little overlap, and 48% of configs strictly dominated.

citing papers explorer

Showing 2 of 2 citing papers.

Holding the FP8 Quality Ceiling at 8-Bit Weights and Activations: INT8 and GGUF Post-Training Quantization of Ideogram 4.0 for Consumer GPUs · cs.LG · 2026-06-10 · unverdicted · none · ref 11
Smaller Models, Unexpected Costs: Trade-offs in LLM Quantization for Automated Program Repair · cs.SE · 2026-06-25 · unverdicted · none · ref 28
