INT8 W8A8 post-training quantization of Ideogram 4.0 preserves FP8 quality on a 200-prompt benchmark while outperforming NF4 on CLIP score and offering a favorable quality-memory trade-off via GGUF Q4_K.
Li et al.Efficiency Meets Fidelity: A Novel Quantization Framework for Stable Diffusion
2 Pith papers cite this work. Polarity classification is still indexing.
years
2026 2verdicts
UNVERDICTED 2representative citing papers
Empirical evaluation of 13 quantization configurations on 6 LLMs for APR shows reduced memory (up to 85%) but increased inference time/energy, different repaired problem sets with little overlap, and 48% of configs strictly dominated.
citing papers explorer
-
Holding the FP8 Quality Ceiling at 8-Bit Weights and Activations: INT8 and GGUF Post-Training Quantization of Ideogram 4.0 for Consumer GPUs
INT8 W8A8 post-training quantization of Ideogram 4.0 preserves FP8 quality on a 200-prompt benchmark while outperforming NF4 on CLIP score and offering a favorable quality-memory trade-off via GGUF Q4_K.
-
Smaller Models, Unexpected Costs: Trade-offs in LLM Quantization for Automated Program Repair
Empirical evaluation of 13 quantization configurations on 6 LLMs for APR shows reduced memory (up to 85%) but increased inference time/energy, different repaired problem sets with little overlap, and 48% of configs strictly dominated.