Tail-Aware HiFloat4: W4A4 Post-Training Quantization for Wan2.2

Long Peng; Shuai Guo; Xin Di; Yang Cao; Zhanfeng Feng; Zhengjun Zha

arxiv: 2605.26628 · v1 · pith:KGJXFIWInew · submitted 2026-05-26 · 💻 cs.AI


classification 💻 cs.AI

keywords: hifloat4, quantization, wan2, calibration, modules, pipeline, post-training, tail-aware


This report describes Tail-Aware HiFloat4, our submission to the low-bit text-to-video generation quantization challenge. Our method adapts the public ViDiT-Q post-training quantization pipeline to Wan2.2 under the HiFloat4 numerical format. We quantize the main linear layers in both Wan2.2 transformer modules with W4A4 HiFloat4 fake quantization, keep numerically sensitive boundary modules in high precision, and introduce an activation-tail-aware percentile calibration module for channel-mask construction. Together with compact PTQ-state restoration, this design reduces the influence of rare calibration outliers while keeping the runtime HiFloat4 arithmetic and sampling pipeline unchanged.
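The abstract does not specify how the activation-tail-aware percentile calibration builds its channel mask, so the following is only a minimal sketch of the general idea, under stated assumptions: for each channel, compare the absolute maximum seen during calibration with a high percentile of the absolute values, and flag the channel when the maximum is far above that percentile. The function names, the 99th-percentile default, and the ratio threshold of 4 are all hypothetical choices for illustration; they are not taken from the report, and the HiFloat4 format itself is not modeled here.

```python
import math


def percentile_abs(values, q):
    """Nearest-rank q-th percentile (0 < q <= 100) of the absolute values."""
    mags = sorted(abs(v) for v in values)
    rank = max(1, math.ceil(q / 100.0 * len(mags)))
    return mags[rank - 1]


def tail_aware_channel_mask(channels, q=99.0, ratio=4.0):
    """Flag channels whose abs-max exceeds `ratio` times their q-th percentile.

    `q` and `ratio` are hypothetical defaults, not values from the report.
    """
    mask = []
    for values in channels:
        peak = max(abs(v) for v in values)
        typical = percentile_abs(values, q)
        mask.append(typical > 0 and peak > ratio * typical)
    return mask


# Two toy channels of 200 calibration samples each.
# The second channel is identical except for one rare outlier.
calm = [0.01 * (i % 50) for i in range(200)]
spiky = calm[:-1] + [40.0]

print(tail_aware_channel_mask([calm, spiky]))  # [False, True]
```

In this toy case an abs-max range estimate for the second channel would be set by the single outlier at 40.0, whereas the 99th-percentile estimate stays at 0.49 for both channels, which is the kind of outlier sensitivity the abstract says the calibration is meant to reduce.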
