AHCQ-SAM: Toward Accurate and Hardware-Compatible Post-Training Segment Anything Model Quantization
Pith reviewed 2026-05-23 01:43 UTC · model grok-4.3
The pith
AHCQ-SAM applies four targeted techniques to overcome specific quantization barriers in SAM, enabling higher-accuracy 4-bit models for segmentation.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that four challenges limit existing PTQ for SAM: ill-conditioned weights, skewed post-GELU activations, inter-channel variance in linear layers, and heterogeneous attention scores. The proposed Activation-aware Condition Number Reduction, Hybrid Log-Uniform Quantization, Channel-Aware Grouping, and Logarithmic Nonlinear Quantization are claimed to remove those limits together, producing 15.2 percent higher mAP on COCO for 4-bit SAM-B and 14.01 percent higher J&F on SA-V for 4-bit SAM2-Tiny, plus measured FPGA speed and power gains.
What carries the argument
The AHCQ-SAM PTQ framework built from four synergistic components that each target one listed quantization challenge in SAM's architecture.
If this is right
- 4-bit SAM-B with Faster R-CNN reaches 15.2 percent higher mAP on COCO than the previous best PTQ method.
- 4-bit SAM2-Tiny reaches 14.01 percent higher J&F on the SA-V Test set than prior methods.
- An FPGA implementation delivers 7.12 times speedup and 6.62 times better power efficiency versus the floating-point baseline.
- The work supplies the first reported PTQ benchmark numbers for SAM2 models.
Where Pith is reading between the lines
- The same component design could be tested on other large vision transformers that share similar activation and attention patterns.
- Direct integration of the channel-grouping and log-scale rules into hardware accelerators might reduce memory traffic further.
- If the accuracy holds at 4 bits, on-device zero-shot segmentation becomes feasible on mobile or embedded processors without retraining.
Load-bearing premise
The four listed quantization challenges are the dominant performance limiters for existing PTQ on SAM, and the four proposed components mitigate them without introducing offsetting accuracy or hardware costs.
What would settle it
Measuring 4-bit SAM-B with Faster R-CNN on the COCO dataset and finding that mAP does not exceed the prior SOTA method by a clear margin would falsify the accuracy claim.
Original abstract
The Segment Anything Model (SAM) has revolutionized image and video segmentation with its powerful zero-shot capabilities. However, its massive parameter scale and high computational demands hinder efficient deployment on resource-constrained edge devices. While Post-Training Quantization (PTQ) offers a practical solution, existing methods still fail to handle four critical quantization challenges: (1) ill-conditioned weights; (2) skewed and long-tailed post-GELU activations; (3) pronounced inter-channel variance in linear projections; and (4) exponentially scaled and heterogeneous attention scores. To mitigate these bottlenecks, we propose AHCQ-SAM, an accurate and hardware-compatible PTQ framework featuring four synergistic components: (1) Activation-aware Condition Number Reduction (ACNR), which regularizes weight matrices via a proximal point algorithm to suppress ill-conditioning; (2) Hybrid Log-Uniform Quantization (HLUQ), which combines power-of-two and uniform quantizers to capture skewed post-GELU activations; (3) Channel-Aware Grouping (CAG), which clusters channels with homogeneous statistics to achieve high accuracy with minimal hardware overhead; and (4) Logarithmic Nonlinear Quantization (LNQ), which utilizes logarithmic transformations to adaptively adjust quantization resolution for exponential and heterogeneous attention scores. Experimental results demonstrate that AHCQ-SAM outperforms current methods on SAM. Compared with the SOTA method, it achieves a 15.2% improvement in mAP for 4-bit SAM-B with Faster R-CNN on the COCO dataset. Furthermore, we establish a PTQ benchmark for SAM2, where AHCQ-SAM yields a 14.01% improvement in J&F for 4-bit SAM2-Tiny on the SA-V Test dataset. Finally, FPGA-based implementation validates the practical utility of AHCQ-SAM, delivering a 7.12x speedup and a 6.62x power efficiency improvement over the floating-point baseline.
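The abstract's third challenge, inter-channel variance, can be illustrated with a small sketch. It is not the paper's CAG: the equal-size grouping by sorted range, the function names, and the toy channel ranges are all assumptions chosen only to show why sharing one scale across channels with very different ranges loses small values.

```python
def quantize(x, scale, bits=4):
    """Symmetric uniform quantizer with a shared scale."""
    qmax = 2 ** (bits - 1) - 1
    q = min(max(round(x / scale), -qmax - 1), qmax)
    return q * scale

def group_channels(channel_absmax, num_groups):
    """Sort channel indices by range and cut them into equal-size groups.

    Assumption: this simple rule stands in for whatever clustering the
    paper uses; the source does not specify it.
    """
    order = sorted(range(len(channel_absmax)), key=lambda c: channel_absmax[c])
    size = -(-len(order) // num_groups)  # ceiling division
    return [order[i:i + size] for i in range(0, len(order), size)]

def group_scales(channel_absmax, groups, bits=4):
    """One scale per group, set by the largest range inside the group."""
    qmax = 2 ** (bits - 1) - 1
    return [max(channel_absmax[c] for c in g) / qmax for g in groups]

channel_absmax = [0.1, 8.0, 0.12, 7.5]  # made-up per-channel ranges
groups = group_channels(channel_absmax, num_groups=2)
scales = group_scales(channel_absmax, groups)
per_tensor_scale = max(channel_absmax) / 7

x = 0.1  # a value from narrow channel 0
print("per-tensor:", quantize(x, per_tensor_scale))
print("grouped:   ", quantize(x, scales[0]))
```

With a single per-tensor scale the narrow channel's value rounds to zero, while the group scale keeps it close to its original magnitude. The hardware cost is one stored scale per group rather than one per channel.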
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents AHCQ-SAM, a post-training quantization (PTQ) framework for SAM and SAM2 that identifies four quantization challenges (ill-conditioned weights, skewed post-GELU activations, inter-channel variance, heterogeneous attention scores) and proposes four components (ACNR via proximal point algorithm, HLUQ combining power-of-two and uniform quantizers, CAG for homogeneous channel clustering, LNQ with logarithmic transformations) to address them. It reports 15.2% mAP gain for 4-bit SAM-B on COCO with Faster R-CNN and 14.01% J&F gain for 4-bit SAM2-Tiny on SA-V Test, plus 7.12x FPGA speedup.
Significance. If substantiated, the work would advance practical PTQ for large vision transformers by targeting SAM-specific issues and establishing a SAM2 PTQ benchmark. The FPGA implementation provides a concrete hardware demonstration. Strengths include the attempt to link specific model properties (condition numbers, activation tails, attention heterogeneity) to quantization design.
major comments (3)
- [Method (ACNR)] Method section (ACNR): no pre/post condition-number diagnostics or matrix-norm measurements are supplied to verify that the proximal-point regularization actually suppresses ill-conditioning rather than merely altering the loss landscape.
- [Experiments] Experiments section: ablation tables isolating each of ACNR/HLUQ/CAG/LNQ versus a common PTQ baseline are absent, so the 15.2% mAP and 14.01% J&F gains cannot be attributed to the claimed mechanisms versus calibration-set choice or hyper-parameter search.
- [Hardware evaluation] Hardware evaluation: the single FPGA result reports 7.12x speedup and 6.62x power efficiency but provides no cycle-accurate or LUT/BRAM overhead figures for the logarithmic and grouping operations introduced by HLUQ and CAG.
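The diagnostic requested in the first major comment is cheap to compute. The sketch below shows one way to report it, assuming NumPy is available; `conditioning_report` is a hypothetical helper, and the two diagonal matrices are toy stand-ins, not weights produced by ACNR.

```python
import numpy as np

def conditioning_report(weight):
    """Return spectral norm, smallest singular value, and 2-norm condition number."""
    s = np.linalg.svd(np.asarray(weight, dtype=np.float64), compute_uv=False)
    return {
        "sigma_max": float(s[0]),
        "sigma_min": float(s[-1]),
        "cond": float(s[0] / s[-1]),  # undefined for a singular matrix
    }

# Toy stand-ins for one layer's weights before and after a conditioning step.
w_before = np.diag([100.0, 1.0, 0.01])
w_after = np.diag([10.0, 1.0, 0.1])  # hypothetical; not produced by ACNR

before = conditioning_report(w_before)
after = conditioning_report(w_after)
print(f"cond before: {before['cond']:.1f}, after: {after['cond']:.1f}")
```

Reporting `sigma_max` and `sigma_min` alongside the ratio shows whether a drop in condition number comes from shrinking large singular values, lifting small ones, or both.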
minor comments (2)
- [Introduction] Abstract and §1: the four challenges are asserted as dominant without a supporting citation or preliminary measurement showing they exceed other known PTQ error sources for SAM.
- [Method (HLUQ/LNQ)] Notation in HLUQ and LNQ: the hybrid quantizer and logarithmic mapping lack explicit equations defining the scale factors and breakpoints, hindering reproducibility.
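The point about missing scale factors and breakpoints can be made concrete with the two textbook quantizers a hybrid scheme would draw on. The sketch below is a generic illustration, not the paper's HLUQ or LNQ: the function names, the unsigned range, and the use of `x_max` as the anchor for the power-of-two levels are all assumptions.

```python
import math

def uniform_quantize(x, scale, bits=4):
    """Unsigned uniform quantizer: levels are k * scale for k in [0, 2^bits - 1]."""
    qmax = 2 ** bits - 1
    q = min(max(round(x / scale), 0), qmax)
    return q * scale

def log2_quantize(x, x_max, bits=4):
    """Power-of-two quantizer for x in (0, x_max]: levels are x_max * 2^(-k)."""
    if x <= 0:
        return 0.0
    kmax = 2 ** bits - 1
    k = min(max(round(-math.log2(x / x_max)), 0), kmax)
    return x_max * 2.0 ** (-k)

x_max, bits = 1.0, 4
scale = x_max / (2 ** bits - 1)
for x in (0.001, 0.03, 0.9):
    print(x, uniform_quantize(x, scale, bits), log2_quantize(x, x_max, bits))
```

The uniform quantizer sends the smallest input to zero, while the power-of-two quantizer keeps it but has only coarse levels near `x_max`. That trade-off is the usual motivation for combining the two, and a hybrid scheme still has to state where it switches between them, which is the breakpoint the comment asks for.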
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. We address each major comment below and indicate planned revisions.
- Referee: [Method (ACNR)] Method section (ACNR): no pre/post condition-number diagnostics or matrix-norm measurements are supplied to verify that the proximal-point regularization actually suppresses ill-conditioning rather than merely altering the loss landscape.
  Authors: We agree that direct pre- and post-ACNR condition number and matrix norm measurements would strengthen validation of the proximal point algorithm's effect. The ACNR component is motivated by the ill-conditioning analysis in Section 3.1. In the revised manuscript we will add these diagnostics for representative weight matrices to confirm suppression of ill-conditioning.
  revision: yes
- Referee: [Experiments] Experiments section: ablation tables isolating each of ACNR/HLUQ/CAG/LNQ versus a common PTQ baseline are absent, so the 15.2% mAP and 14.01% J&F gains cannot be attributed to the claimed mechanisms versus calibration-set choice or hyper-parameter search.
  Authors: We acknowledge that component-wise ablations are needed to isolate contributions. While overall comparisons to prior PTQ methods are provided, we will add an ablation table in the revised manuscript that evaluates each technique (ACNR, HLUQ, CAG, LNQ) individually against a shared baseline to clarify attribution of the reported gains.
  revision: yes
- Referee: [Hardware evaluation] Hardware evaluation: the single FPGA result reports 7.12x speedup and 6.62x power efficiency but provides no cycle-accurate or LUT/BRAM overhead figures for the logarithmic and grouping operations introduced by HLUQ and CAG.
  Authors: We recognize that detailed overhead metrics for the logarithmic and grouping operations would better demonstrate hardware compatibility. Our FPGA results report end-to-end gains; we will add available LUT/BRAM utilization data for these operations in the revision, though cycle-accurate per-operation breakdowns may be limited by our current implementation setup.
  revision: partial
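The ablation discussed in the second response amounts to toggling the four components against a shared baseline. A minimal sketch of enumerating those configurations follows; the dictionary layout and the `ablation_configs` name are hypothetical, and evaluating each configuration is left out because the source gives no pipeline details.

```python
from itertools import combinations

COMPONENTS = ("ACNR", "HLUQ", "CAG", "LNQ")

def ablation_configs(components=COMPONENTS):
    """Yield every on/off combination, from the shared baseline to the full method."""
    for k in range(len(components) + 1):
        for enabled in combinations(components, k):
            yield {name: name in enabled for name in components}

configs = list(ablation_configs())
single = [c for c in configs if sum(c.values()) == 1]
print(len(configs), "configurations,", len(single), "single-component runs")
```

The single-component rows answer the attribution question directly. The remaining combinations would show whether the components interact, provided the calibration set and hyper-parameters are held fixed across rows.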
Circularity Check
No significant circularity; empirical results are independently measured
The paper identifies four quantization challenges and proposes four corresponding modules (ACNR, HLUQ, CAG, LNQ) to address them, then reports measured accuracy gains on COCO and SA-V datasets. No equations appear in the provided text that would define any reported prediction or improvement as equivalent to a fitted parameter or input by construction. No self-citations are invoked to justify uniqueness theorems, ansatzes, or load-bearing premises. The central claims rest on external benchmark comparisons rather than reducing to self-referential definitions or statistical forcing from the same calibration data. This qualifies as a self-contained empirical contribution with no detectable circular steps.
Forward citations
Cited by 1 Pith paper
-
When W4A4 Breaks Camouflaged Object Detection: Token-Group Dual-Constraint Activation Quantization
COD-TDQ uses token-group scaling and dual-constraint projection to fix 4-bit activation quantization for camouflaged object detection, delivering more than 0.12 higher Sα scores than prior methods on four benchmarks w...