AHCQ-SAM: Toward Accurate and Hardware-Compatible Post-Training Segment Anything Model Quantization

Kentaro Yoshioka; Shengchuan Zhang; Shimpei Ando; Weiqi Yan; Wenlun Zhang; Yunshan Zhong

arXiv: 2503.03088 · v4 · submitted 2025-03-05 · cs.CV · cs.AR · cs.LG

Wenlun Zhang, Yunshan Zhong, Weiqi Yan, Shengchuan Zhang, Shimpei Ando, Kentaro Yoshioka

Pith reviewed 2026-05-23 01:43 UTC · model grok-4.3

classification cs.CV · cs.AR · cs.LG

keywords post-training quantization · Segment Anything Model · SAM · 4-bit quantization · hardware-compatible quantization · model compression · FPGA implementation · edge deployment

The pith

AHCQ-SAM applies four targeted techniques to overcome specific quantization barriers in SAM, enabling higher-accuracy 4-bit models for segmentation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper seeks to demonstrate that post-training quantization of the Segment Anything Model can be made accurate and hardware-friendly by directly addressing four distinct problems in its weights and activations. It introduces a framework with four matching components that regularize weights, handle skewed values, group channels, and rescale attention scores. If these steps work as described, 4-bit versions of SAM and SAM2 would run with substantially less accuracy loss than prior methods while fitting edge hardware constraints. The reported gains appear on standard detection and segmentation benchmarks plus an FPGA test. This would matter for moving large zero-shot segmentation models from cloud servers to local devices.
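To make "4-bit post-training quantization" concrete, the following is a minimal NumPy sketch of plain asymmetric uniform quantization with a min/max calibrated range, the kind of generic baseline the paper's components are meant to improve on. It is not AHCQ-SAM; the function name `uniform_fake_quant` and the random test matrix are illustrative assumptions.

```python
import numpy as np

def uniform_fake_quant(x, n_bits=4):
    """Asymmetric uniform quantize-dequantize over the min/max range of x."""
    qmax = 2 ** n_bits - 1                      # largest integer code (15 for 4 bits)
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) / qmax if hi > lo else 1.0
    q = np.clip(np.round((x - lo) / scale), 0, qmax)   # integer codes in [0, qmax]
    return q * scale + lo                        # map codes back to real values

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)   # stand-in for a weight matrix
w_hat = uniform_fake_quant(w, n_bits=4)

max_err = float(np.abs(w - w_hat).max())   # worst-case rounding error
n_levels = int(np.unique(w_hat).size)      # number of distinct values after quantization
```

With 4 bits there are at most 16 representable values, so the rounding error is bounded by half a step of the calibrated range. Any outlier that widens the range therefore coarsens the grid for every other value.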

Core claim

The central claim is that the four challenges—ill-conditioned weights, skewed post-GELU activations, inter-channel variance in linear layers, and heterogeneous attention scores—limit existing PTQ for SAM, and that the proposed Activation-aware Condition Number Reduction, Hybrid Log-Uniform Quantization, Channel-Aware Grouping, and Logarithmic Nonlinear Quantization together remove those limits, producing 15.2 percent higher mAP on COCO for 4-bit SAM-B and 14.01 percent higher J&F on SA-V for 4-bit SAM2-Tiny, plus measured FPGA speed and power gains.
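The "skewed post-GELU activations" challenge can be illustrated numerically. The sketch below applies the exact GELU to synthetic Gaussian pre-activations (an assumption; real SAM activations will differ) and compares how many values are negative with how much of the value range is negative. The names `gelu`, `pre`, and `post` are illustrative only.

```python
import math
import numpy as np

def gelu(x):
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    erf = np.vectorize(math.erf)
    return 0.5 * x * (1.0 + erf(x / math.sqrt(2.0)))

rng = np.random.default_rng(0)
pre = rng.normal(scale=2.0, size=100_000)   # synthetic pre-activations (assumption)
post = gelu(pre)

lo, hi = float(post.min()), float(post.max())
frac_negative_values = float(np.mean(post < 0))   # share of activations below zero
frac_negative_range = -lo / (hi - lo)             # share of the value range below zero
step_4bit = (hi - lo) / 15                        # step of a 4-bit uniform grid on [lo, hi]
```

GELU is bounded below at roughly -0.17, so in this synthetic setting about half of the activations fall into a few percent of the range, and a 4-bit uniform step is wider than the whole negative region. That is the kind of resolution mismatch a non-uniform quantizer is intended to address.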

What carries the argument

The AHCQ-SAM PTQ framework built from four synergistic components that each target one listed quantization challenge in SAM's architecture.

If this is right

4-bit SAM-B with Faster R-CNN reaches 15.2 percent higher mAP on COCO than the previous best PTQ method.
4-bit SAM2-Tiny reaches 14.01 percent higher J&F on the SA-V Test set than prior methods.
An FPGA implementation delivers 7.12 times speedup and 6.62 times better power efficiency versus the floating-point baseline.
The work supplies the first reported PTQ benchmark numbers for SAM2 models.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same component design could be tested on other large vision transformers that share similar activation and attention patterns.
Direct integration of the channel-grouping and log-scale rules into hardware accelerators might reduce memory traffic further.
If the accuracy holds at 4 bits, on-device zero-shot segmentation becomes feasible on mobile or embedded processors without retraining.
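As a rough illustration of the channel-grouping idea referred to above, the sketch below clusters channels by the log of their absolute maximum and shares one quantization scale per group. This is a generic 1-D k-means stand-in, not the paper's CAG procedure; `group_channels`, `fake_quant`, the group count of 4, and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def group_channels(x, n_groups=4, n_iter=20, n_bits=4):
    """Cluster channels of x (tokens, channels) by log2 of their absolute maximum."""
    ch_max = np.abs(x).max(axis=0) + 1e-12
    feat = np.log2(ch_max)
    # Initialise centroids at evenly spaced quantiles of the feature.
    centers = np.quantile(feat, (np.arange(n_groups) + 0.5) / n_groups)
    for _ in range(n_iter):
        labels = np.argmin(np.abs(feat[:, None] - centers[None, :]), axis=1)
        for g in range(n_groups):
            if np.any(labels == g):
                centers[g] = feat[labels == g].mean()
    qmax = 2 ** (n_bits - 1) - 1
    # One shared scale per group, sized so the largest channel in the group does not clip.
    scales = np.array([ch_max[labels == g].max() if np.any(labels == g) else 1.0
                       for g in range(n_groups)]) / qmax
    return labels, scales

def fake_quant(x, scale, n_bits=4):
    """Symmetric signed quantize-dequantize; scale may be a scalar or per-channel."""
    qmin, qmax = -(2 ** (n_bits - 1)), 2 ** (n_bits - 1) - 1
    return np.clip(np.round(x / scale), qmin, qmax) * scale

rng = np.random.default_rng(0)
channel_std = 2.0 ** rng.uniform(-4, 4, size=128)   # synthetic inter-channel variance
x = rng.normal(size=(512, 128)) * channel_std

labels, scales = group_channels(x, n_groups=4)
err_grouped = float(np.mean((x - fake_quant(x, scales[labels])) ** 2))
err_tensor = float(np.mean((x - fake_quant(x, np.abs(x).max() / 7.0)) ** 2))
```

Only `n_groups` scales need to be stored instead of one per channel, which is the hardware argument for grouping. On this synthetic input the grouped error is lower than the per-tensor error, but how the trade-off behaves on SAM is a question for the paper's own experiments.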

Load-bearing premise

That the four listed quantization challenges are the dominant performance limiters for existing PTQ on SAM, and that the four proposed components mitigate them without introducing offsetting accuracy or hardware costs.

What would settle it

Measuring 4-bit SAM-B with Faster R-CNN on the COCO dataset and finding that mAP does not exceed the prior SOTA method by a clear margin would falsify the accuracy claim.

Figures

Figures reproduced from arXiv: 2503.03088 by Kentaro Yoshioka, Shengchuan Zhang, Shimpei Ando, Weiqi Yan, Wenlun Zhang, Yunshan Zhong.

**Figure 2.** AHCPTQ framework: HLUQ refines quantization resolution for post-GELU activations, while CAG effectively groups parame…

**Figure 3.** Cosine similarity of normalized quantization parameter…

**Figure 4.** Hardware cost analysis of the linear projection layer in…

**Figure 5.** Ablation study comparing the effectiveness of HLUQ…

**Figure 6.** Dependence of SAM performance on group number in…

**Figure 7.** Range distribution and cosine similarity of QKV projec…

**Figure 8.** Range distribution and cosine similarity of Linear-1 ac…

**Figure 10.** Range distribution and cosine similarity of pre…

**Figure 11.** Range distribution and cosine similarity of pre…

**Figure 13.** Overview of the evaluation system and the accelerator…

**Figure 14.** FPGA validation environment.

**Figure 15.** Qualitative comparison of segmentation masks generated by different quantization methods on SAM-B with YOLOX. Our…

Original abstract

The Segment Anything Model (SAM) has revolutionized image and video segmentation with its powerful zero-shot capabilities. However, its massive parameter scale and high computational demands hinder efficient deployment on resource-constrained edge devices. While Post-Training Quantization (PTQ) offers a practical solution, existing methods still fail to handle four critical quantization challenges: (1) ill-conditioned weights; (2) skewed and long-tailed post-GELU activations; (3) pronounced inter-channel variance in linear projections; and (4) exponentially scaled and heterogeneous attention scores. To mitigate these bottlenecks, we propose AHCQ-SAM, an accurate and hardware-compatible PTQ framework featuring four synergistic components: (1) Activation-aware Condition Number Reduction (ACNR), which regularizes weight matrices via a proximal point algorithm to suppress ill-conditioning; (2) Hybrid Log-Uniform Quantization (HLUQ), which combines power-of-two and uniform quantizers to capture skewed post-GELU activations; (3) Channel-Aware Grouping (CAG), which clusters channels with homogeneous statistics to achieve high accuracy with minimal hardware overhead; and (4) Logarithmic Nonlinear Quantization (LNQ), which utilizes logarithmic transformations to adaptively adjust quantization resolution for exponential and heterogeneous attention scores. Experimental results demonstrate that AHCQ-SAM outperforms current methods on SAM. Compared with the SOTA method, it achieves a 15.2% improvement in mAP for 4-bit SAM-B with Faster R-CNN on the COCO dataset. Furthermore, we establish a PTQ benchmark for SAM2, where AHCQ-SAM yields a 14.01% improvement in J&F for 4-bit SAM2-Tiny on the SA-V Test dataset. Finally, FPGA-based implementation validates the practical utility of AHCQ-SAM, delivering a 7.12x speedup and a 6.62x power efficiency improvement over the floating-point baseline.
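The motivation for a logarithmic quantizer on attention scores can be seen in a small experiment. The sketch below compares a 4-bit uniform grid with a generic power-of-two (log2) grid on synthetic softmax outputs. It is a textbook log2 quantizer rather than the paper's LNQ; `uniform_quant01`, `log2_quant01`, and the random logits are assumptions for illustration.

```python
import numpy as np

def uniform_quant01(p, n_bits=4):
    """Uniform quantizer on [0, 1] with 2**n_bits levels."""
    qmax = 2 ** n_bits - 1
    return np.round(p * qmax) / qmax

def log2_quant01(p, n_bits=4):
    """Power-of-two quantizer on (0, 1]: p maps to 2**(-k), k in [0, 2**n_bits - 1]."""
    qmax = 2 ** n_bits - 1
    k = np.clip(np.round(-np.log2(np.maximum(p, 1e-30))), 0, qmax)
    return 2.0 ** (-k)

rng = np.random.default_rng(0)
logits = rng.normal(scale=3.0, size=(8, 256))        # synthetic attention logits
logits -= logits.max(axis=-1, keepdims=True)         # numerically stable softmax
attn = np.exp(logits)
attn /= attn.sum(axis=-1, keepdims=True)

small = attn < 1.0 / 30      # entries that the 4-bit uniform grid rounds down to zero
zeroed_uniform = float(np.mean(uniform_quant01(attn)[small] == 0.0))
zeroed_log2 = float(np.mean(log2_quant01(attn)[small] == 0.0))
```

With 256 keys per row, most softmax probabilities are far below 1/30, so the uniform grid erases them entirely while the log2 grid keeps them nonzero (clamped at 2^-15 at the low end). Dequantizing a power-of-two code is a bit shift, which is one reason such grids are often described as hardware-friendly.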

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

AHCQ-SAM offers four targeted quantization modules for SAM but the evidence linking them to the reported gains is weak without ablations or diagnostics.

The main thing here is that the paper presents a new PTQ framework for SAM with four components targeting specific quantization challenges, claiming substantial accuracy improvements and hardware speedups, but the connection between the proposed fixes and the results is not clearly established.

What is actually new is the set of four modules: Activation-aware Condition Number Reduction for weights, Hybrid Log-Uniform Quantization for activations, Channel-Aware Grouping for linear projections, and Logarithmic Nonlinear Quantization for attention scores. These are designed around SAM's statistics rather than generic approaches.

The paper does well in applying this to both the original SAM and SAM2, reporting results on COCO with Faster R-CNN and on the SA-V dataset. The inclusion of FPGA results for speedup and power efficiency is useful for showing practical impact.

The soft spots are the missing pieces in the validation. The abstract and description do not include ablations that isolate each component or direct measurements like condition numbers or activation distributions to verify the problems and solutions. Without those, it is difficult to rule out that the gains come from implementation details or data choices instead of the claimed mechanisms. The hardware claims also lack detail on any added costs from the logarithmic operations.

This paper is for engineers and researchers working on quantizing large vision models for edge and mobile use. Readers who need ideas for handling skewed activations or attention in transformers could find the components interesting. It deserves a serious referee because the performance numbers are large and the hardware angle makes it relevant for real applications. I recommend sending it to peer review with a request for ablations and more diagnostic results to strengthen the central argument.

Referee Report

3 major / 2 minor

Summary. The paper presents AHCQ-SAM, a post-training quantization (PTQ) framework for SAM and SAM2 that identifies four quantization challenges (ill-conditioned weights, skewed post-GELU activations, inter-channel variance, heterogeneous attention scores) and proposes four components (ACNR via proximal point algorithm, HLUQ combining power-of-two and uniform quantizers, CAG for homogeneous channel clustering, LNQ with logarithmic transformations) to address them. It reports 15.2% mAP gain for 4-bit SAM-B on COCO with Faster R-CNN and 14.01% J&F gain for 4-bit SAM2-Tiny on SA-V Test, plus 7.12x FPGA speedup.

Significance. If substantiated, the work would advance practical PTQ for large vision transformers by targeting SAM-specific issues and establishing a SAM2 PTQ benchmark. The FPGA implementation provides a concrete hardware demonstration. Strengths include the attempt to link specific model properties (condition numbers, activation tails, attention heterogeneity) to quantization design.

major comments (3)

[Method (ACNR)] Method section (ACNR): no pre/post condition-number diagnostics or matrix-norm measurements are supplied to verify that the proximal-point regularization actually suppresses ill-conditioning rather than merely altering the loss landscape.
[Experiments] Experiments section: ablation tables isolating each of ACNR/HLUQ/CAG/LNQ versus a common PTQ baseline are absent, so the 15.2% mAP and 14.01% J&F gains cannot be attributed to the claimed mechanisms versus calibration-set choice or hyper-parameter search.
[Hardware evaluation] Hardware evaluation: the single FPGA result reports 7.12x speedup and 6.62x power efficiency but provides no cycle-accurate or LUT/BRAM overhead figures for the logarithmic and grouping operations introduced by HLUQ and CAG.
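The pre/post condition-number diagnostic requested in the first major comment is cheap to compute. The sketch below measures the 2-norm condition number of a synthetic ill-conditioned matrix before and after a placeholder conditioning step. The placeholder (flooring small singular values) is explicitly not ACNR, whose proximal-point details are not given on this page; `condition_number`, `floor_singular_values`, and the synthetic matrix are assumptions.

```python
import numpy as np

def condition_number(w):
    """2-norm condition number: largest over smallest singular value."""
    s = np.linalg.svd(w, compute_uv=False)
    return float(s[0] / s[-1])

def floor_singular_values(w, ratio=1e-2):
    """Placeholder conditioner (NOT ACNR): raise singular values below ratio * s_max."""
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    s = np.maximum(s, ratio * s[0])
    return (u * s) @ vt          # equals u @ diag(s) @ vt

rng = np.random.default_rng(0)
# Synthetic ill-conditioned weight: singular values spanning six orders of magnitude.
u, _ = np.linalg.qr(rng.normal(size=(64, 64)))
v, _ = np.linalg.qr(rng.normal(size=(64, 64)))
w = (u * np.logspace(0, -6, 64)) @ v.T

kappa_before = condition_number(w)
w_reg = floor_singular_values(w)
kappa_after = condition_number(w_reg)
rel_change = float(np.linalg.norm(w_reg - w) / np.linalg.norm(w))   # Frobenius norm
```

Reporting `kappa_before`, `kappa_after`, and `rel_change` per layer would show whether a regularizer lowers the condition number and how far it moves the weights in doing so, which is the evidence the comment asks for.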

minor comments (2)

[Introduction] Abstract and §1: the four challenges are asserted as dominant without a supporting citation or preliminary measurement showing they exceed other known PTQ error sources for SAM.
[Method (HLUQ/LNQ)] Notation in HLUQ and LNQ: the hybrid quantizer and logarithmic mapping lack explicit equations defining the scale factors and breakpoints, hindering reproducibility.
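For orientation on the notation issue raised in the last minor comment, the standard textbook forms of a b-bit uniform quantizer and a log2 quantizer are given below. These are generic definitions, not the paper's HLUQ or LNQ equations; the symbols $s$ (scale), $z$ (zero point), and $b$ (bit width) are assumptions, and $\lfloor\cdot\rceil$ denotes rounding to the nearest integer.

```latex
% Uniform quantizer with scale s and zero point z
x_q = \operatorname{clip}\!\left(\left\lfloor \frac{x}{s} \right\rceil + z,\; 0,\; 2^{b}-1\right),
\qquad
\hat{x} = s\,(x_q - z)

% Log2 quantizer for x > 0 with scale s
x_q = \operatorname{clip}\!\left(\left\lfloor -\log_2 \frac{x}{s} \right\rceil,\; 0,\; 2^{b}-1\right),
\qquad
\hat{x} = s \cdot 2^{-x_q}
```

A hybrid scheme would additionally need a breakpoint that decides which of the two rules applies to a given value, and that breakpoint is the kind of parameter the comment asks the authors to state explicitly.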

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment below and indicate planned revisions.

Referee: [Method (ACNR)] Method section (ACNR): no pre/post condition-number diagnostics or matrix-norm measurements are supplied to verify that the proximal-point regularization actually suppresses ill-conditioning rather than merely altering the loss landscape.

Authors: We agree that direct pre- and post-ACNR condition number and matrix norm measurements would strengthen validation of the proximal point algorithm's effect. The ACNR component is motivated by the ill-conditioning analysis in Section 3.1. In the revised manuscript we will add these diagnostics for representative weight matrices to confirm suppression of ill-conditioning.

revision: yes

Referee: [Experiments] Experiments section: ablation tables isolating each of ACNR/HLUQ/CAG/LNQ versus a common PTQ baseline are absent, so the 15.2% mAP and 14.01% J&F gains cannot be attributed to the claimed mechanisms versus calibration-set choice or hyper-parameter search.

Authors: We acknowledge that component-wise ablations are needed to isolate contributions. While overall comparisons to prior PTQ methods are provided, we will add an ablation table in the revised manuscript that evaluates each technique (ACNR, HLUQ, CAG, LNQ) individually against a shared baseline to clarify attribution of the reported gains.

revision: yes

Referee: [Hardware evaluation] Hardware evaluation: the single FPGA result reports 7.12x speedup and 6.62x power efficiency but provides no cycle-accurate or LUT/BRAM overhead figures for the logarithmic and grouping operations introduced by HLUQ and CAG.

Authors: We recognize that detailed overhead metrics for the logarithmic and grouping operations would better demonstrate hardware compatibility. Our FPGA results report end-to-end gains; we will add available LUT/BRAM utilization data for these operations in the revision, though cycle-accurate per-operation breakdowns may be limited by our current implementation setup.

revision: partial

Circularity Check

0 steps flagged

No significant circularity; empirical results are independently measured

The paper identifies four quantization challenges and proposes four corresponding modules (ACNR, HLUQ, CAG, LNQ) to address them, then reports measured accuracy gains on COCO and SA-V datasets. No equations appear in the provided text that would define any reported prediction or improvement as equivalent to a fitted parameter or input by construction. No self-citations are invoked to justify uniqueness theorems, ansatzes, or load-bearing premises. The central claims rest on external benchmark comparisons rather than reducing to self-referential definitions or statistical forcing from the same calibration data. This qualifies as a self-contained empirical contribution with no detectable circular steps.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; free parameters likely exist inside the proximal algorithm, grouping, and log transforms but are not enumerated. No invented entities or domain axioms are stated.

pith-pipeline@v0.9.0 · 5913 in / 1032 out tokens · 51901 ms · 2026-05-23T01:43:31.166214+00:00 · methodology

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

When W4A4 Breaks Camouflaged Object Detection: Token-Group Dual-Constraint Activation Quantization
cs.CV 2026-04 unverdicted novelty 7.0

COD-TDQ uses token-group scaling and dual-constraint projection to fix 4-bit activation quantization for camouflaged object detection, delivering more than 0.12 higher Sα scores than prior methods on four benchmarks w...

Reference graph

Works this paper leans on

49 extracted references · 49 canonical work pages · cited by 1 Pith paper · 3 internal anchors

[1]

Post train- ing 4-bit quantization of convolutional networks for rapid- deployment

Ron Banner, Yury Nahshan, Daniel Soudry, et al. Post train- ing 4-bit quantization of convolutional networks for rapid- deployment. In Proceedings of the Advances in Neural In- formation Processing Systems, pages 7950–7958, 2019. 4

work page 2019
[2]

Sam-med2d

Junlong Cheng, Jin Ye, Zhongying Deng, Jianpin Chen, Tianbin Li, Haoyu Wang, Yanzhou Su, Ziyan Huang, Ji- long Chen, Lei Jiang, et al. Sam-med2d. arXiv preprint arXiv:2308.16184, 2023. 1

work page arXiv 2023
[3]

Data-free network compression via parametric non-uniform mixed precision quantization

Vladimir Chikin and Mikhail Antiukh. Data-free network compression via parametric non-uniform mixed precision quantization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages 450– 459, 2022. 2, 4

work page 2022
[4]

Towards accurate post- training quantization for vision transformer

Yifu Ding, Haotong Qin, Qinghua Yan, Zhenhua Chai, Junjie Liu, Xiaolin Wei, and Xianglong Liu. Towards accurate post- training quantization for vision transformer. In Proceedings of the 30th ACM international conference on multimedia , pages 5380–5388, 2022. 2, 4

work page 2022
[5]

K., McKinstry, J

Steven K Esser, Jeffrey L McKinstry, Deepika Bablani, Rathinakumar Appuswamy, and Dharmendra S Modha. Learned step size quantization. arXiv preprint arXiv:1902.08153, 2019. 1

work page arXiv 1902
[6]

Jumping through local minima: Quantization in the loss landscape of vision transformers

Natalia Frumkin, Dibakar Gope, and Diana Marculescu. Jumping through local minima: Quantization in the loss landscape of vision transformers. In Proceedings of the IEEE/CVF International Conference on Computer Vision , pages 16978–16988, 2023. 2

work page 2023
[7]

YOLOX: Exceeding YOLO Series in 2021

Z Ge. Yolox: Exceeding yolo series in 2021. arXiv preprint arXiv:2107.08430, 2021. 6

work page internal anchor Pith review Pith/arXiv arXiv 2021
[8]

Differ- entiable soft quantization: Bridging full-precision and low- bit neural networks

Ruihao Gong, Xianglong Liu, Shenghu Jiang, Tianxiang Li, Peng Hu, Jiazhen Lin, Fengwei Yu, and Junjie Yan. Differ- entiable soft quantization: Bridging full-precision and low- bit neural networks. In Proceedings of the IEEE/CVF inter- national conference on computer vision , pages 4852–4861,

work page
[9]

Daq: distribution-aware quantization for deep image super-resolution networks

Cheeun Hong, Heewon Kim, Junghun Oh, and Ky- oung Mu Lee. Daq: distribution-aware quantization for deep image super-resolution networks. arXiv preprint arXiv:2012.11230, 2020. 2, 4

work page arXiv 2012
[10]

1.1 computing’s energy problem (and what we can do about it)

Mark Horowitz. 1.1 computing’s energy problem (and what we can do about it). In 2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC) , pages 10–14, 2014. 4, 2

work page 2014
[11]

Multi-dimensional vi- sion transformer compression via dependency guided gaus- sian process search

Zejiang Hou and Sun-Yuan Kung. Multi-dimensional vi- sion transformer compression via dependency guided gaus- sian process search. In Proceedings of the IEEE/CVF Con- ference on Computer Vision and Pattern Recognition, pages 3669–3678, 2022. 1

work page 2022
[12]

Detrs with hybrid matching

Ding Jia, Yuhui Yuan, Haodi He, Xiaopei Wu, Haojun Yu, Weihong Lin, Lei Sun, Chao Zhang, and Han Hu. Detrs with hybrid matching. In Proceedings of the IEEE/CVF con- ference on computer vision and pattern recognition , pages 19702–19712, 2023. 6

work page 2023
[13]

Berg, Wan-Yen Lo, Piotr Doll ´ar, and Ross B

Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chlo´e Rolland, Laura Gustafson, Tete Xiao, Spencer White- head, Alexander C. Berg, Wan-Yen Lo, Piotr Doll ´ar, and Ross B. Girshick. Segment anything. 2023 IEEE/CVF In- ternational Conference on Computer Vision (ICCV) , pages 3992–4003, 2023. 1, 2

work page 2023
[14]

Quantizing deep convolutional networks for efficient inference: A whitepaper

Raghuraman Krishnamoorthi. Quantizing deep convolu- tional networks for efficient inference: A whitepaper. arXiv preprint arXiv:1806.08342, 2018. 1

work page internal anchor Pith review Pith/arXiv arXiv 2018
[16]

Additive powers-of- two quantization: An efficient non-uniform discretization for neural networks

Yuhang Li, Xin Dong, and Wei Wang. Additive powers-of- two quantization: An efficient non-uniform discretization for neural networks. arXiv preprint arXiv:1909.13144, 2019. 1

work page arXiv 1909
[17]

Brecq: Pushing the limit of post-training quantization by block reconstruction.arXiv preprint arXiv:2102.05426,

Yuhang Li, Ruihao Gong, Xu Tan, Yang Yang, Peng Hu, Qi Zhang, Fengwei Yu, Wei Wang, and Shi Gu. Brecq: Pushing the limit of post-training quantization by block reconstruc- tion. arXiv preprint arXiv:2102.05426, 2021. 2, 6, 4, 5

work page arXiv 2021
[18]

Q-vit: Accurate and fully quantized low-bit vision transformer

Yanjing Li, Sheng Xu, Baochang Zhang, Xianbin Cao, Peng Gao, and Guodong Guo. Q-vit: Accurate and fully quantized low-bit vision transformer. Advances in neural information processing systems, 35:34451–34463, 2022. 1

work page 2022
[19]

I-vit: Integer-only quantization for efficient vision transformer inference

Zhikai Li and Qingyi Gu. I-vit: Integer-only quantization for efficient vision transformer inference. In Proceedings of the IEEE/CVF International Conference on Computer Vi- sion, pages 17065–17075, 2023. 1

work page 2023
[20]

Patch similarity aware data-free quantization for vision transformers

Zhikai Li, Liping Ma, Mengjuan Chen, Junrui Xiao, and Qingyi Gu. Patch similarity aware data-free quantization for vision transformers. In European conference on computer vision, pages 154–170. Springer, 2022. 2

work page 2022
[21]

Repq- vit: Scale reparameterization for post-training quantization of vision transformers

Zhikai Li, Junrui Xiao, Lianwei Yang, and Qingyi Gu. Repq- vit: Scale reparameterization for post-training quantization of vision transformers. In Proceedings of the IEEE/CVF In- ternational Conference on Computer Vision , pages 17227– 17236, 2023. 2, 4

work page 2023
[22]

Microsoft coco: Common objects in context

Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Doll´ar, and C Lawrence Zitnick. Microsoft coco: Common objects in context. In Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part V 13, pages 740–755. Springer, 2014. 6

work page 2014
[23]

Fq-vit: Post-training quantization for fully quantized vision transformer

Yang Lin, Tianyu Zhang, Peiqin Sun, Zheng Li, and Shuchang Zhou. Fq-vit: Post-training quantization for fully quantized vision transformer. arXiv preprint arXiv:2111.13824, 2021. 2, 4

work page arXiv 2021
[24]

Pd-quant: Post-training quantiza- tion based on prediction difference metric

Jiawei Liu, Lin Niu, Zhihang Yuan, Dawei Yang, Xinggang Wang, and Wenyu Liu. Pd-quant: Post-training quantiza- tion based on prediction difference metric. InProceedings of 9 the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 24427–24437, 2023. 6, 4

work page 2023
[25]

Oscillation-free quantization for low-bit vision transformers

Shih-Yang Liu, Zechun Liu, and Kwang-Ting Cheng. Oscillation-free quantization for low-bit vision transformers. In International Conference on Machine Learning , pages 21813–21824. PMLR, 2023. 1

work page 2023
[26]

Pq-sam: Post-training quantization for segment any- thing model

Xiaoyu Liu, Xin Ding, Lei Yu, Yuanyuan Xi, Wei Li, Zhi- jun Tu, Jie Hu, Hanting Chen, Baoqun Yin, and Zhiwei Xiong. Pq-sam: Post-training quantization for segment any- thing model. In European Conference on Computer Vision, pages 420–437. Springer, 2024. 2

work page 2024
[27]

Noisyquant: Noisy bias-enhanced post-training activation quantization for vision transformers

Yijiang Liu, Huanrui Yang, Zhen Dong, Kurt Keutzer, Li Du, and Shanghang Zhang. Noisyquant: Noisy bias-enhanced post-training activation quantization for vision transformers. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 20321–20330, 2023. 2

work page 2023
[28]

Post-training quantization for vision trans- former

Zhenhua Liu, Yunhe Wang, Kai Han, Wei Zhang, Siwei Ma, and Wen Gao. Post-training quantization for vision trans- former. Advances in Neural Information Processing Systems, 34:28092–28103, 2021. 1

work page 2021
[29]

Ptq4sam: Post-training quantization for seg- ment anything

Chengtao Lv, Hong Chen, Jinyang Guo, Yifu Ding, and Xi- anglong Liu. Ptq4sam: Post-training quantization for seg- ment anything. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 15941– 15951, 2024. 2, 3, 6, 7, 5

work page 2024
[30]

Follow anything: Open- set detection, tracking, and following in real-time

Alaa Maalouf, Ninad Jadhav, Krishna Murthy Jatavallab- hula, Makram Chahine, Daniel M V ogt, Robert J Wood, An- tonio Torralba, and Daniela Rus. Follow anything: Open- set detection, tracking, and following in real-time. IEEE Robotics and Automation Letters, 9(4):3283–3290, 2024. 1

work page 2024
[31]

Up or down? adap- tive rounding for post-training quantization

Markus Nagel, Rana Ali Amjad, Mart Van Baalen, Chris- tos Louizos, and Tijmen Blankevoort. Up or down? adap- tive rounding for post-training quantization. In International Conference on Machine Learning, pages 7197–7206. PMLR,

work page
[32]

Faster r-cnn: Towards real-time object detection with region proposal networks

Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. Faster r-cnn: Towards real-time object detection with region proposal networks. IEEE transactions on pattern analysis and machine intelligence, 39(6):1137–1149, 2016. 6

work page 2016
[33]

arXiv preprint arXiv:2304.10261 (2023)

Qiuhong Shen, Xingyi Yang, and Xinchao Wang. Anything- 3d: Towards single-view anything reconstruction in the wild. arXiv preprint arXiv:2304.10261, 2023. 1

work page arXiv 2023
[34]

Trio-vit: Post-training quantization and acceleration for softmax-free efficient vision transformer

Huihong Shi, Haikuo Shao, Wendong Mao, and Zhongfeng Wang. Trio-vit: Post-training quantization and acceleration for softmax-free efficient vision transformer. arXiv preprint arXiv:2405.03882, 2024. 2

work page arXiv 2024
[35]

Tinysam: Pushing the envelope for efficient segment any- thing model

Han Shu, Wenshuo Li, Yehui Tang, Yiman Zhang, Yi- hao Chen, Houqiang Li, Yunhe Wang, and Xinghao Chen. Tinysam: Pushing the envelope for efficient segment any- thing model. arXiv preprint arXiv:2312.13789, 2023. 2

work page arXiv 2023
[36]

Learnable lookup table for neural network quantization

Longguang Wang, Xiaoyu Dong, Yingqian Wang, Li Liu, Wei An, and Yu Kuen Guo. Learnable lookup table for neural network quantization. 2022 IEEE/CVF Conference on Com- puter Vision and Pattern Recognition (CVPR), pages 12413– 12423, 2022. 2, 4

work page 2022
[37]

Towards accurate post-training network quantization via bit- split and stitching

Peisong Wang, Qiang Chen, Xiangyu He, and Jian Cheng. Towards accurate post-training network quantization via bit- split and stitching. In Proceedings of the International Con- ference on Machine Learning, pages 9847–9856, 2020. 4

work page 2020
[38]

Detect any shadow: Segment anything for video shadow detection

Yonghui Wang, Wengang Zhou, Yunyao Mao, and Houqiang Li. Detect any shadow: Segment anything for video shadow detection. IEEE Transactions on Circuits and Systems for Video Technology, 34(5):3782–3794, 2024. 1

work page 2024
[39]

Qdrop: Randomly dropping quantization for extremely low-bit post-training quantization

Xiuying Wei, Ruihao Gong, Yuhang Li, Xianglong Liu, and Fengwei Yu. Qdrop: Randomly dropping quantization for extremely low-bit post-training quantization. arXiv preprint arXiv:2203.05740, 2022. 2, 3, 6, 4, 5

work page arXiv 2022
[40]

An energy-and-area- efficient cnn accelerator for universal powers-of-two quan- tization

Tian Xia, Boran Zhao, Jian Ma, Gelin Fu, Wenzhe Zhao, Nanning Zheng, and Pengju Ren. An energy-and-area- efficient cnn accelerator for universal powers-of-two quan- tization. IEEE Transactions on Circuits and Systems I: Reg- ular Papers, 70(3):1242–1255, 2022. 4

work page 2022
[41]

Ptq4vit: Post-training quantization for vision transformers with twin uniform quantization

Zhihang Yuan, Chenhao Xue, Yiqi Chen, Qiang Wu, and Guangyu Sun. Ptq4vit: Post-training quantization for vision transformers with twin uniform quantization. In European conference on computer vision , pages 191–207. Springer,

work page
[42]

RPTQ: reorder-based post-training quantization for large language models

Zhihang Yuan, Lin Niu, Jiawei Liu, Wenyu Liu, Xinggang Wang, Yuzhang Shang, Guangyu Sun, Qiang Wu, Jiaxiang Wu, and Bingzhe Wu. Rptq: Reorder-based post-training quantization for large language models. arXiv preprint arXiv:2304.01089, 2023. 5

work page arXiv 2023
[43]

DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection

Hao Zhang, Feng Li, Shilong Liu, Lei Zhang, Hang Su, Jun Zhu, Lionel M Ni, and Heung-Yeung Shum. Dino: Detr with improved denoising anchor boxes for end-to-end object detection. arXiv preprint arXiv:2203.03605, 2022. 6

work page internal anchor Pith review Pith/arXiv arXiv 2022
[44]

Personalize segment anything model with one shot

Renrui Zhang, Zhengkai Jiang, Ziyu Guo, Shilin Yan, Junt- ing Pan, Xianzheng Ma, Hao Dong, Peng Gao, and Hong- sheng Li. Personalize segment anything model with one shot. arXiv preprint arXiv:2305.03048, 2023. 1

work page arXiv 2023
[45]

Less is more: Focus attention for efficient detr

Dehua Zheng, Wenhui Dong, Hailin Hu, Xinghao Chen, and Yunhe Wang. Less is more: Focus attention for efficient detr. In Proceedings of the IEEE/CVF international conference on computer vision, pages 6674–6683, 2023. 1

work page 2023
[46]

Dy- namic dual trainable bounds for ultra-low precision super- resolution networks

Yunshan Zhong, Mingbao Lin, Xunchao Li, Ke Li, Yun- hang Shen, Fei Chao, Yongjian Wu, and Rongrong Ji. Dy- namic dual trainable bounds for ultra-low precision super- resolution networks. In European Conference on Computer Vision, pages 1–18. Springer, 2022. 4

work page 2022
[47]

I&s-vit: An inclusive & stable method for pushing the limit of post-training vits quantization

Yunshan Zhong, Jiawei Hu, Mingbao Lin, Mengzhao Chen, and Rongrong Ji. I&s-vit: An inclusive & stable method for pushing the limit of post-training vits quantization. arXiv preprint arXiv:2311.10126, 2023. 2, 4

work page arXiv 2023
[48]

Erq: Error reduction for post-training quanti- zation of vision transformers

Yunshan Zhong, Jiawei Hu, You Huang, Yuxin Zhang, and Rongrong Ji. Erq: Error reduction for post-training quanti- zation of vision transformers. In Proceedings of the Interna- tional Conference on Machine Learning (ICML), 2024. 2

work page 2024
[49]

Yunshan Zhong, You Huang, Jiawei Hu, Yuxin Zhang, and Rongrong Ji. Towards accurate post-training quantization of vision transformers via error reduction. IEEE Transactions on Pattern Analysis and Machine Intelligence, pages 1–18,

AHCPTQ: Accurate and Hardware-Compatible Post-Training Quantization for Segment Anything Model

Supplementary Material

A. Analysis of Inter-Channel Variation and Inter-Sample Similarity in SAM Model

In this section, we provide an in-depth analysis of inter-channel variation and inter-sample similarity using the SAM-B model with YOLOX as the prompt de...
