Decision-Aware Quadratic ReLU Replacement for HE-Friendly Inference

Rui Li; Weijie Miao; Wenyuan Wu

arxiv: 2605.22237 · v2 · pith:QBDIFR73new · submitted 2026-05-21 · 💻 cs.CR · cs.LG

Pith reviewed 2026-05-25 06:00 UTC · model grok-4.3

classification 💻 cs.CR cs.LG

keywords homomorphic encryption · ReLU replacement · quadratic polynomial · decision preservation · neural network inference · CKKS · convex hull relaxation

The pith

Quadratic polynomials replace ReLU while preserving decisions on a calibration set for homomorphic encryption inference.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a decision-aware method to swap ReLU activations for quadratic polynomials in single-hidden-layer neural networks without retraining. When a calibration set is positive-margin separable after lifting to quadratic features, the coefficients are found by solving a linear separation problem that gives necessary and sufficient conditions for exact decision preservation. When boundary samples prevent separation, reduced convex hulls and soft-margin relaxations produce coefficients that still agree with the original decisions on most calibration points. The approach is evaluated under the CKKS encryption scheme and yields faster inference that matches the original top-1 accuracy.
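The reduction to linear separation can be illustrated with a minimal sketch. It assumes a single quadratic q(t) = a·t² + b·t + c shared by all hidden units, which is a simplification made here and not necessarily the paper's parameterization. Under that assumption every logit is affine in (a, b, c), so each requirement "the original winning class still beats class k" is one linear inequality in the lifted features. The helper names `lifted_rows` and `max_margin_quadratic` and the box bound on the coefficients are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog


def lifted_rows(W1, b1, W2, b2, X):
    """Return (D, y); each row d of D satisfies d @ [a, b, c, 1] = logit_y - logit_k."""
    H = X @ W1.T + b1                                    # hidden pre-activations
    y = (np.maximum(H, 0.0) @ W2.T + b2).argmax(axis=1)  # original ReLU decisions
    S2 = (H ** 2) @ W2.T                                 # multiplies a in each logit
    S1 = H @ W2.T                                        # multiplies b in each logit
    S0 = np.tile(W2.sum(axis=1), (len(X), 1))            # multiplies c in each logit
    B2 = np.tile(b2, (len(X), 1))                        # constant part of each logit
    rows = []
    for i, yi in enumerate(y):
        for k in range(W2.shape[0]):
            if k != yi:
                rows.append([F[i, yi] - F[i, k] for F in (S2, S1, S0, B2)])
    return np.array(rows), y


def max_margin_quadratic(D, bound=10.0):
    """LP: maximize t subject to D[:, :3] @ theta + D[:, 3] >= t, |theta_j| <= bound."""
    n = D.shape[0]
    cost = np.array([0.0, 0.0, 0.0, -1.0])               # linprog minimizes, so use -t
    A_ub = np.hstack([-D[:, :3], np.ones((n, 1))])
    res = linprog(cost, A_ub=A_ub, b_ub=D[:, 3],
                  bounds=[(-bound, bound)] * 3 + [(None, None)], method="highs")
    return res.x[:3], res.x[3]


# Toy network and calibration set (random, for illustration only).
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(8, 4)), rng.normal(size=8)
W2, b2 = rng.normal(size=(3, 8)), rng.normal(size=3)
X = rng.normal(size=(30, 4))
D, y = lifted_rows(W1, b1, W2, b2, X)
theta, margin = max_margin_quadratic(D)
print("coefficients (a, b, c):", theta, "margin:", margin)
```

A strictly positive margin means the quadratic network reproduces every calibration decision of the ReLU network in this toy setting; a non-positive margin corresponds to the case where exact separation fails and a relaxation is needed.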

Core claim

For calibration sets positive-margin separable in the lifted space, quadratic replacement reduces to a linear separation problem that supplies both necessary and sufficient conditions for calibration-lossless replacement and a constructive algorithm for the coefficients. When the positive-margin condition fails because a few near-boundary samples bring the lifted hulls into contact, reduced convex hulls and Lagrangian-dual soft-margin relaxations cap the weight any single sample can carry and convert the task into smaller convex quadratic programs that produce coefficients with high empirical agreement on calibration-set decisions. At the maximal weight cap (μ = 1) the relaxation recovers standard convex-hull separation.
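The effect of the weight cap can be seen in a generic reduced-convex-hull nearest-point problem, in the spirit of the geometric SVM view. The sketch below is an illustration on toy points, not the paper's exact quadratic program: it minimizes the squared distance between two convex combinations whose weights are each capped at μ. The function name `rch_nearest_points` and the choice of SciPy's SLSQP solver are assumptions made here.

```python
import numpy as np
from scipy.optimize import minimize


def rch_nearest_points(P, Q, mu):
    """Nearest points between the mu-reduced convex hulls of point sets P and Q."""
    P, Q = np.asarray(P, float), np.asarray(Q, float)
    m, n = len(P), len(Q)
    if mu * min(m, n) < 1 - 1e-12:
        raise ValueError("mu must be at least 1 / min(len(P), len(Q))")

    def objective(w):
        diff = w[:m] @ P - w[m:] @ Q
        return diff @ diff                       # squared distance

    def gradient(w):
        diff = w[:m] @ P - w[m:] @ Q
        return np.concatenate([2 * P @ diff, -2 * Q @ diff])

    constraints = [
        {"type": "eq", "fun": lambda w: w[:m].sum() - 1.0},  # weights on P sum to 1
        {"type": "eq", "fun": lambda w: w[m:].sum() - 1.0},  # weights on Q sum to 1
    ]
    w0 = np.concatenate([np.full(m, 1.0 / m), np.full(n, 1.0 / n)])
    res = minimize(objective, w0, jac=gradient, method="SLSQP",
                   bounds=[(0.0, mu)] * (m + n), constraints=constraints,
                   options={"ftol": 1e-12, "maxiter": 500})
    p, q = res.x[:m] @ P, res.x[m:] @ Q
    return p, q, float(np.linalg.norm(p - q))


# Two 1-D point sets whose full convex hulls [0, 5] and [4, 7] overlap.
P = [[0.0], [1.0], [5.0]]
Q = [[4.0], [6.0], [7.0]]
for mu in (1.0, 0.5):
    _, _, dist = rch_nearest_points(P, Q, mu)
    print(f"mu = {mu}: distance between reduced hulls = {dist:.4f}")
```

With μ = 1 the feasible weights span the full convex hulls, which overlap in this toy example. With μ = 0.5 no single point can carry more than half the weight, so the reduced hulls shrink to [0.5, 3] and [5, 6.5] and become separable.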

What carries the argument

Lifted-space linear separation that formulates quadratic ReLU replacement as a convex separation task, extended continuously by reduced-convex-hull relaxations when exact separation fails.

If this is right

Exact decision preservation holds if and only if the lifted calibration points admit positive-margin separation.
Reduced-convex-hull relaxations produce usable coefficients even when a few samples cause the hulls to touch.
Under CKKS the quadratic activations run 3.7-4.1 times faster than degree-7 Remez polynomials in the activation module.
End-to-end inference is 1.18-1.68 times faster than the higher-degree baseline while matching plaintext top-1 accuracy.
No retraining is required; only the calibration set is used to compute the replacement coefficients.
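One reason a quadratic is cheaper than a degree-7 polynomial under CKKS is that it needs fewer ciphertext-by-ciphertext multiplications and less multiplicative depth. The following back-of-the-envelope sketch counts both for a naive power-basis evaluation. It is an assumption-laden illustration: `power_schedule` is a hypothetical helper, optimized schemes such as Paterson-Stockmeyer or odd-polynomial evaluation use fewer multiplications, and the count does not reproduce the measured speedups listed above.

```python
def power_schedule(degree):
    """Return (ciphertext multiplications, multiplicative depth) to build x^1..x^degree.

    Each power x^k is formed as x^(k // 2) * x^(k - k // 2), so every new power
    costs one ciphertext-by-ciphertext multiplication.
    """
    depth = {1: 0}                       # x itself is a fresh ciphertext
    for k in range(2, degree + 1):
        lo, hi = k // 2, k - k // 2
        depth[k] = max(depth[lo], depth[hi]) + 1
    return degree - 1, max(depth.values())


for d in (2, 7):
    mults, depth = power_schedule(d)
    print(f"degree {d}: {mults} ciphertext multiplications, depth {depth}")
```

Under this naive count a quadratic needs one multiplication at depth 1, while a degree-7 polynomial needs six multiplications at depth 3.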

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The separation view could be applied layer by layer in deeper networks if calibration sets are constructed per layer.
Larger or more diverse calibration sets would raise the probability that the exact positive-margin condition holds.
Measuring decision agreement on test data versus calibration data would quantify how well the finite-set assumption generalizes.

Load-bearing premise

That matching the original decisions exactly on a finite calibration set is enough to make the quadratic replacement useful for new inputs.

What would settle it

If the replaced network produces different classification decisions than the original ReLU network on a held-out test set drawn from the same distribution, the replacement has failed to preserve behavior.
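A check of this kind could be sketched as follows. The code compares the argmax decisions of the original ReLU network with those of the quadratic replacement on any input set, so the same function can be run on calibration and held-out data. The shared-coefficient form a·t² + b·t + c, the function names, and the example coefficients are assumptions for illustration, not values from the paper.

```python
import numpy as np


def decisions(W1, b1, W2, b2, X, activation):
    """Argmax class of a single-hidden-layer MLP with the given activation."""
    H = X @ W1.T + b1
    return (activation(H) @ W2.T + b2).argmax(axis=1)


def decision_agreement(W1, b1, W2, b2, X, coeffs):
    """Fraction of inputs on which the quadratic and ReLU networks agree."""
    a, b, c = coeffs
    relu = decisions(W1, b1, W2, b2, X, lambda h: np.maximum(h, 0.0))
    quad = decisions(W1, b1, W2, b2, X, lambda h: a * h ** 2 + b * h + c)
    return float(np.mean(relu == quad))


# Toy network, calibration set, and held-out set (random, for illustration only).
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(8, 4)), rng.normal(size=8)
W2, b2 = rng.normal(size=(3, 8)), rng.normal(size=3)
X_calib, X_test = rng.normal(size=(50, 4)), rng.normal(size=(200, 4))
coeffs = (0.1, 0.5, 0.2)   # hypothetical coefficients, not taken from the paper
print("calibration agreement:", decision_agreement(W1, b1, W2, b2, X_calib, coeffs))
print("held-out agreement:   ", decision_agreement(W1, b1, W2, b2, X_test, coeffs))
```

A large gap between the two numbers would indicate that agreement on the calibration set does not carry over to new inputs.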

Figures

Figures reproduced from arXiv: 2605.22237 by Rui Li, Weijie Miao, Wenyuan Wu.

**Figure 2.** HE top-1 accuracy versus the CKKS scaling factor.

Original abstract

Fully homomorphic encryption (FHE) supports only additions and multiplications, so FHE-only neural-network inference typically replaces ReLU with polynomials fitted over empirical activation intervals. Such interval fitting often requires higher-degree polynomials to control activation error, incurring homomorphic evaluation costs, while classification is determined by the final logit decision. We revisit ReLU replacement from a decision-aware perspective: given a trained single-hidden-layer ReLU MLP and a specified calibration set, can an HE-friendly low-degree polynomial replace ReLU without retraining while preserving calibration-set decisions? We focus on quadratic replacement, the lowest-degree that retains a genuine per-unit nonlinearity. For calibration sets positive-margin separable in the lifted space, we formulate quadratic replacement as a linear separation problem, yielding necessary and sufficient conditions for calibration-lossless replacement and a constructive algorithm for the coefficients. When the positive-margin condition fails -- often because a few near-boundary or misclassified calibration samples bring the lifted hulls into contact -- we extend the same geometric framework via reduced convex hulls and Lagrangian-dual soft-margin relaxations. These cap the weight any single sample can carry, converting the problem into smaller convex quadratic programs that yield approximately feasible coefficients with high empirical agreement on calibration-set decisions. In particular, at the maximal weight cap $\mu=1$, the reduced-convex-hull relaxation reduces to standard convex-hull separation; the relaxation thus continuously extends the positive-margin exact theory. Under CKKS, the quadratic replacement matches plaintext top-1 accuracy on multiple benchmarks, running 3.7--4.1$\times$ faster than Remez-7 in the activation module and 1.18--1.68$\times$ faster end-to-end.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The geometric reformulation of quadratic ReLU replacement as lifted-space linear separation is the real novelty here, but the practical payoff still depends on how well calibration-set decisions carry over to test data.

The paper's main move is to treat quadratic replacement as finding coefficients that separate positive and negative activation regions in a lifted space, giving necessary and sufficient conditions when the calibration points have positive margin. When the hulls touch, the reduced-convex-hull relaxation with weight cap μ turns the problem into convex QPs that still produce coefficients with high agreement on the calibration decisions. Neither that framing nor the continuous extension from exact separation to the soft-margin case is standard in the interval-fitting literature that the abstract contrasts against.

Referee Report

0 major / 3 minor

Summary. The manuscript presents a decision-aware approach to replacing ReLU activations with quadratic polynomials in single-hidden-layer MLPs for fully homomorphic encryption (FHE) inference. Given a trained model and calibration set, it formulates the problem as finding quadratic coefficients that preserve the decisions on the calibration set. For cases where the lifted-space calibration points are positive-margin separable, this is cast as a linear separation problem with necessary and sufficient conditions and a constructive algorithm. When separability fails, reduced convex hulls and soft-margin relaxations with a weight cap μ are employed to find approximate coefficients. Experiments under the CKKS scheme show that the resulting quadratics achieve the same top-1 accuracy as the plaintext model on benchmarks while providing speedups compared to higher-degree polynomial replacements.

Significance. The geometric formulation in the lifted space offers a clean, constructive method for calibration-lossless replacement that is independent of empirical fitting constants from prior work. The extension via reduced convex hulls provides a continuous family of relaxations controlled by μ. The reported 3.7--4.1× speedup in the activation module and 1.18--1.68× end-to-end are notable. The method's focus on decision preservation rather than uniform approximation is a useful perspective. The stress-test concern regarding generalization from calibration to test sets does not land, as the manuscript reports matching accuracy on the benchmarks.

minor comments (3)

[Abstract] The specific benchmarks, models, and calibration-set sizes used for the accuracy and timing claims are not named; §4 should list them explicitly with dataset references.
[§3] The definition of the lifted feature map and the resulting quadratic form should be given as numbered equations in §3 before the separation formulation is stated.
Figure captions for the geometric illustrations should explicitly reference the value of μ used in each panel.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive and constructive review, which accurately captures the geometric formulation, the role of reduced convex hulls, and the reported speedups under CKKS. The recommendation of minor revision is noted; we will incorporate any editorial suggestions in the revised version.

Circularity Check

0 steps flagged

No significant circularity in geometric separation formulation

The paper derives quadratic ReLU replacement by lifting activations to a space where decision preservation on a calibration set reduces to linear separability (or its convex-hull/soft-margin relaxations). This construction directly yields necessary-and-sufficient conditions and coefficients from the separation problem itself; no parameter is fitted on one subset and then renamed as a prediction on a related quantity, no self-citation supplies a load-bearing uniqueness theorem, and no ansatz is smuggled in. The weight cap μ is an explicit design parameter, not a hidden dependency. Empirical matching on benchmarks is presented as validation, not as part of the derivation chain. The method is therefore self-contained.

Axiom & Free-Parameter Ledger

1 free parameters · 2 axioms · 0 invented entities

The claim rests on the geometric properties of the lifted activation space and on the modeling choice that decision preservation on a calibration set is the relevant success metric; no new physical entities are introduced.

free parameters (1)

mu
Weight cap on individual calibration samples in the reduced-convex-hull relaxation; set to 1 in the limiting case that recovers standard convex-hull separation.

axioms (2)

domain assumption: The network is a single-hidden-layer ReLU MLP
The lifting and separation arguments are developed specifically for this architecture.
domain assumption: Preserving decisions on the calibration set is the operative correctness criterion
The entire replacement procedure is defined with respect to this finite set rather than pointwise approximation error.

