Pith Number

pith:JT77CAYK

pith:2026:JT77CAYKH3OLFHGYQ24WA5OJCF
not attested · not anchored · not stored · refs resolved

Ascend-RaBitQ: Heterogeneous NPU-CPU Acceleration of Billion-Scale Similarity Search with 1-bit Quantization

Baolong Cui, Chao Zhan, Chuyue Ye, Fujun He, Hao Yi, Huaxiang Cai, Jie Xiang, Pengfei Zheng, Wenru Yan, Xiabing Li, Yuhang Gai, Yunfei Du, Zetao Lv, Zigang Zhang, Ziyang Zhang

Decoupling NPU coarse ranking on 1-bit vectors from CPU fine re-ranking accelerates billion-scale vector similarity search by more than 100 times over a mathematically equivalent CPU baseline.

arxiv:2605.16007 v1 · 2026-05-15 · cs.IR

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{JT77CAYKH3OLFHGYQ24WA5OJCF}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

Ascend-RaBitQ achieves 3.0× to 62.8× faster index construction than the CPU baseline, up to 4.6× higher throughput than the fastest CPU IVF-RaBitQ implementation, and more than 100× higher throughput than the mathematically equivalent CPU baseline, while demonstrating encouraging scalability on distributed multi-NPU systems.

C2 · weakest assumption

The three-stage heterogeneous pipeline (NPU coarse ranking on 1-bit vectors, on-device AI CPU Top-k, host CPU fine re-ranking) is assumed to preserve accuracy without post-hoc adjustments, and the four NPU-native optimizations (fused AIC-AIV operators, computation restructuring, block-level load balancing, intra-NPU pipeline) are assumed to deliver the reported speedups on real hardware.

C3 · one-line summary

Ascend-RaBitQ is the first IVF-RaBitQ system optimized for heterogeneous NPU-CPU hardware for billion-scale vector search. It decouples coarse ranking on the NPU from fine ranking on the CPU so that each stage runs on the hardware best suited to it.

References

49 extracted · 49 resolved · 1 Pith anchor

Receipt and verification
First computed: 2026-05-20T00:01:48.745612Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05) · public key
Schema: pith-number/v1.0

Canonical hash

4cfff1030a3edcb29cd886b96075c91154c57a01ef77386810c207b6ee934582

Aliases

arxiv: 2605.16007 · arxiv_version: 2605.16007v1 · doi: 10.48550/arxiv.2605.16007 · pith_short_12: JT77CAYKH3OL · pith_short_16: JT77CAYKH3OLFHGY · pith_short_8: JT77CAYK
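The identifiers on this page appear to be related in a simple way: the 26-character Pith Number matches the first 26 characters of the Base32 (RFC 4648) encoding of the canonical hash bytes, and each pith_short_N alias is the first N characters of the Pith Number. This is an observation from the values shown here, not a documented rule of the schema. The helper name pith_number_from_hash is hypothetical. A minimal Python sketch under that assumption:

```python
import base64


def pith_number_from_hash(hex_digest: str, length: int = 26) -> str:
    """Base32-encode the raw digest bytes and keep the first `length` characters.

    Assumption: this mirrors how the identifier on this page relates to the
    canonical hash; it is inferred from the displayed values only.
    """
    raw = bytes.fromhex(hex_digest)
    return base64.b32encode(raw).decode("ascii")[:length]


# Canonical hash as shown on this page.
canonical_hash = "4cfff1030a3edcb29cd886b96075c91154c57a01ef77386810c207b6ee934582"

pith_number = pith_number_from_hash(canonical_hash)

# Short aliases are plain prefixes of the full identifier.
aliases = {f"pith_short_{n}": pith_number[:n] for n in (8, 12, 16)}

print(pith_number)
print(aliases)
```

If the relation holds, the printed identifier and prefixes can be compared directly with the Pith Number and alias values listed above.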
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/JT77CAYKH3OLFHGYQ24WA5OJCF \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 4cfff1030a3edcb29cd886b96075c91154c57a01ef77386810c207b6ee934582
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "f8610a5af4ff2ea01717dc5b0c3ff4ff8e9a0aafa0da5b022e0f8d631923c3cb",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.IR",
    "submitted_at": "2026-05-15T14:37:18Z",
    "title_canon_sha256": "78bca298d79182165e73061ecf3f8d5e3449c4ba09a71f0dd81ec1f117cfcbfa"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.16007",
    "kind": "arxiv",
    "version": 1
  }
}
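
The hashing step of the verification command above can also be written as a standalone Python function. The sketch below uses the same serialization settings as the one-liner (sorted keys, compact separators, non-ASCII characters kept as-is, UTF-8 encoding) and applies them to the record as displayed on this page. The function name canonical_sha256 is hypothetical, and whether the displayed record is byte-for-byte what the API returns is an assumption, so the result should be compared against the canonical hash rather than taken for granted.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Serialize a record deterministically and return its SHA-256 hex digest."""
    canonical = json.dumps(
        record,
        sort_keys=True,          # key order must not affect the hash
        separators=(",", ":"),   # no whitespace between tokens
        ensure_ascii=False,      # keep non-ASCII characters unescaped
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Record copied from this page (assumed identical to the API response).
record = {
    "metadata": {
        "abstract_canon_sha256": "f8610a5af4ff2ea01717dc5b0c3ff4ff8e9a0aafa0da5b022e0f8d631923c3cb",
        "cross_cats_sorted": [],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.IR",
        "submitted_at": "2026-05-15T14:37:18Z",
        "title_canon_sha256": "78bca298d79182165e73061ecf3f8d5e3449c4ba09a71f0dd81ec1f117cfcbfa",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.16007", "kind": "arxiv", "version": 1},
}

digest = canonical_sha256(record)
print(digest)
```

Because keys are sorted before hashing, a mirror that stores the same fields in a different order should still arrive at the same digest.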