Pith Number

pith:EBPXMTFH

pith:2026:EBPXMTFHOEFDNDEJZS4U47RQPK
not attested · not anchored · not stored · refs resolved

Decoupling Vector Data and Index Storage for Space Efficiency

Di Wu, Juncheng Zhang, Patrick P. C. Lee, Rui Yang, Yanjing Ren, Yuanming Ren

COMPASS decouples vector data from index metadata to compress each separately, cutting storage by up to 58.7% while keeping search and update performance competitive.

arxiv:2604.09173 v2 · 2026-04-10 · cs.DB · cs.OS

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{EBPXMTFHOEFDNDEJZS4U47RQPK}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 strongest claim

COMPASS reduces storage space by up to 58.7%, while delivering improved or competitive search and update performance compared to state-of-the-art disk-resident graph ANNS systems.

C2 weakest assumption

Vector data and auxiliary index metadata have sufficiently distinct compressibility characteristics that, once decoupled, each can be compressed independently without introducing unacceptable overhead in the search or update paths.

C3 one line summary

COMPASS decouples vector data and index storage in disk-resident graph ANNS systems to enable component-specific lossless compression, reducing space by up to 58.7% with improved or competitive performance.

References

55 extracted · 55 resolved · 2 Pith anchors

[1] Apache. Cassandra. https://cassandra.apache.org/, 2025
[2] Language models are few-shot learners. Proc 2020
[3] Zhichao Cao, Siying Dong, Sagar Vemuri, and David H. C. Du. Characterizing, modeling, and benchmarking RocksDB key-value workloads at Facebook. In Proc. of USENIX FAST, 2020
[4] Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E. Gruber. Bigtable: A distributed storage system for structured da 2006
[5] Sptag: A library for fast approximate nearest neighbor search 2018

Formal links

2 machine-checked theorem links

Receipt and verification
First computed 2026-05-20T00:01:41.183984Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

205f764ca7710a368c89ccb94e7e307aa5eb53301296fb98246614321020601f

Aliases

arxiv: 2604.09173 · arxiv_version: 2604.09173v2 · doi: 10.48550/arxiv.2604.09173 · pith_short_12: EBPXMTFHOEFD · pith_short_16: EBPXMTFHOEFDNDEJ · pith_short_8: EBPXMTFH
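
In this record, the three pith_short_* aliases are prefixes of the full Pith Number, and the Pith Number itself coincides with the first 26 characters of the RFC 4648 Base32 encoding of the canonical hash. The Python sketch below checks both observations for this record only. That this is the defined derivation for every Pith Number is an assumption, and the helper names base32_prefix and short_aliases are hypothetical.

```python
import base64

# Values copied from this record.
CANONICAL_HASH = "205f764ca7710a368c89ccb94e7e307aa5eb53301296fb98246614321020601f"
PITH_NUMBER = "EBPXMTFHOEFDNDEJZS4U47RQPK"


def base32_prefix(hex_digest: str, length: int = 26) -> str:
    """Hypothetical helper: first `length` Base32 characters of a hex digest.

    Assumption: the identifier is a truncated RFC 4648 Base32 encoding of the
    SHA-256 digest. This matches the record shown here but is not confirmed
    as the general rule.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


def short_aliases(pith_number: str) -> dict:
    """Hypothetical helper: short aliases taken as 8-, 12-, and 16-character prefixes."""
    return {f"pith_short_{n}": pith_number[:n] for n in (8, 12, 16)}


derived = base32_prefix(CANONICAL_HASH)
print("derived identifier:", derived)
print("matches record:", derived == PITH_NUMBER)
print(short_aliases(PITH_NUMBER))
```

Truncating to 26 Base32 characters keeps 130 bits of the digest, which is why a short alias can identify the record while the full canonical hash remains the value to verify against.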
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/EBPXMTFHOEFDNDEJZS4U47RQPK \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 205f764ca7710a368c89ccb94e7e307aa5eb53301296fb98246614321020601f
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "59be6a55d422d9655fb28b3d83bcda2ad32cd4b00e36e2300851c48bcf5e3d5c",
    "cross_cats_sorted": [
      "cs.OS"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.DB",
    "submitted_at": "2026-04-10T09:58:17Z",
    "title_canon_sha256": "7a3086fbafc1e369a8721945f04bd80b8c59037f4396ebc982b7b72e62623c01"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2604.09173",
    "kind": "arxiv",
    "version": 2
  }
}
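
The same check as the curl pipeline above can be run offline against the canonical record JSON. The following Python sketch mirrors the serialization used in that pipeline (sorted keys, compact separators, ensure_ascii=False, UTF-8 bytes) and compares the digest with the canonical hash listed in this record. The function name canonical_sha256 is hypothetical, and whether the page's JSON is byte-for-byte what the server hashes is an assumption, so the comparison is printed rather than taken for granted.

```python
import hashlib
import json

# Canonical hash as listed in this record.
EXPECTED_HASH = "205f764ca7710a368c89ccb94e7e307aa5eb53301296fb98246614321020601f"

# Canonical record, copied from the JSON above.
record = {
    "metadata": {
        "abstract_canon_sha256": "59be6a55d422d9655fb28b3d83bcda2ad32cd4b00e36e2300851c48bcf5e3d5c",
        "cross_cats_sorted": ["cs.OS"],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.DB",
        "submitted_at": "2026-04-10T09:58:17Z",
        "title_canon_sha256": "7a3086fbafc1e369a8721945f04bd80b8c59037f4396ebc982b7b72e62623c01",
    },
    "schema_version": "1.0",
    "source": {"id": "2604.09173", "kind": "arxiv", "version": 2},
}


def canonical_bytes(obj: dict) -> bytes:
    """Serialize with sorted keys and no insignificant whitespace, as UTF-8."""
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def canonical_sha256(obj: dict) -> str:
    """Hypothetical helper: SHA-256 hex digest of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(obj)).hexdigest()


digest = canonical_sha256(record)
print("computed:", digest)
print("matches expected:", digest == EXPECTED_HASH)
```

Sorting keys and fixing the separators makes the digest independent of key order and whitespace in the transmitted JSON, which is what lets a mirror recompute the same hash from its own copy.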