pith:YKQZDIQ3
Atoms as Language: VQ-Atom: Semantic Discretization for Molecular Representation Learning
Vector quantization on atom embeddings yields discrete tokens for chemical contexts that boost protein-ligand prediction.
arxiv:2605.16823 v1 · 2026-05-16 · cs.LG
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{YKQZDIQ3HBRUY4IARCBELST6B3}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
Experimental results show that VQ-Atom consistently improves predictive performance compared to conventional tokenization approaches in protein-ligand interaction prediction under a protein-cold split setting without relying on 3D structural information.
The codebook entries learned via vector quantization on GNN embeddings are assumed to correspond to chemically meaningful atomic contexts that are relevant to the downstream prediction task and generalize beyond the training distribution.
VQ-Atom discretizes continuous GNN atom embeddings into chemically meaningful discrete tokens via vector quantization to improve molecular language modeling for downstream chemistry tasks.
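The discretization step named in the claims can be illustrated with a minimal sketch of generic vector quantization: each continuous embedding is replaced by the index of its nearest codebook vector. This is not the paper's implementation. The function name `quantize`, the Euclidean distance, and the toy 2-D vectors are all assumptions made for illustration; real atom embeddings would come from a trained GNN and the codebook would be learned.

```python
import math


def quantize(embeddings, codebook):
    """Map each embedding to the index of its nearest codebook vector.

    Hypothetical helper: uses Euclidean distance, which is an assumption,
    not a detail confirmed by the record above.
    """
    tokens = []
    for vec in embeddings:
        distances = [math.dist(vec, code) for code in codebook]
        tokens.append(distances.index(min(distances)))
    return tokens


# Toy 2-D values standing in for GNN atom embeddings (made up for illustration).
codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 2.0)]
atom_embeddings = [(0.1, -0.1), (0.9, 1.2), (0.2, 1.8), (0.8, 0.9)]

tokens = quantize(atom_embeddings, codebook)
print(tokens)  # prints [0, 1, 2, 1]
```

The second and fourth atoms receive the same token because their embeddings fall closest to the same codebook entry, which is the sense in which a token stands for a shared atomic context.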
Receipt and verification
| First computed | 2026-05-20T00:03:24.471740Z |
|---|---|
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
c2a191a21b38634c7100888245ca7e0ed1ca381a5d26119b6188b72df2a4ab11
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/YKQZDIQ3HBRUY4IARCBELST6B3 \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: c2a191a21b38634c7100888245ca7e0ed1ca381a5d26119b6188b72df2a4ab11
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "fe146792818b3478d278f05cbcc09f4bec68d135d66cbb89666e7b1da47364b3",
    "cross_cats_sorted": [],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2026-05-16T05:45:54Z",
    "title_canon_sha256": "75dfe0046e564b7eb28ab56df1838b9d494fe63fdbc1e7112fa72ddff8666343"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.16823",
    "kind": "arxiv",
    "version": 1
  }
}
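The Python portion of the "Verify this Pith Number yourself" command can be unpacked into a short offline sketch that works directly on the record printed above. It serializes the record with sorted keys and compact separators, exactly as the one-liner does, and hashes the UTF-8 bytes with SHA-256. The function name `canonical_sha256` is hypothetical. Whether the digest matches the listed canonical hash depends on the service canonicalizing the same way, so the sketch prints the digest and leaves the comparison to the reader.

```python
import hashlib
import json


def canonical_sha256(record):
    """Hash a record the way the page's one-liner does (hypothetical helper)."""
    canonical = json.dumps(
        record,
        sort_keys=True,          # key order must not affect the hash
        separators=(",", ":"),   # no whitespace between tokens
        ensure_ascii=False,      # keep non-ASCII characters as-is
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Canonical record copied from this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "fe146792818b3478d278f05cbcc09f4bec68d135d66cbb89666e7b1da47364b3",
        "cross_cats_sorted": [],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.LG",
        "submitted_at": "2026-05-16T05:45:54Z",
        "title_canon_sha256": "75dfe0046e564b7eb28ab56df1838b9d494fe63fdbc1e7112fa72ddff8666343",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.16823", "kind": "arxiv", "version": 1},
}

digest = canonical_sha256(record)
print(digest)  # compare by eye with the canonical hash listed on this page
```

Because the keys are sorted before hashing, the digest is independent of the order in which the JSON fields were written, which is why the record can be re-typed or re-indented without changing the result.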