Pith Number

pith:WTMT4C4F

pith:2022:WTMT4C4F4AV4EMNMGBZJ5M7KEO
not attested · not anchored · not stored · refs resolved

Language-driven Semantic Segmentation

Boyi Li, Kilian Q. Weinberger, René Ranftl, Serge Belongie, Vladlen Koltun

LSeg aligns per-pixel image embeddings contrastively with text label embeddings to enable zero-shot semantic segmentation.

arxiv:2201.03546 v2 · 2022-01-10 · cs.CV · cs.CL · cs.LG

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{WTMT4C4F4AV4EMNMGBZJ5M7KEO}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

We demonstrate that our approach achieves highly competitive zero-shot performance compared to existing zero- and few-shot semantic segmentation methods, and even matches the accuracy of traditional segmentation algorithms when a fixed label set is provided.

C2 · weakest assumption

That the contrastive alignment learned on seen classes will transfer to arbitrary unseen text labels without retraining or additional samples, relying on the semantic structure already present in the pre-trained text encoder.

C3 · one line summary

LSeg achieves competitive zero-shot semantic segmentation by contrastively aligning dense pixel embeddings from a transformer with text embeddings of class labels.
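
As a rough illustration of the summary above, the inference step can be pictured as a nearest-label lookup: each pixel embedding is compared with the text embedding of every label and takes the best-scoring one. This is a minimal sketch, not the authors' implementation. The function names, the toy 2-D vectors, and the choice of cosine similarity as the score are assumptions made here; real embeddings would come from trained image and text encoders.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def segment(pixel_embeddings, label_embeddings):
    """Assign each pixel the label whose text embedding is most similar.

    pixel_embeddings: H x W grid of C-dimensional vectors (nested lists).
    label_embeddings: dict mapping label text -> C-dimensional vector.
    Both are hypothetical stand-ins for encoder outputs.
    """
    labels = list(label_embeddings)
    text = [normalize(label_embeddings[name]) for name in labels]
    result = []
    for row in pixel_embeddings:
        out_row = []
        for pixel in row:
            p = normalize(pixel)
            # Cosine similarity between this pixel and every label.
            scores = [sum(a * b for a, b in zip(p, t)) for t in text]
            best = max(range(len(labels)), key=scores.__getitem__)
            out_row.append(labels[best])
        result.append(out_row)
    return result


# Toy 1 x 3 "image" with made-up 2-D embeddings.
pixels = [[[0.9, 0.1], [0.2, 0.8], [0.6, 0.5]]]
seen = {"dog": [1.0, 0.0], "grass": [0.0, 1.0]}
print(segment(pixels, seen))      # [['dog', 'grass', 'dog']]

# Adding a label at inference time needs only a new text embedding.
extended = dict(seen, cat=[0.7, 0.7])
print(segment(pixels, extended))  # [['dog', 'grass', 'cat']]
```

In this toy setup the third pixel switches from "dog" to "cat" once the extra label is supplied, with no change to the pixel embeddings. That is the property the weakest-assumption claim depends on: unseen labels only work if their text embeddings land in sensible places.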

References

13 extracted · 13 resolved · 4 Pith anchors

[1] Rethinking Atrous Convolution for Semantic Image Segmentation · arXiv:1706.05587
[2] Imagenet: A large-scale hierarchical image database · 2009
[3] Recent advances in open set recognition: A survey · 2022
[4] Open-vocabulary Object Detection via Vision and Language Knowledge Distillation · arXiv:2104.13921
[5] Few-shot open-set recognition using meta-learning · 2013

Formal links

3 machine-checked theorem links

Cited by

30 papers in Pith

Receipt and verification
First computed 2026-05-17T23:38:15.264219Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

b4d93e0b85e02bc231ac30729eb3ea23b1eea10f3756730feba603c40ea6fdb5

Aliases

arxiv: 2201.03546 · arxiv_version: 2201.03546v2 · doi: 10.48550/arxiv.2201.03546 · pith_short_12: WTMT4C4F4AV4 · pith_short_16: WTMT4C4F4AV4EMNM · pith_short_8: WTMT4C4F
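
The short aliases above are prefixes of the full Pith Number, and the full number itself coincides with the first 26 characters of the base32 encoding of the canonical hash shown on this page. The page does not document the derivation, so treat the sketch below as an observation about these particular values rather than a specification; the helper name `pith_number_from_hash` and the default length of 26 are assumptions.

```python
import base64

# Canonical hash as printed on this page.
canonical_hash = "b4d93e0b85e02bc231ac30729eb3ea23b1eea10f3756730feba603c40ea6fdb5"


def pith_number_from_hash(hex_digest, length=26):
    """Base32-encode a hex digest and keep the first `length` characters.

    Assumed scheme, inferred from the values on this page.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


full = pith_number_from_hash(canonical_hash)
print(full)  # WTMT4C4F4AV4EMNMGBZJ5M7KEO

# The short aliases are plain prefixes of the full identifier.
aliases = {f"pith_short_{n}": full[:n] for n in (8, 12, 16)}
print(aliases)
```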
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/WTMT4C4F4AV4EMNMGBZJ5M7KEO \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: b4d93e0b85e02bc231ac30729eb3ea23b1eea10f3756730feba603c40ea6fdb5
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "dd8a4b1bc995781d61149e8a1705ae427fa55a3819a7bcb530e6a8e2812d504c",
    "cross_cats_sorted": [
      "cs.CL",
      "cs.LG"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CV",
    "submitted_at": "2022-01-10T18:59:10Z",
    "title_canon_sha256": "82551c1a4c2bacd13df1784abb5b15249ffe375494248a240a31a918cd10e528"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2201.03546",
    "kind": "arxiv",
    "version": 2
  }
}
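
The Python one-liner in the shell command under "Verify this Pith Number yourself" can be unpacked into a small offline script that applies the same canonicalization (sorted keys, compact separators, UTF-8, no ASCII escaping) before hashing. This is a sketch under one assumption: that the JSON displayed above is exactly the record the service hashes, which is not verified here. The function name `canonical_sha256` is illustrative.

```python
import hashlib
import json


def canonical_sha256(record):
    """SHA-256 of the record serialized with sorted keys and compact separators."""
    payload = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Record copied from the canonical record JSON on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "dd8a4b1bc995781d61149e8a1705ae427fa55a3819a7bcb530e6a8e2812d504c",
        "cross_cats_sorted": ["cs.CL", "cs.LG"],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.CV",
        "submitted_at": "2022-01-10T18:59:10Z",
        "title_canon_sha256": "82551c1a4c2bacd13df1784abb5b15249ffe375494248a240a31a918cd10e528",
    },
    "schema_version": "1.0",
    "source": {"id": "2201.03546", "kind": "arxiv", "version": 2},
}

expected = "b4d93e0b85e02bc231ac30729eb3ea23b1eea10f3756730feba603c40ea6fdb5"
digest = canonical_sha256(record)
print(digest)
print("matches published hash:", digest == expected)
```

Because keys are sorted before hashing, the order in which a mirror stores or transmits the fields does not affect the digest; any change to a value does.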