Pith Number

pith:SJSH7EMD

pith:2026:SJSH7EMD4XIPTJ5USCEJ3R6SPD
not attested · not anchored · not stored · refs resolved

From Local Matches to Global Masks: Template-Guided Instance Detection and Segmentation in Open-World Scenes

Jikai Wang, Qifan Zhang, Sai Haneesh Allu, Yangxiao Lu, Yu Xiang

L2G-Det detects and segments novel object instances in cluttered scenes by matching dense patches from templates to prompt an augmented SAM model.

arxiv:2603.03577 v2 · 2026-03-03 · cs.CV · cs.RO

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{SJSH7EMD4XIPTJ5USCEJ3R6SPD}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
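
The record does not describe the merge algorithm itself. As a minimal sketch of what makes such a merge deterministic, the Python below folds events in a total order that does not depend on arrival order. The event fields (`id`, `ts`, `field`, `value`) and the last-writer-wins rule are hypothetical, chosen only for illustration, and signature checking is omitted.

```python
def merge_events(events):
    """Fold events into a state dict, independent of the order they arrive in.

    Assumes (hypothetically) that each event is a dict with a unique "id",
    an ISO-8601 UTC timestamp "ts", and a "field"/"value" pair to apply.
    """
    # Drop exact re-deliveries of the same event id.
    unique = {event["id"]: event for event in events}
    # Sort by (timestamp, id): the id breaks ties so the order is total.
    ordered = sorted(unique.values(), key=lambda e: (e["ts"], e["id"]))
    state = {}
    for event in ordered:
        # Last writer wins for each field.
        state[event["field"]] = event["value"]
    return state


first = {"id": "e1", "ts": "2026-05-17T00:00:00Z", "field": "citations", "value": "open"}
second = {"id": "e2", "ts": "2026-05-18T00:00:00Z", "field": "citations", "value": "closed"}

# Two mirrors that receive the events in different orders reach the same state.
state_a = merge_events([first, second])
state_b = merge_events([second, first, first])
print(state_a == state_b)
```

Sorting on a key that includes a unique tie-breaker is what lets independent mirrors agree without coordinating.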

Claims

C1 strongest claim

L2G-Det bypasses explicit object proposals by leveraging dense patch-level matching between templates and the query image. Locally matched patches generate candidate points, which are refined through a candidate selection module to suppress false positives. The filtered points are then used to prompt an augmented Segment Anything Model (SAM) with instance-specific object tokens, enabling reliable reconstruction of complete instance masks.
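
The first stage of this claim, turning local patch matches into candidate points, can be sketched in plain Python. This is not the paper's implementation: the feature vectors, the mutual-nearest-neighbour test, the 0.8 threshold, and the name `candidate_points` are all assumptions made for illustration, and the candidate selection module and SAM prompting are not shown.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def candidate_points(template_feats, query_feats, query_centers, threshold=0.8):
    """Return (x, y, score) for query patches matched by a template patch.

    A match is kept only if it is a mutual nearest neighbour and its
    similarity reaches the (assumed) threshold.
    """
    points = []
    for t_idx, t_feat in enumerate(template_feats):
        sims = [cosine(t_feat, q_feat) for q_feat in query_feats]
        q_idx = max(range(len(sims)), key=sims.__getitem__)
        if sims[q_idx] < threshold:
            continue  # weak match, treated as background
        # Mutual check: the query patch must prefer this template patch too.
        back = [cosine(query_feats[q_idx], other) for other in template_feats]
        if max(range(len(back)), key=back.__getitem__) == t_idx:
            x, y = query_centers[q_idx]
            points.append((x, y, sims[q_idx]))
    return points


# Toy data: two template patches, three query patches with pixel centers.
template_feats = [[1.0, 0.0], [0.0, 1.0]]
query_feats = [[0.9, 0.1], [0.1, 0.9], [-1.0, -1.0]]
query_centers = [(8, 8), (24, 8), (40, 8)]

points = candidate_points(template_feats, query_feats, query_centers)
print([(x, y) for x, y, _ in points])
```

In the pipeline the claim describes, points like these would then be filtered and passed as prompts to the segmentation model.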

C2 weakest assumption

That dense patch-level matching will produce sufficiently accurate candidate points in cluttered scenes and that the candidate selection module will reliably suppress false positives without removing true matches, allowing the augmented SAM to reconstruct complete masks.

C3 one-line summary

L2G-Det detects and segments novel object instances in open-world scenes by using local template patch matches to generate points that prompt an augmented SAM for global masks.

References

51 extracted · 51 resolved · 10 Pith anchors

[1] A modular robotic system for autonomous exploration and semantic updating in large-scale indoor environments, 2025
[2] Target driven instance detection. arXiv preprint arXiv:1803.04610, 2018
[3] SURF: Speeded up robust features, 2006
[4] Perception Encoder: The best visual embeddings are not at the output of the network, 2025 · arXiv:2504.13181
[5] Bidirectional attention network for monocular depth estimation, 2021
Receipt and verification
First computed: 2026-05-17T23:39:15.873601Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05)
Schema: pith-number/v1.0

Canonical hash

92647f9183e5d0f9a7b490889dc7d278d0acb87a8e872f6de0e1636e098290a8

Aliases

arxiv: 2603.03577 · arxiv_version: 2603.03577v2 · doi: 10.48550/arxiv.2603.03577 · pith_short_12: SJSH7EMD4XIP · pith_short_16: SJSH7EMD4XIPTJ5U · pith_short_8: SJSH7EMD
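
Comparing the values on this page suggests how the identifiers relate. The 26-character Pith Number matches the first 26 characters of the RFC 4648 Base32 encoding of the canonical hash, and the `pith_short_*` aliases are prefixes of it. The Python sketch below checks that relationship for this record. It is an observation about these particular values, not a documented derivation rule, and the function name is hypothetical.

```python
import base64

# Values copied from this record.
CANONICAL_HASH = "92647f9183e5d0f9a7b490889dc7d278d0acb87a8e872f6de0e1636e098290a8"
PITH_NUMBER = "SJSH7EMD4XIPTJ5USCEJ3R6SPD"


def base32_prefix(hex_digest, length=26):
    """Base32-encode a hex digest and keep the first `length` characters."""
    raw = bytes.fromhex(hex_digest)
    return base64.b32encode(raw).decode("ascii")[:length]


derived = base32_prefix(CANONICAL_HASH)
print(derived)                 # SJSH7EMD4XIPTJ5USCEJ3R6SPD
print(derived == PITH_NUMBER)  # True

# The short aliases are prefixes of the full identifier.
for size in (8, 12, 16):
    print(f"pith_short_{size}: {derived[:size]}")
```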
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/SJSH7EMD4XIPTJ5USCEJ3R6SPD \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 92647f9183e5d0f9a7b490889dc7d278d0acb87a8e872f6de0e1636e098290a8
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "720fa3578ff9b9c4880bf1757dd510bc21630ac8e1f78cddd893220cac73f52a",
    "cross_cats_sorted": [
      "cs.RO"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CV",
    "submitted_at": "2026-03-03T23:11:17Z",
    "title_canon_sha256": "3fbbb7823633c74ab00d462b5bb9d90efe5afece4d28661a1a164f37b92b3cdd"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2603.03577",
    "kind": "arxiv",
    "version": 2
  }
}
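
The shell pipeline above can also be run offline against the JSON shown here. The Python sketch below applies the same serialization settings as the one-liner (sorted keys, compact separators, `ensure_ascii=False`, UTF-8) and compares the digest with the published hash. It assumes the record printed on this page is identical to the `.canonical_record` field returned by the API, so it reports the comparison rather than asserting a match.

```python
import hashlib
import json

# Canonical record as printed on this page.
RECORD = {
    "metadata": {
        "abstract_canon_sha256": "720fa3578ff9b9c4880bf1757dd510bc21630ac8e1f78cddd893220cac73f52a",
        "cross_cats_sorted": ["cs.RO"],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.CV",
        "submitted_at": "2026-03-03T23:11:17Z",
        "title_canon_sha256": "3fbbb7823633c74ab00d462b5bb9d90efe5afece4d28661a1a164f37b92b3cdd",
    },
    "schema_version": "1.0",
    "source": {"id": "2603.03577", "kind": "arxiv", "version": 2},
}

EXPECTED = "92647f9183e5d0f9a7b490889dc7d278d0acb87a8e872f6de0e1636e098290a8"


def canonical_hash(record):
    """SHA-256 of the record serialized with sorted keys and no whitespace."""
    payload = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


digest = canonical_hash(RECORD)
print(digest)
print("matches published hash:", digest == EXPECTED)
```

Sorting keys and fixing the separators removes the two main sources of byte-level variation in JSON output, which is why the digest does not depend on the order in which keys were written.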