Pith Number

pith:7IXCF2WF

pith:2022:7IXCF2WFBBTORRCEUPMRCI32ES
not attested · not anchored · not stored · refs resolved

InCoder: A Generative Model for Code Infilling and Synthesis

Armen Aghajanyan, Daniel Fried, Eric Wallace, Freda Shi, Jessy Lin, Luke Zettlemoyer, Mike Lewis, Ruiqi Zhong, Sida Wang, Wen-tau Yih

InCoder is a single generative model that performs both left-to-right code synthesis and zero-shot infilling of masked regions using bidirectional context.

arxiv:2204.05999 v3 · 2022-04-12 · cs.SE · cs.CL · cs.LG

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{7IXCF2WFBBTORRCEUPMRCI32ES}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 strongest claim

Our model is the first generative model that is able to directly perform zero-shot code infilling, which we evaluate on challenging tasks such as type inference, comment generation, and variable re-naming.

C2 weakest assumption

That randomly masking and appending code regions during training produces a model whose infilling behavior generalizes to realistic editing scenarios without task-specific fine-tuning or data leakage from the test distributions.

C3 one-line summary

InCoder is the first generative model to directly perform zero-shot code infilling via bidirectional context from a masked-then-appended training scheme, matching left-to-right models on synthesis while improving on type inference, comment generation, and variable renaming.
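
The masked-then-appended training scheme named in the claims can be illustrated with a minimal sketch. It assumes a single masked span selected by character offsets; the sentinel strings `<Mask:0>` and `<EOM>` and the function names are illustrative choices for this example, not values taken from this record.

```python
MASK = "<Mask:0>"  # illustrative sentinel marking where the span was removed
EOM = "<EOM>"      # illustrative end-of-mask marker closing the appended span


def mask_and_append(doc: str, start: int, end: int) -> str:
    """Move doc[start:end] to the end of the document, leaving a sentinel behind."""
    left, span, right = doc[:start], doc[start:end], doc[end:]
    return f"{left}{MASK}{right}{MASK}{span}{EOM}"


def infill_prompt(left: str, right: str) -> str:
    """Build an inference prompt: a left-to-right model continues after the
    trailing sentinel, so it sees both left and right context before the span."""
    return f"{left}{MASK}{right}{MASK}"


def restore(transformed: str) -> str:
    """Invert mask_and_append (assumes the text itself contains no sentinels)."""
    body, _, tail = transformed.rpartition(MASK)
    span = tail[: -len(EOM)]
    return body.replace(MASK, span, 1)


doc = "def add(a, b):\n    return a + b\n"
start = doc.index("a + b")
end = start + len("a + b")

transformed = mask_and_append(doc, start, end)
print(transformed)
print(restore(transformed) == doc)
```

Because the span is moved to the end, a model trained left to right on such sequences conditions on the text on both sides of the gap when it generates the span, which is what allows infilling without a separate bidirectional architecture.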

References

38 extracted · 38 resolved · 13 Pith anchors

[1] CM3: A causal masked multimodal model of the internet
[2] V ., Du, J., Iyer, S., Pasunuru, R., et al
[3] Program Synthesis with Large Language Models · arXiv:2108.07732
[4] Efficient training of language models to fill in the middle
[5] AutoPandas: neural-backed generators for program synthesis 2023

Formal links

2 machine-checked theorem links

Cited by

25 papers in Pith

Receipt and verification
First computed 2026-05-17T23:38:49.395076Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

fa2e22eac50866e8c444a3d911237a24b1ee6dbc5789e025bddbf28d98cc43ad

Aliases

arxiv: 2204.05999 · arxiv_version: 2204.05999v3 · doi: 10.48550/arxiv.2204.05999 · pith_short_12: 7IXCF2WFBBTO · pith_short_16: 7IXCF2WFBBTORRCE · pith_short_8: 7IXCF2WF
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/7IXCF2WFBBTORRCEUPMRCI32ES \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: fa2e22eac50866e8c444a3d911237a24b1ee6dbc5789e025bddbf28d98cc43ad
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "2cc00f957c4140489f550860c56efe9b4b81d0daf5555f1e30c50f11d894d679",
    "cross_cats_sorted": [
      "cs.CL",
      "cs.LG"
    ],
    "license": "http://creativecommons.org/licenses/by-nc-sa/4.0/",
    "primary_cat": "cs.SE",
    "submitted_at": "2022-04-12T16:25:26Z",
    "title_canon_sha256": "e0ee2b0588f2b16181eaf033db6715dd9f9f025488c5b7fc3381446517c0c296"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2204.05999",
    "kind": "arxiv",
    "version": 3
  }
}
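
The same check can be run offline against the canonical record JSON shown above. The following is a minimal sketch that mirrors the serialization used in the verify command (sorted keys, compact separators, `ensure_ascii=False`, UTF-8 encoding); the function name `canonical_hash` is a hypothetical helper, and the sketch assumes the record printed on this page is byte-for-byte what the service hashes.

```python
import hashlib
import json


def canonical_hash(record: dict) -> str:
    """SHA-256 over key-sorted, compact JSON, as in the verify command."""
    payload = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Canonical record copied from this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "2cc00f957c4140489f550860c56efe9b4b81d0daf5555f1e30c50f11d894d679",
        "cross_cats_sorted": ["cs.CL", "cs.LG"],
        "license": "http://creativecommons.org/licenses/by-nc-sa/4.0/",
        "primary_cat": "cs.SE",
        "submitted_at": "2022-04-12T16:25:26Z",
        "title_canon_sha256": "e0ee2b0588f2b16181eaf033db6715dd9f9f025488c5b7fc3381446517c0c296",
    },
    "schema_version": "1.0",
    "source": {"id": "2204.05999", "kind": "arxiv", "version": 3},
}

# Hash listed under "Canonical hash" on this page.
EXPECTED = "fa2e22eac50866e8c444a3d911237a24b1ee6dbc5789e025bddbf28d98cc43ad"

digest = canonical_hash(record)
print(digest)
print("matches listed hash:", digest == EXPECTED)
```

Sorting keys before hashing makes the digest independent of the key order in which the record was stored or transmitted, so a mirror that holds the same fields should compute the same value.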