{"record_type":"pith_number_record","schema_url":"https://pith.science/schemas/pith-number/v1.json","pith_number":"pith:2026:UL5I747MG5WKVSCFJTWH5AMHKL","short_pith_number":"pith:UL5I747M","schema_version":"1.0","canonical_sha256":"a2fa8ff3ec376caac8454cec7e818752f999cc78786ba053455d9c017312e7f0","source":{"kind":"arxiv","id":"2605.00768","version":2},"attestation_state":"computed","paper":{"title":"Characterizing the Expressivity of Local Attention in Transformers","license":"http://creativecommons.org/licenses/by/4.0/","headline":"Local attention introduces a second temporal operator in transformers, strictly enlarging the class of recognizable regular languages beyond what global attention alone achieves.","cross_cats":[],"primary_cat":"cs.CL","authors_text":"Jiaoda Li, Ryan Cotterell","submitted_at":"2026-05-01T16:30:52Z","abstract_excerpt":"The transformer is the most popular neural architecture for language modeling. The cornerstone of the transformer is its global attention mechanism, which lets the model aggregate information from all preceding tokens before generating the next token. One common variant of attention is called local attention, which restricts each token to aggregating information from a bounded window of predecessors, reducing the quadratic cost of global attention to linear. Although this restriction is usually motivated by efficiency, it has also been found to improve model quality, a phenomenon that has so f"},"verification_status":{"content_addressed":true,"pith_receipt":true,"author_attested":false,"weak_author_claims":0,"strong_author_claims":0,"externally_anchored":false,"storage_verified":false,"citation_signatures":0,"replication_records":0,"graph_snapshot":true,"references_resolved":false,"formal_links_present":false},"canonical_record":{"source":{"id":"2605.00768","kind":"arxiv","version":2},"metadata":{"license":"http://creativecommons.org/licenses/by/4.0/","primary_cat":"cs.CL","submitted_at":"2026-05-01T16:30:52Z","cross_cats_sorted":[],"title_canon_sha256":"fd25aa791c18a9d06471ccd3da2c9cb720ff875a1df8374eead3b427a5497ee1","abstract_canon_sha256":"4fc5a9bb3390f86196108506be9e6053a1f29b72ee0124d47334fe10e4e9de8d"},"schema_version":"1.0"},"receipt":{"kind":"pith_receipt","key_id":"pith-v1-2026-05","algorithm":"ed25519","signed_at":"2026-05-20T01:05:15.037363Z","signature_b64":"MeTI1SHcPka1mWzqGRwRkCv1A+F2jzBEo530YZkEoMEyL1MVzymWaqV+Z9qDrUlR0j8aR9OXQPQ1CkG1BrfLDw==","signed_message":"canonical_sha256_bytes","builder_version":"pith-number-builder-2026-05-17-v1","receipt_version":"0.3","canonical_sha256":"a2fa8ff3ec376caac8454cec7e818752f999cc78786ba053455d9c017312e7f0","last_reissued_at":"2026-05-20T01:05:15.036764Z","signature_status":"signed_v1","first_computed_at":"2026-05-20T01:05:15.036764Z","public_key_fingerprint":"8d4b5ee74e4693bcd1df2446408b0d54"},"graph_snapshot":{"paper":{"title":"Characterizing the Expressivity of Local Attention in Transformers","license":"http://creativecommons.org/licenses/by/4.0/","headline":"Local attention introduces a second temporal operator in transformers, strictly enlarging the class of recognizable regular languages beyond what global attention alone achieves.","cross_cats":[],"primary_cat":"cs.CL","authors_text":"Jiaoda Li, Ryan Cotterell","submitted_at":"2026-05-01T16:30:52Z","abstract_excerpt":"The transformer is the most popular neural architecture for language modeling. The cornerstone of the transformer is its global attention mechanism, which lets the model aggregate information from all preceding tokens before generating the next token. One common variant of attention is called local attention, which restricts each token to aggregating information from a bounded window of predecessors, reducing the quadratic cost of global attention to linear. Although this restriction is usually motivated by efficiency, it has also been found to improve model quality, a phenomenon that has so f"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"We additionally prove that adding local attention introduces a second temporal operator, strictly enlarging the class of recognizable regular languages. Moreover, global and local attention are expressively complementary: neither subsumes the other, and combining them yields the richest fragment.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"The established correspondence between fixed-precision global-attention transformers and a single-past-operator fragment of linear temporal logic extends cleanly to local attention without additional restrictions that would limit the result to toy models rather than practical transformers.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"Local attention strictly enlarges the class of regular languages recognizable by fixed-precision transformers by adding a second past operator in linear temporal logic, with global and local attention being expressively complementary.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"Local attention introduces a second temporal operator in transformers, strictly enlarging the class of recognizable regular languages beyond what global attention alone achieves.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"e4bc3a27c366a6dbd33826b8ac8f26ca13d06d4a635a9445fb6986b4621b0598"},"source":{"id":"2605.00768","kind":"arxiv","version":2},"verdict":{"id":"8faf7e6d-8861-4ff6-a5fe-98d70ab23ebe","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-09T18:57:48.171664Z","strongest_claim":"We additionally prove that adding local attention introduces a second temporal operator, strictly enlarging the class of recognizable regular languages. Moreover, global and local attention are expressively complementary: neither subsumes the other, and combining them yields the richest fragment.","one_line_summary":"Local attention strictly enlarges the class of regular languages recognizable by fixed-precision transformers by adding a second past operator in linear temporal logic, with global and local attention being expressively complementary.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"The established correspondence between fixed-precision global-attention transformers and a single-past-operator fragment of linear temporal logic extends cleanly to local attention without additional restrictions that would limit the result to toy models rather than practical transformers.","pith_extraction_headline":"Local attention introduces a second temporal operator in transformers, strictly enlarging the class of recognizable regular languages beyond what global attention alone achieves."},"integrity":{"clean":false,"summary":{"advisory":1,"critical":0,"by_detector":{"doi_compliance":{"total":1,"advisory":1,"critical":0,"informational":0}},"informational":0},"endpoint":"/pith/2605.00768/integrity.json","findings":[{"note":"DOI in the printed bibliography is fragmented by whitespace or line breaks. A longer candidate (10.5555/3173440.3173443Syntactic) was visible in the surrounding text but could not be confirmed against doi.org as printed.","detector":"doi_compliance","severity":"advisory","ref_index":5,"audited_at":"2026-05-19T17:51:05.338050Z","detected_doi":"10.5555/3173440.3173443Syntactic","finding_type":"recoverable_identifier","verdict_class":"incontrovertible","detected_arxiv_id":null}],"available":true,"detectors_run":[{"name":"doi_compliance","ran_at":"2026-05-19T17:51:05.338050Z","status":"completed","version":"1.0.0","findings_count":1}],"snapshot_sha256":"a44ce95e3375d29e79f6d6e6e91a0f1f8fe0c2e68007dd6591ab40944f9a9cbc"},"references":{"count":0,"sample":[],"resolved_work":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57","internal_anchors":0},"formal_canon":{"evidence_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"},"aliases":[{"alias_kind":"arxiv","alias_value":"2605.00768","created_at":"2026-05-20T01:05:15.036863+00:00"},{"alias_kind":"arxiv_version","alias_value":"2605.00768v2","created_at":"2026-05-20T01:05:15.036863+00:00"},{"alias_kind":"doi","alias_value":"10.48550/arxiv.2605.00768","created_at":"2026-05-20T01:05:15.036863+00:00"},{"alias_kind":"pith_short_12","alias_value":"UL5I747MG5WK","created_at":"2026-05-20T01:05:15.036863+00:00"},{"alias_kind":"pith_short_16","alias_value":"UL5I747MG5WKVSCF","created_at":"2026-05-20T01:05:15.036863+00:00"},{"alias_kind":"pith_short_8","alias_value":"UL5I747M","created_at":"2026-05-20T01:05:15.036863+00:00"}],"events":[],"event_summary":{},"paper_claims":[],"inbound_citations":{"count":0,"internal_anchor_count":0,"sample":[]},"formal_canon":{"evidence_count":0,"sample":[],"anchors":[]},"links":{"html":"https://pith.science/pith/UL5I747MG5WKVSCFJTWH5AMHKL","json":"https://pith.science/pith/UL5I747MG5WKVSCFJTWH5AMHKL.json","graph_json":"https://pith.science/api/pith-number/UL5I747MG5WKVSCFJTWH5AMHKL/graph.json","events_json":"https://pith.science/api/pith-number/UL5I747MG5WKVSCFJTWH5AMHKL/events.json","paper":"https://pith.science/paper/UL5I747M"},"agent_actions":{"view_html":"https://pith.science/pith/UL5I747MG5WKVSCFJTWH5AMHKL","download_json":"https://pith.science/pith/UL5I747MG5WKVSCFJTWH5AMHKL.json","view_paper":"https://pith.science/paper/UL5I747M","resolve_alias":"https://pith.science/api/pith-number/resolve?arxiv=2605.00768&json=true","fetch_graph":"https://pith.science/api/pith-number/UL5I747MG5WKVSCFJTWH5AMHKL/graph.json","fetch_events":"https://pith.science/api/pith-number/UL5I747MG5WKVSCFJTWH5AMHKL/events.json","actions":{"anchor_timestamp":"https://pith.science/pith/UL5I747MG5WKVSCFJTWH5AMHKL/action/timestamp_anchor","attest_storage":"https://pith.science/pith/UL5I747MG5WKVSCFJTWH5AMHKL/action/storage_attestation","attest_author":"https://pith.science/pith/UL5I747MG5WKVSCFJTWH5AMHKL/action/author_attestation","sign_citation":"https://pith.science/pith/UL5I747MG5WKVSCFJTWH5AMHKL/action/citation_signature","submit_replication":"https://pith.science/pith/UL5I747MG5WKVSCFJTWH5AMHKL/action/replication_record"}},"created_at":"2026-05-20T01:05:15.036863+00:00","updated_at":"2026-05-20T01:05:15.036863+00:00"}