Pith Number

pith:VMGB2BVN

pith:2026:VMGB2BVNJEHQP6PM7ESFQBTHGQ
not attested · not anchored · not stored · refs resolved

Language, Place, and Social Media: Geographic Dialect Alignment in New Zealand

Sidney Wong

New Zealand Reddit users link language to place and form contiguous speech communities with complex geographic alignment; Word2Vec embeddings reveal semantic variations and shifts in NZ English on a 4.26 billion word corpus.

arxiv:2604.15744 v1 · 2026-04-17 · cs.CL

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{VMGB2BVNJEHQP6PM7ESFQBTHGQ}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
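The page does not specify the merge algorithm, so the following is only a minimal sketch of one common way to make a merge deterministic: content-address each event, drop duplicates, sort by a total order, and fold the events into a state. The event fields (`at`, `field`, `value`), the sample events, and the function names are all hypothetical, and signature checking is omitted.

```python
import hashlib
import json


def event_id(event: dict) -> str:
    """Content-address an event so duplicate copies collapse to one entry."""
    blob = json.dumps(event, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


def merge(events: list[dict]) -> dict:
    """Fold events into a state that does not depend on input order."""
    unique = {event_id(e): e for e in events}
    # Total order: timestamp first, then the content hash as a tie-breaker.
    ordered = sorted(unique.items(), key=lambda kv: (kv[1]["at"], kv[0]))
    state: dict = {}
    for _, event in ordered:
        state[event["field"]] = event["value"]  # last writer wins
    return state


# Hypothetical events, for illustration only.
events = [
    {"at": "2026-06-03T01:05:13Z", "field": "author_claim", "value": "open"},
    {"at": "2026-06-04T00:00:00Z", "field": "author_claim", "value": "claimed"},
    {"at": "2026-06-03T02:00:00Z", "field": "citations", "value": "open"},
]

# Reordering and duplicating the input leaves the merged state unchanged.
assert merge(events) == merge(list(reversed(events)) + events)
print(merge(events))
```

Under these assumptions, any two mirrors holding the same set of events arrive at the same state regardless of the order in which they received them.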

Claims

C1 strongest claim

Users generally associate language with place, and place-related communities form a contiguous speech community, though alignment between geographic dialect communities and place-related communities remains complex. Advanced language modelling, including static and diachronic Word2Vec language embeddings, revealed semantic variation across place-based communities and meaningful semantic shifts within New Zealand English.

C2 weakest assumption

That Reddit communities tied to places accurately represent geographic dialect communities and that user perceptions of language-place links correspond to measurable patterns in actual language use.

C3 one line summary

New Zealand Reddit users link language to place and form contiguous speech communities with complex geographic alignment; Word2Vec embeddings reveal semantic variations and shifts in NZ English on a 4.26 billion word corpus.

References

293 extracted · 293 resolved · 1 Pith anchors

Receipt and verification
First computed 2026-06-03T01:05:13.757647Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

ab0c1d06ad490f07f9ecf924580667340efe9aff9dd70bf52134ffdcd27f6162

Aliases

arxiv: 2604.15744 · arxiv_version: 2604.15744v1 · doi: 10.48550/arxiv.2604.15744 · pith_short_12: VMGB2BVNJEHQ · pith_short_16: VMGB2BVNJEHQP6PM · pith_short_8: VMGB2BVN
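For this record, the full Pith Number matches the unpadded Base32 encoding of the first 16 bytes of the canonical hash, and the `pith_short_*` aliases are its 8-, 12-, and 16-character prefixes. The sketch below checks that relationship for the values on this page. The page does not state the derivation, so treat it as an observation about this record rather than the specification; the function name `pith_number_from_hash` is hypothetical.

```python
import base64

CANONICAL_HASH = "ab0c1d06ad490f07f9ecf924580667340efe9aff9dd70bf52134ffdcd27f6162"


def pith_number_from_hash(hex_digest: str) -> str:
    # Assumption: first 16 bytes of the SHA-256 digest, RFC 4648 Base32,
    # with the trailing "=" padding stripped.
    raw = bytes.fromhex(hex_digest)[:16]
    return base64.b32encode(raw).decode("ascii").rstrip("=")


pith = pith_number_from_hash(CANONICAL_HASH)
aliases = {f"pith_short_{n}": pith[:n] for n in (8, 12, 16)}

print(pith)     # VMGB2BVNJEHQP6PM7ESFQBTHGQ
print(aliases)
```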
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/VMGB2BVNJEHQP6PM7ESFQBTHGQ \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: ab0c1d06ad490f07f9ecf924580667340efe9aff9dd70bf52134ffdcd27f6162
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "635fa21eb07200169cfcbca918cb952a64f615ae959c612e6b958ab193bad5da",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2026-04-17T06:37:25Z",
    "title_canon_sha256": "6896165c79aec3b6bf3217edc533ec69e0acbae99cab7142eeb24adc0af1141f"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2604.15744",
    "kind": "arxiv",
    "version": 1
  }
}
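The same check can be run offline against the JSON shown above. This sketch reuses the canonicalization from the `curl` pipeline (sorted keys, compact separators, UTF-8, no ASCII escaping) and compares the digest with the published canonical hash. It assumes the record displayed here is identical to the `canonical_record` the API serves; if the two differ in any field, the comparison will print `False`.

```python
import hashlib
import json

EXPECTED = "ab0c1d06ad490f07f9ecf924580667340efe9aff9dd70bf52134ffdcd27f6162"

# Canonical record copied from this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "635fa21eb07200169cfcbca918cb952a64f615ae959c612e6b958ab193bad5da",
        "cross_cats_sorted": [],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.CL",
        "submitted_at": "2026-04-17T06:37:25Z",
        "title_canon_sha256": "6896165c79aec3b6bf3217edc533ec69e0acbae99cab7142eeb24adc0af1141f",
    },
    "schema_version": "1.0",
    "source": {"id": "2604.15744", "kind": "arxiv", "version": 1},
}


def canonical_bytes(obj) -> bytes:
    """Serialize with sorted keys and no insignificant whitespace."""
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


digest = hashlib.sha256(canonical_bytes(record)).hexdigest()
print(digest)
print("matches published hash:", digest == EXPECTED)
```

Sorting keys and fixing the separators matters because SHA-256 is computed over bytes: two JSON documents with the same content but different key order or whitespace would otherwise hash differently.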