Pith Number

pith:XUBDE4GE

pith:2026:XUBDE4GE24JW6HP2CIHVG4SLF6
not attested · not anchored · not stored · refs pending

CommonLID: Re-evaluating State-of-the-Art Language Identification Performance on Web Data

Abdulhamid Abubakar, Ahmad Mustafid, Ahmad Mustapha Wali, Akshata A, Amit Agarwal, Amr Keleg, Andrew Yates, Atnafu Lambebo Tonja, Azmine Toushik Wasi, Azril Hafizi Amirudin, Benjamin Rice, Benoît Sagot, Bruhan Kyomuhendo, Carol Muchemi, Casper Muziri, Catherine Arnett, Chalamalasetti Kranti, Cristina Aggazzotti, Cynthia Amol, Damian Stewart, Daniel Ruffinelli, David Anugraha, Dmitry Gaynullin, Ej Zhou, Esther Adenuga, Faisal Muhammad Adam, Fenal Ashokbhai Ilasariya, Filbert Aurelian Tjiaranata, Genta Indra Winata, Gouthami Vadithya, Hamada Nayel, Hande Celikkanat, Hend Al-Khalifa, Hitesh Laxmichand Patel, Idris Abdulmumin, Ikhlasul Akmal Hanif, Ilker Kesen, Ingrid Gabriela Franco Ramirez, Inshirah Idris, Jakhongir Saydaliev, Jean Maillard, Jesujoba O. Alabi, Joseph Marvin Imperial, Juan Pablo Martínez, Jun Kevin, Kamohelo Makaaka, Karan Dua, Kenton Murray, Khang Nguyen, Konstantin Dobler, Kun Kerdthaisong, Lanwenn ar C'horr, Laurie Burchell, Leshem Choshen, Luca Foppiano, Luis Frentzen Salim, Malte Ostendorff, Manuel Goulão, Mattes Ruckdeschel, Melika Nobakhtian, Michael Anugraha, Mike Zhang, Mithil Bangera, Muhammad Ravi Shulthan Habibi, My Chiffon Nguyen, Nadia Ghezaiel Hammouda, Nicholas Andrews, Nuhu Ibrahim, Pavel Stepachev, Pedro Ortiz Suarez, Quentin Pagès, Rafael Mosquera-Gómez, Raia Abu Ahmad, Rasul Dent, Reem Alqifari, Rob van der Goot, Sara Hincapie-Monsalve, Sarah Luger, Saron Samuel, Seid Muhie Yimam, Shamsuddeen Hassan Muhammad, Shu Okabe, Sotaro Takeshita, Sowmya Vajjala, Srikant Panda, Tack Hwa Wong, Thibault Clérice, Thom Vaughan, Tommaso Green, Vallerie Alexandra Putra, Verrah Otiende, Vicky Feliren, Vukosi Marivate, Weerayut Buaphet, Yassine Toughrai, Yeshil Bangera, Yiyuan Li

arxiv:2601.18026 v2 · 2026-01-25 · cs.CL

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{XUBDE4GE24JW6HP2CIHVG4SLF6}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Cited by

1 paper in Pith

Receipt and verification
First computed 2026-06-10T01:08:32.665071Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

bd023270c4d7136f1dfa120f53724b2fbd52aec09ca6968cde3343443911d0bc

Aliases

arxiv: 2601.18026 · arxiv_version: 2601.18026v2 · doi: 10.48550/arxiv.2601.18026 · pith_short_12: XUBDE4GE24JW · pith_short_16: XUBDE4GE24JW6HP2 · pith_short_8: XUBDE4GE
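
The identifiers on this page appear to be related in a simple way: for this record, Base32-encoding the canonical hash (RFC 4648 alphabet) and keeping the first 26 characters reproduces the Pith Number, and the `pith_short_*` aliases are prefixes of it. The sketch below checks that relationship for this record only. The function names are hypothetical, and the page does not state that this is the official derivation, so treat it as an observation rather than a specification.

```python
import base64

# Canonical hash and aliases copied from this page.
CANONICAL_HASH = "bd023270c4d7136f1dfa120f53724b2fbd52aec09ca6968cde3343443911d0bc"


def pith_number_from_hash(hex_digest: str, length: int = 26) -> str:
    """Base32-encode the raw digest bytes and keep the first `length` characters.

    Assumption: this mirrors how the identifier was built; it is only
    confirmed here by comparison with the values shown on the page.
    """
    raw = bytes.fromhex(hex_digest)
    return base64.b32encode(raw).decode("ascii")[:length]


def short_alias(pith_number: str, n: int) -> str:
    """Return the n-character prefix used as a short alias (hypothetical helper)."""
    return pith_number[:n]


full = pith_number_from_hash(CANONICAL_HASH)
print(full)                   # XUBDE4GE24JW6HP2CIHVG4SLF6
print(short_alias(full, 8))   # XUBDE4GE
print(short_alias(full, 12))  # XUBDE4GE24JW
print(short_alias(full, 16))  # XUBDE4GE24JW6HP2
```

Because the short aliases are prefixes, a shorter alias is more likely to collide with another record than a longer one; the full 26-character form carries 130 bits of the hash.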
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/XUBDE4GE24JW6HP2CIHVG4SLF6 \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: bd023270c4d7136f1dfa120f53724b2fbd52aec09ca6968cde3343443911d0bc
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "d4c794bf3be49ace26201155d50681e5bc6872f2a14c9dca3141f780511000e4",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2026-01-25T22:49:30Z",
    "title_canon_sha256": "5960ec8a45ccff55ac210147e8b7dfc768cd6ec615d3b979461dd0d56e3a118f"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2601.18026",
    "kind": "arxiv",
    "version": 2
  }
}
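
The verification pipeline above can also be run offline against the record as printed here. The following is a minimal sketch that applies the same canonicalization as the one-liner (sorted keys, compact separators, `ensure_ascii=False`, UTF-8) and hashes the result. `canonical_sha256` is a hypothetical helper name, and the comparison only succeeds if the record shown on this page is exactly what the service returns under `canonical_record`.

```python
import hashlib
import json

# Canonical record as printed on this page.
RECORD_JSON = """
{
  "metadata": {
    "abstract_canon_sha256": "d4c794bf3be49ace26201155d50681e5bc6872f2a14c9dca3141f780511000e4",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2026-01-25T22:49:30Z",
    "title_canon_sha256": "5960ec8a45ccff55ac210147e8b7dfc768cd6ec615d3b979461dd0d56e3a118f"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2601.18026",
    "kind": "arxiv",
    "version": 2
  }
}
"""

# Hash published on this page, used only for comparison.
PUBLISHED_HASH = "bd023270c4d7136f1dfa120f53724b2fbd52aec09ca6968cde3343443911d0bc"


def canonical_sha256(record: dict) -> str:
    """Serialize with sorted keys and no whitespace, then SHA-256 the UTF-8 bytes."""
    payload = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


record = json.loads(RECORD_JSON)
digest = canonical_sha256(record)
print(digest)
print("matches published hash:", digest == PUBLISHED_HASH)
```

Sorting keys and fixing the separators means the digest does not depend on key order or whitespace in the input, which is what lets a mirror recompute the same value from its own copy of the record.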