pith:B6N5RPC6
WorkArena: How Capable Are Web Agents at Solving Common Knowledge Work Tasks?
Web agents based on large language models show some success on enterprise tasks but leave a large gap to full automation
arxiv:2403.07718 v5 · 2024-03-12 · cs.LG · cs.AI
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{B6N5RPC67O33FJY2I5FPMYLZZO}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
While current agents show promise on WorkArena, there remains a considerable gap towards achieving full task automation. Notably, our analysis uncovers a significant performance disparity between open and closed-source LLMs.
The 33 tasks chosen for WorkArena are representative of the typical daily work of knowledge workers utilizing enterprise software systems.
The WorkArena benchmark shows that LLM web agents achieve partial success on enterprise tasks but remain far from full automation, and that agents built on open-source models perform worse than those built on closed-source models.
Receipt and verification
| First computed | 2026-05-17T23:38:53.769379Z |
|---|---|
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
0f9bd8bc5efbb7b2a71a474af66179cb8b53111d2184deeaedb5e532799e08ad
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/B6N5RPC67O33FJY2I5FPMYLZZO \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 0f9bd8bc5efbb7b2a71a474af66179cb8b53111d2184deeaedb5e532799e08ad
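The same check can be written as a small script, which may be easier to audit than the one-liner. This is a minimal sketch that mirrors the serialization settings used in the command above (sorted keys, compact separators, non-ASCII characters kept as UTF-8). The function names `canonical_bytes`, `canonical_hash`, and `verify` are illustrative and are not part of any Pith tooling; the record is assumed to be already fetched and parsed into a dictionary.

```python
import hashlib
import json


def canonical_bytes(record: dict) -> bytes:
    """Serialize with sorted keys, compact separators, and raw UTF-8 text."""
    text = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return text.encode("utf-8")


def canonical_hash(record: dict) -> str:
    """Return the hex SHA-256 digest of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


def verify(record: dict, expected_hex: str) -> bool:
    """Compare a recomputed digest with a published one, ignoring case."""
    return canonical_hash(record) == expected_hex.strip().lower()
```

Because keys are sorted before hashing, the digest does not depend on the order in which a JSON parser happens to return them. Passing the parsed `canonical_record` and the published canonical hash to `verify` should return `True` if the record is intact.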
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "241cc0cc95b853603ea2fb29976c470fc5f752468f33e0ea0bfdf7a31e2cb398",
    "cross_cats_sorted": [
      "cs.AI"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2024-03-12T14:58:45Z",
    "title_canon_sha256": "6ac53eeabc9ba4a7957514da4595c3bd216575a61e7de3fd99f2fd3b9d5a0af2"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2403.07718",
    "kind": "arxiv",
    "version": 5
  }
}
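Before hashing a fetched record, it can help to confirm that it has the shape shown above. The sketch below checks only what is visible in this one record: two 64-character hex digests, a UTC timestamp, a sorted category list, and an integer source version. It is not derived from the pith-number/v1.0 schema itself, and `check_record` is a hypothetical helper name.

```python
import re
from datetime import datetime

# Lowercase hex SHA-256 digest, as seen in the record's *_sha256 fields.
HEX64 = re.compile(r"[0-9a-f]{64}")


def check_record(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the shape looks sane."""
    problems = []
    meta = record.get("metadata", {})
    source = record.get("source", {})

    for key in ("abstract_canon_sha256", "title_canon_sha256"):
        if not HEX64.fullmatch(str(meta.get(key, ""))):
            problems.append(f"{key} is not a 64-character lowercase hex digest")

    try:
        datetime.strptime(meta.get("submitted_at", ""), "%Y-%m-%dT%H:%M:%SZ")
    except (TypeError, ValueError):
        problems.append("submitted_at is not a timestamp like 2024-03-12T14:58:45Z")

    cats = meta.get("cross_cats_sorted")
    if not isinstance(cats, list) or cats != sorted(cats):
        problems.append("cross_cats_sorted is missing or not sorted")

    if not isinstance(source.get("version"), int):
        problems.append("source.version is not an integer")

    return problems
```

Returning a list of messages rather than raising on the first failure makes it possible to report every issue in one pass.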