Pith Number

pith:WOKFU2YA

pith:2026:WOKFU2YA4W5244ULW5ADOJXD6V
not attested · not anchored · not stored · refs pending

Chat2Workflow: A Benchmark for Generating Executable Visual Workflows with Natural Language

Buqiang Xu, Guozhou Zheng, Ningyu Zhang, Shuofei Qiao, Yijun Wang, Yi Zhong, Zifei Shan

Large language models capture high-level intent but struggle to produce correct, stable, executable visual workflows from natural language.

arxiv:2604.19667 v2 · 2026-04-21 · cs.CL · cs.AI · cs.CV · cs.LG · cs.MA

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{WOKFU2YA4W5244ULW5ADOJXD6V}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1. Bitcoin timestamp
2. Internet Archive
3. Author claim: open
4. Citations: open
5. Replications: open
Portable graph bundle: live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
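The page does not specify the merge algorithm, so the following is only a minimal sketch of why a deterministic merge lets any mirror arrive at the same state. It assumes a hypothetical event shape (`id`, `ts`, `field`, `value`) and a last-writer-wins rule; the real bundle format, event fields, and merge rule may differ, and the sample events are made up.

```python
def merge_state(events):
    """Fold events into a state dict in a fixed total order.

    Assumption: each event is a dict with hypothetical keys
    "id", "ts" (ISO 8601 string), "field", and "value".
    Sorting by (ts, id) makes the result independent of the
    order in which a mirror happened to receive the events.
    """
    ordered = sorted(events, key=lambda e: (e["ts"], e["id"]))
    state = {}
    for event in ordered:
        # Last writer wins: a later event overwrites an earlier one.
        state[event["field"]] = event["value"]
    return state


# Made-up events for illustration only.
sample_events = [
    {"id": "e2", "ts": "2026-06-01T00:00:00Z", "field": "author_claim", "value": "claimed"},
    {"id": "e1", "ts": "2026-05-27T02:05:20Z", "field": "author_claim", "value": "open"},
    {"id": "e3", "ts": "2026-06-02T00:00:00Z", "field": "citations", "value": "open"},
]

state_a = merge_state(sample_events)
state_b = merge_state(list(reversed(sample_events)))
print(state_a == state_b)
```

Because the fold always runs over the same sorted sequence, two mirrors holding the same set of events compute identical states regardless of arrival order.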

Claims

C1 · strongest claim

while state-of-the-art language models can often capture high-level intent, they struggle to generate correct, stable, and executable workflows, especially under complex or changing requirements. Although our agentic framework yields up to 5.34% resolve rate gains, the remaining real-world gap positions Chat2Workflow as a foundation for advancing industrial-grade automation.

C2 · weakest assumption

That the collected real-world business workflows are representative of practical industrial needs, and that generated workflows can be transformed and directly deployed to platforms such as Dify and Coze without loss of intended functionality.

C3 · one line summary

The Chat2Workflow benchmark shows that state-of-the-art LLMs often grasp high-level intent for visual workflow generation but fail to produce correct, stable, executable outputs, with an agentic framework delivering only modest resolve rate gains of up to 5.34%.

Receipt and verification
First computed: 2026-05-27T02:05:20.012379Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05)
Schema: pith-number/v1.0

Canonical hash

b3945a6b00e5bbae728bb7403726e3f555c4fad04f353c5f4059668581b9b447

Aliases

arxiv: 2604.19667 · arxiv_version: 2604.19667v2 · doi: 10.48550/arxiv.2604.19667 · pith_short_12: WOKFU2YA4W52 · pith_short_16: WOKFU2YA4W5244UL · pith_short_8: WOKFU2YA
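For this record, the Pith Number and its short aliases coincide with prefixes of the RFC 4648 base32 encoding of the canonical hash. The sketch below reproduces that relationship with Python's standard library. Whether this is the builder's actual derivation rule for every record is an assumption; the function name `pith_number_from_hash` and the default length of 26 are illustrative.

```python
import base64

CANONICAL_HASH = "b3945a6b00e5bbae728bb7403726e3f555c4fad04f353c5f4059668581b9b447"


def pith_number_from_hash(canonical_hash_hex: str, length: int = 26) -> str:
    """Return the first `length` base32 characters of the SHA-256 digest.

    Assumption: identifiers are plain prefixes of the uppercase
    RFC 4648 base32 encoding, as observed for this one record.
    """
    digest = bytes.fromhex(canonical_hash_hex)
    encoded = base64.b32encode(digest).decode("ascii")
    return encoded[:length]


full = pith_number_from_hash(CANONICAL_HASH)
print(full)        # WOKFU2YA4W5244ULW5ADOJXD6V
print(full[:8])    # WOKFU2YA         (pith_short_8)
print(full[:12])   # WOKFU2YA4W52     (pith_short_12)
print(full[:16])   # WOKFU2YA4W5244UL (pith_short_16)
```

Each base32 character carries 5 bits, so the 26-character form covers the first 130 bits of the 256-bit hash and the 8-character alias covers the first 40.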
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/WOKFU2YA4W5244ULW5ADOJXD6V \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: b3945a6b00e5bbae728bb7403726e3f555c4fad04f353c5f4059668581b9b447
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "d5ae84bb94af6bfa0d1f21ed38ae070ccc8903e311024ad5e1b88c9ec0d91bbd",
    "cross_cats_sorted": [
      "cs.AI",
      "cs.CV",
      "cs.LG",
      "cs.MA"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2026-04-21T16:49:11Z",
    "title_canon_sha256": "41da58f1e93571f91bcb2071e591c669ddaa6f1ff535e36b5a25c91ce5606168"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2604.19667",
    "kind": "arxiv",
    "version": 2
  }
}
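The verification one-liner above can also be run offline against the record as displayed. The sketch below applies the same canonicalization (sorted keys, compact separators, `ensure_ascii=False`, UTF-8 bytes) and hashes the result with SHA-256. It assumes the JSON shown on this page is exactly what the API returns under `canonical_record`; if so, the printed digest should equal the canonical hash, though that has not been checked here. The helper name `canonical_sha256` is illustrative.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Serialize a record canonically and return its SHA-256 hex digest."""
    canonical = json.dumps(
        record,
        sort_keys=True,          # key order must not affect the hash
        separators=(",", ":"),   # no insignificant whitespace
        ensure_ascii=False,      # keep non-ASCII characters as UTF-8
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# The canonical record as displayed on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "d5ae84bb94af6bfa0d1f21ed38ae070ccc8903e311024ad5e1b88c9ec0d91bbd",
        "cross_cats_sorted": ["cs.AI", "cs.CV", "cs.LG", "cs.MA"],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.CL",
        "submitted_at": "2026-04-21T16:49:11Z",
        "title_canon_sha256": "41da58f1e93571f91bcb2071e591c669ddaa6f1ff535e36b5a25c91ce5606168",
    },
    "schema_version": "1.0",
    "source": {"id": "2604.19667", "kind": "arxiv", "version": 2},
}

digest = canonical_sha256(record)
print(digest)  # compare against the canonical hash listed on this page
```

Sorting keys and fixing the separators removes the two main sources of serializer variation, which is why independent verifiers can arrive at the same bytes before hashing.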