Pith Number

pith:N4KZ2OBI

pith:2026:N4KZ2OBIRTN5N3JZNDSUEJ32MB
not attested · not anchored · not stored · refs resolved

Autonomous Intelligent Agents for Natural-Language-Driven Web Execution with Integrated Security Assurance

Shrey Tyagi, Siva Rama Krishna Varma Bayyavarapu, Vinil Pasupuleti

An AI agent framework converts natural-language instructions into reliable web test scripts and OWASP-aligned security probes.

arxiv:2605.15281 v1 · 2026-05-14 · cs.CR · cs.AI

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{N4KZ2OBIRTN5N3JZNDSUEJ32MB}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 strongest claim

Evaluated across four production applications and 176 scenarios, the framework improves script generation success from 55% to 93%, achieves an 8x reduction in navigation failures, eliminates 80% of timing-related race conditions, and reduces test creation time by 75% compared to manual Selenium authoring. It detects 85% of authentication bypass vulnerabilities and 95% of input validation flaws with false positive rates below 12%.

C2 weakest assumption

The evaluation assumes that the four production applications and 176 scenarios are representative of typical web applications, and that natural-language descriptions of attack scenarios can be reliably mapped to complete OWASP-aligned probes without missing critical edge cases or biasing the reported detection rates.

C3 one-line summary

AI agents generate and execute natural-language web tests with built-in security validation, raising success rates and cutting failures in production applications.

References

21 extracted · 21 resolved · 4 Pith anchors

[1] Evaluating Large Language Models Trained on Code, 2021 · arXiv:2107.03374
[2] Reducing Web Test Cases Aging by Means of Robust XPath Locators, 2014
[3] OWASP Testing Guide v4.2, 2023
[4] The Tangled Web: A Guide to Securing Modern Web Applications, 2011
[5] Enemy of the State: A State-Aware Black-Box Web Vulnerability Scanner, 2012

Receipt and verification

First computed: 2026-05-20T00:00:50.510046Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05) · public key
Schema: pith-number/v1.0

Canonical hash

6f159d38288cdbd6ed3968e542277a606c3a9018c70d554fc56d073f3c7f8f89

Aliases

arxiv: 2605.15281 · arxiv_version: 2605.15281v1 · doi: 10.48550/arxiv.2605.15281 · pith_short_12: N4KZ2OBIRTN5 · pith_short_16: N4KZ2OBIRTN5N3JZ · pith_short_8: N4KZ2OBI
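
The short aliases above appear to be plain prefixes of the full Pith Number, and the number itself matches the leading characters of the Base32 (RFC 4648) encoding of the canonical hash. Both relationships are observations from the values on this page, not documented rules, so the Python sketch below treats them as assumptions. The function names are hypothetical.

```python
import base64

# Values copied from this record.
CANONICAL_HASH = "6f159d38288cdbd6ed3968e542277a606c3a9018c70d554fc56d073f3c7f8f89"
PITH_NUMBER = "N4KZ2OBIRTN5N3JZNDSUEJ32MB"


def pith_number_from_hash(hex_digest: str, length: int = 26) -> str:
    """Assumption: the Pith Number is the first `length` Base32
    characters of the SHA-256 digest (uppercase RFC 4648 alphabet)."""
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


def short_aliases(pith_number: str) -> dict:
    """Assumption: pith_short_N is the first N characters of the full number."""
    return {f"pith_short_{n}": pith_number[:n] for n in (8, 12, 16)}


if __name__ == "__main__":
    # Compare the derived identifier with the one published on this page.
    print(pith_number_from_hash(CANONICAL_HASH) == PITH_NUMBER)
    print(short_aliases(PITH_NUMBER))
```

If the prefix assumption holds in general, any alias can be checked against the full number with a simple `startswith` test, without contacting the service.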
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/N4KZ2OBIRTN5N3JZNDSUEJ32MB \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 6f159d38288cdbd6ed3968e542277a606c3a9018c70d554fc56d073f3c7f8f89
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "e07d5247f2ccda85101ec29a435e4016fbd108ec239017796ddec4550acc8d87",
    "cross_cats_sorted": [
      "cs.AI"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CR",
    "submitted_at": "2026-05-14T18:00:30Z",
    "title_canon_sha256": "0533a52c9c6ddc4216dc7de3c863b7574648ef8b55b1f10de75e9f6d13384a85"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.15281",
    "kind": "arxiv",
    "version": 1
  }
}
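
The hash check can also be reproduced offline. The following is a minimal Python sketch that applies the same canonicalization as the one-liner in the Agent API section (sorted keys, compact separators, UTF-8 encoding, SHA-256) to the record printed above. It assumes the JSON shown on this page is exactly what the API returns under `canonical_record`; `canonical_sha256` is a hypothetical helper name.

```python
import hashlib
import json

# Canonical record as printed on this page (assumed identical to the API output).
RECORD_JSON = """
{
  "metadata": {
    "abstract_canon_sha256": "e07d5247f2ccda85101ec29a435e4016fbd108ec239017796ddec4550acc8d87",
    "cross_cats_sorted": ["cs.AI"],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CR",
    "submitted_at": "2026-05-14T18:00:30Z",
    "title_canon_sha256": "0533a52c9c6ddc4216dc7de3c863b7574648ef8b55b1f10de75e9f6d13384a85"
  },
  "schema_version": "1.0",
  "source": {"id": "2605.15281", "kind": "arxiv", "version": 1}
}
"""

# Hash published under "Canonical hash" on this page.
EXPECTED = "6f159d38288cdbd6ed3968e542277a606c3a9018c70d554fc56d073f3c7f8f89"


def canonical_sha256(record: dict) -> str:
    """Serialize with sorted keys and no whitespace, then hash the UTF-8 bytes."""
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    digest = canonical_sha256(json.loads(RECORD_JSON))
    print(digest)
    print("matches published hash:", digest == EXPECTED)
```

Because keys are sorted and whitespace is dropped before hashing, reordering or re-indenting the record does not change the digest, while any change to a value does.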