hub Canonical reference

URL https://cacm.acm.org/research/datasheets-for-datasets/

Canonical reference. 91% of citing Pith papers cite this work as background.

50 Pith papers citing it
Background 91% of classified citations

hub tools

citation-role summary

background 11

representative citing papers

Causal state binding predicts action control in language agents

cs.AI · 2026-05-10 · unverdicted · novelty 7.0 · 3 refs

Causal state binding is introduced as a framework that predicts action control in language agents, validated across large benchmarks and SWE-bench Lite where adding the measure raised issue-to-file hit@3 AUC from 0.873 to 0.935.

ProactBench: Beyond What The User Asked For

cs.LG · 2026-05-09 · unverdicted · novelty 7.0

ProactBench measures LLM conversational proactivity in three phases using 198 multi-agent dialogues and finds recovery behavior hard to predict from existing benchmarks.

OPT: Open Pre-trained Transformer Language Models

cs.CL · 2022-05-02 · unverdicted · novelty 7.0

OPT releases open decoder-only transformers up to 175B parameters that match GPT-3 performance at one-seventh the carbon cost, along with code and training logs.

Telenor Nordics Customer Service self-help corpus

cs.CL · 2026-05-26 · unverdicted · novelty 6.0

Presents a publicly available multilingual corpus of 1,122 customer service self-help documents in four Nordic languages totaling 274,599 words.

Rollout Cards: A Reproducibility Standard for Agent Research

cs.AI · 2026-05-12 · conditional · novelty 6.0

Rollout cards preserve complete agent rollout records and declare the reporting rules behind scores, enabling reproducible evaluation where changing only the rule can alter success rates by over 20 percentage points.

Auditable Agents

cs.AI · 2026-04-07 · unverdicted · novelty 6.0

No agent system can be accountable without auditability, which requires five dimensions (action recoverability, lifecycle coverage, policy checkability, responsibility attribution, evidence integrity) and mechanisms to detect, enforce, and recover.

PaLM: Scaling Language Modeling with Pathways

cs.CL · 2022-04-05 · accept · novelty 6.0

PaLM 540B demonstrates continued scaling benefits by setting new few-shot SOTA results on hundreds of benchmarks and outperforming humans on BIG-bench.

The Eticas AI Risk Taxonomy: Open Infrastructure for Operationalizing AI Audits

cs.CY · 2026-07-02 · unverdicted · novelty 5.0

The Eticas AI Risk Taxonomy v2.0.0 organizes 76 risk subcategories across 10 categories and demonstrates operationalization by measuring PII disclosure in GPT-4-0314 at 0%, 51%, and 84% under increasing adversarial conditions, mapping to grade E with a SYSTEMIC pattern.

The Shift Toward Open and Reproducible AI Research

cs.AI · 2026-06-15 · unverdicted · novelty 5.0 · 2 refs

A longitudinal study of 56,800 AI papers finds a sixfold increase in code+data sharing from 2014 to 2024, with inferred reproducibility rising from 28% to 64%.

citing papers explorer

Showing 1 of 1 citing paper after filters.

  • Reflections on Traceability for Visualization Research cs.HC · 2026-04-15 · conditional · none · ref 26

    Visualization researchers propose traceability—recording abundant annotated artifacts, reporting curated research threads, and enabling reading via interfaces—as a way to ensure rigor and transparency in inherently unreproducible design processes.