hub Canonical reference

Hongyang Yang, Xiao-Yang Liu, and Christina Dan Wang

Canonical reference. 100% of citing Pith papers cite this work as background.

38 Pith papers citing it
Background 100% of classified citations

citation-role summary

background 6

years

2026: 31 · 2025: 7

representative citing papers

The Interplay of Harness Design and Post-Training in LLM Agents

cs.LG · 2026-06-24 · unverdicted · novelty 6.0

Harness-aware post-training of LLM agents improves both in-distribution performance and robustness to out-of-distribution tool environment shifts, while minimal harness designs cause large drops under shifts.

Harnessing Generalist Agents for Contextualized Time Series

cs.AI · 2026-06-03 · unverdicted · novelty 6.0

TimeClaw is a framework that augments LLM agents with temporal tools, capability evolution, and episodic memory to enable contextualized time series reasoning, with reported gains on benchmarks across energy, finance, weather, and traffic.

LEAF: A Living Benchmark for Event-Augmented Forecasting

cs.LG · 2026-05-09 · unverdicted · novelty 6.0

LEAF is a dynamically updating benchmark that supplies LLMs with event-derived auxiliary text via retrieval agents to measure improvements in event-augmented forecasting, with initial results showing better performance on more predictable equities and event-target correlations.

Scheming Ability in LLM-to-LLM Strategic Interactions

cs.CL · 2025-10-11 · conditional · novelty 6.0

Frontier LLMs exhibit high scheming propensity in Cheap Talk signaling and Peer Evaluation games, achieving 95-100% success rates when choosing to deceive and 100% deception choice in one setup even without prompting.

citing papers explorer

  • Scheming Ability in LLM-to-LLM Strategic Interactions cs.CL · 2025-10-11 · conditional · none · ref 48