pith. sign in

Proceedings of the 2023

26 Pith papers cite this work, alongside 220 external citations. Polarity classification is still indexing.

26 Pith papers citing it
220 external citations · Crossref

citation-role summary

background 3

citation-polarity summary

roles

background 3

polarities

background 3

representative citing papers

Evaluating Commercial AI Chatbots as News Intermediaries

cs.CL · 2026-05-21 · conditional · novelty 7.0

Commercial AI chatbots reach over 90% multiple-choice accuracy on recent news facts but lose 11-17% in free response and drop to 19-70% on subtle false-premise questions, with retrieval failures causing most errors and clear Anglophone bias.

TSVer: A Benchmark for Fact Verification Against Time-Series Evidence

cs.CL · 2025-11-02 · unverdicted · novelty 7.0

TSVer is a new benchmark dataset for fact verification against time-series evidence, with 304 annotated real-world claims, 400 time series, verdicts, and justifications, plus baseline results showing current models struggle.

Argus: Evidence Assembly for Scalable Deep Research Agents

cs.CL · 2026-05-15 · unverdicted · novelty 6.0 · 2 refs

Argus coordinates a Navigator and multiple Searchers via an evidence graph for deep research, reporting average gains of 5.5 points with one Searcher and 12.7 points with eight parallel Searchers across eight benchmarks, reaching 86.2 on BrowseComp with 64 Searchers.

Self-Attention as Transport: Limits of Symmetric Spectral Diagnostics

cs.LG · 2026-05-06 · unverdicted · novelty 6.0

Symmetric spectral diagnostics on attention are structurally blind to flow direction, with asymmetry G as the sole control parameter, yielding a two-axis test that distinguishes bottleneck versus diffuse hallucination modes with opposite polarity.

Architecture Determines Observability of Transformers

cs.LG · 2026-04-27 · unverdicted · novelty 6.0 · 2 refs

Architecture and training determine whether transformers retain a readable internal signal that lets activation monitors catch errors missed by output confidence.

Textual Bayes: Quantifying Prompt Uncertainty in LLM-Based Systems

cs.LG · 2025-06-11 · unverdicted · novelty 6.0

Introduces a Bayesian framework viewing LLM prompts as textual parameters and proposes MHLP, a novel MCMC algorithm using LLM proposals, to perform inference and improve accuracy plus uncertainty quantification on benchmarks.

A Survey on LLM-as-a-Judge

cs.CL · 2024-11-23 · unverdicted · novelty 4.0

A survey on LLM-as-a-Judge that reviews reliability strategies, proposes evaluation methods, and introduces a novel benchmark for assessing such systems.

citing papers explorer

Showing 26 of 26 citing papers.