Audit trails for accountability in large language models

2026 · arXiv 2601.20727

7 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 2

citation-polarity summary

background 1 support 1

representative citing papers

As It Was: Aligning LLM Search Evaluation with Historical User Preferences

cs.IR · 2026-07-01 · unverdicted · novelty 7.0

Augmenting LLM search judges with historical QRI cards improves Spearman correlation with user preferences by ~5% overall (91% relative on disagreements) and 15% in multilingual settings, with better alignment to live A/B test outcomes.

Bureaucratic Silences: What the Canadian AI Register Reveals, Omits, and Obscures

cs.AI · 2026-04-16 · unverdicted · novelty 6.0

Analysis of Canada's Federal AI Register reveals it frames AI as reliable internal tooling by obscuring sociotechnical elements like human discretion, turning transparency into performative compliance.

Auditable Agents

cs.AI · 2026-04-07 · unverdicted · novelty 6.0

No agent system can be accountable without auditability, which requires five dimensions (action recoverability, lifecycle coverage, policy checkability, responsibility attribution, evidence integrity) and mechanisms to detect, enforce, and recover.

From Agent Traces to Trust: A Survey of Evidence Tracing and Execution Provenance in LLM Agents

cs.CR · 2026-06-03 · unverdicted · novelty 5.0 · 2 refs

This survey defines execution provenance as a typed graph of agent execution and evidence tracing as its projection onto evidence-support relations, then reviews methods, taxonomy, benchmarks, and challenges for auditable LLM agents.

Trustworthy Agent Network: Trust in Agent Networks Must Be Baked In, Not Bolted On

cs.AI · 2026-05-18 · unverdicted · novelty 4.0

Argues that trustworthiness in Agent-to-Agent networks requires a new conceptual framework with four design pillars baked in from the beginning, as retrofitting existing single-agent methods is insufficient.

Responsible Agentic AI Requires Explicit Provenance

cs.AI · 2026-05-16 · unverdicted · novelty 4.0

Explicit provenance across the full agentic AI lifecycle is the necessary condition for making responsibility computable and actionable.

Reinforcement Learning from Human Feedback: A Statistical Perspective

stat.ML · 2026-04-02 · accept · novelty 2.0

A statistical survey of RLHF for LLM alignment that connects preference learning and policy optimization to models like Bradley-Terry-Luce while reviewing methods, extensions, and open challenges.

citing papers explorer

As It Was: Aligning LLM Search Evaluation with Historical User Preferences

cs.IR · 2026-07-01 · unverdicted · none · ref 12
