Title resolution pending
3 Pith papers cite this work. Polarity classification is still indexing.
years
2026: 3
verdicts
UNVERDICTED: 3
representative citing papers
COPAL reveals a 33.1% average error rate on composed-policy queries across nine LLM chatbots, showing that existing single-policy benchmarks miss common failures.
Position paper claiming that AI safety requires explicit runtime controllability and introducing ControlBench to demonstrate gaps in existing alignment methods.
citing papers explorer
- Decoupling Thought from Speech: Knowledge-Grounded Counterfactual Reasoning for Resilient Multi-Agent Argumentation
  KG-CFR decouples planning from execution via knowledge-grounded counterfactual reasoning, preventing critical degradation in over 95% of perturbed runs and raising argument quality from 0.694 to 0.822 in a 1v1v1 simulation.
- Beyond Single-Policy: Evaluating Composed Organization-Specific Policy Alignment in LLM Chatbots
  COPAL reveals a 33.1% average error rate on composed-policy queries across nine LLM chatbots, showing that existing single-policy benchmarks miss common failures.
- Position: AI Safety Requires Effective Controllability
  Position paper claiming that AI safety requires explicit runtime controllability and introducing ControlBench to demonstrate gaps in existing alignment methods.