Title resolution pending
7 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (7)
Verdicts: UNVERDICTED (7)
citing papers explorer
- Deterministic Event-Graph Substrates as World Models for Counterfactual Reasoning
  Event-graph substrates represent states as RDF triple logs, prove a duality reducing explanatory and counterfactual queries to causal-ancestor traversal, and outperform symbolic and parametric baselines on CLEVRER and a new Smallville benchmark.
- Ask Early, Ask Late, Ask Right: When Does Clarification Timing Matter for Long-Horizon Agents?
  Goal clarifications lose nearly all value after 10% of execution while input clarifications retain value until roughly 50%, and asking any type past mid-trajectory hurts performance more than never asking.
- Latent Cache Flow: Model-to-Model Communication Without Text
  Latent Cache Flow uses small adapters to jointly translate and compress KV caches between LLMs, enabling accurate communication even with mismatched contexts and outperforming both prior cache adapters and text in early tests.
- Embodied Multi-Agent Coordination by Aligning World Models Through Dialogue
  Dialogue between partially-observing LLM agents cuts action conflicts by 40-83 points but lowers task success versus silent coordination, with new metrics exposing limited genuine world-model alignment.
- Behavioral Transfer in AI Agents: Evidence and Privacy Implications
  AI agents on Moltbook reflect the specific behavioral traits of their linked human owners across multiple dimensions, with stronger transfer linked to greater privacy risks.
- LLARS: Enabling Domain Expert & Developer Collaboration for LLM Prompting, Generation and Evaluation
  LLARS is a new integrated platform that combines collaborative prompt authoring, cost-controlled batch generation, and hybrid evaluation to help domain experts and developers jointly build and assess LLM systems.
- Characterizing AlphaEarth Embedding Geometry for Agentic Environmental Reasoning
  AlphaEarth embeddings form a rotating 13-dimensional manifold where local geometry predicts retrieval quality, and an agentic system using nine geometric tools outperforms parametric reasoning on environmental queries.