{"paper":{"title":"STAR: Failure-Aware Markovian Routing for Multi-Agent Spatiotemporal Reasoning","license":"http://creativecommons.org/licenses/by-nc-sa/4.0/","headline":"STAR models inter-agent routing as a Markovian transition policy conditioned on typed failure states to learn specific recovery transitions from unsuccessful traces.","cross_cats":["cs.MA"],"primary_cat":"cs.AI","authors_text":"Flora D. Salim, Hao Xue, Lihuan Li, Ruiyi Yang","submitted_at":"2026-05-11T06:34:49Z","abstract_excerpt":"Compositional spatiotemporal reasoning often requires a system to invoke multiple heterogeneous specialists, such as geometric, temporal, topological, and trajectory agents. A central question is how such a system should route among specialists when execution does not simply succeed or fail, but fails in qualitatively different ways. Existing tool-augmented and multi-agent LLM systems typically leave this routing decision implicit in language generation, making recovery ad hoc, difficult to interpret, and hard to optimize. This paper presents STAR (Spatio-Temporal Agent Router), a failure-awar"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"Results prove that retaining unsuccessful traces during training enlarges the support of the routing policy on error states, enabling recovery transitions that success-only training cannot represent. 
Across three spatiotemporal benchmarks and eight backbone LLMs, STAR improves over multiple baselines with the clearest gains on queries whose execution deviates from the nominal routing path.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"The framework assumes that failure states can be accurately and consistently typed into distinct categories (malformed outputs, missing dependencies, tool-query mismatches) during execution, allowing the routing matrix to condition recovery transitions on these types rather than collapsing them into a generic signal, as described in the central routing mechanism.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"STAR presents a failure-aware routing framework using a state-conditioned transition policy and an agent routing matrix combining expert routes with learned recoveries from execution traces to improve multi-agent spatiotemporal reasoning.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"STAR models inter-agent routing as a Markovian transition policy conditioned on typed failure states to learn specific recovery transitions from unsuccessful traces.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"fb7c2a3fa40bdd80ce6d7cf880a88fd27cc362539335de007b0e3e80d1b0b17e"},"source":{"id":"2605.10057","kind":"arxiv","version":3},"verdict":{"id":"7aeda271-444c-4e6a-8f0d-758035e59d00","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-19T14:40:20.404072Z","strongest_claim":"Results prove that retaining unsuccessful traces during training enlarges the support of the routing policy on error states, enabling recovery transitions that success-only training cannot 
represent. Across three spatiotemporal benchmarks and eight backbone LLMs, STAR improves over multiple baselines with the clearest gains on queries whose execution deviates from the nominal routing path.","one_line_summary":"STAR presents a failure-aware routing framework using a state-conditioned transition policy and an agent routing matrix combining expert routes with learned recoveries from execution traces to improve multi-agent spatiotemporal reasoning.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"The framework assumes that failure states can be accurately and consistently typed into distinct categories (malformed outputs, missing dependencies, tool-query mismatches) during execution, allowing the routing matrix to condition recovery transitions on these types rather than collapsing them into a generic signal, as described in the central routing mechanism.","pith_extraction_headline":"STAR models inter-agent routing as a Markovian transition policy conditioned on typed failure states to learn specific recovery transitions from unsuccessful traces."},"integrity":{"clean":true,"summary":{"advisory":0,"critical":0,"by_detector":{},"informational":0},"endpoint":"/pith/2605.10057/integrity.json","findings":[],"available":true,"detectors_run":[{"name":"ai_meta_artifact","ran_at":"2026-05-19T15:41:36.301820Z","status":"completed","version":"1.0.0","findings_count":0},{"name":"doi_title_agreement","ran_at":"2026-05-19T12:01:17.980962Z","status":"completed","version":"1.0.0","findings_count":0},{"name":"doi_compliance","ran_at":"2026-05-19T09:41:59.557081Z","status":"completed","version":"1.0.0","findings_count":0}],"snapshot_sha256":"6bedc5ff6dd778ded83581b3ce9002b16972d5cf489760cca181524b8551e978"},"references":{"count":32,"sample":[{"doi":"","year":2023,"title":"Can large language models be good path planners? 
a benchmark and investigation on spatial-temporal reasoning","work_id":"5b11b35a-08cc-4d40-985d-5524d83ff692","ref_index":1,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2024,"title":"Graph of thoughts: Solving elaborate problems with large language models","work_id":"6c5125ef-d5d0-4032-a6bd-99d1d898edc1","ref_index":2,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2025,"title":"V-star: Benchmarking video-llms on video spatio-temporal reasoning","work_id":"1a8331e9-feb0-4852-b486-606942c93723","ref_index":3,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2025,"title":"From LLM Reasoning to Autonomous AI Agents: A Comprehensive Review","work_id":"54318454-2b12-4363-af3b-c33a86476992","ref_index":4,"cited_arxiv_id":"2504.19678","is_internal_anchor":true},{"doi":"","year":2025,"title":"Tremu: Towards neuro-symbolic temporal reasoning for llm-agents with memory in multi-session dialogues","work_id":"56e1c9e5-5ccf-4ff6-8db4-aa144cf757f3","ref_index":5,"cited_arxiv_id":"","is_internal_anchor":false}],"resolved_work":32,"snapshot_sha256":"0384071f7ae9231127a119c00c205d7430f855637a822e65a99d227666d6d21a","internal_anchors":6},"formal_canon":{"evidence_count":2,"snapshot_sha256":"d6cdd2a74c52b91006bc87c0acb2aac3915452557ad2b551548785e1c617c597"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}