arXiv preprint arXiv:2603.01357

ASTRA-bench: Evaluating Tool-Use Agent Reasoning, Action Planning with Personal User Context · 2026 · arXiv 2603.01357

2 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

SentinelBench: A Benchmark for Long-Running Monitoring Agents

cs.AI · 2026-06-03 · unverdicted · novelty 7.0

SentinelBench is a new benchmark for time-evolving monitoring tasks in web environments, measuring task completion, reaction time, and resource use with baselines from three models and two harnesses.

Opal: Private Memory for Personal AI

cs.CR · 2026-04-02 · unverdicted · novelty 6.0

Opal enables private long-term memory for personal AI by decoupling reasoning to a trusted enclave with a lightweight knowledge graph and piggybacking reindexing on ORAM accesses.
