In International Conference on Learning Representations (ICLR)

Gaia: A benchmark for general AI assistants · 2025 · arXiv 2510.06186

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

From Knowing to Acting: Benchmarking Self-Awareness Capability of LLM Agents

cs.AI · 2026-06-09 · unverdicted · novelty 7.0

Introduces KAPRO framework and KAware dataset to benchmark LLM agents' self-awareness in distinguishing internal knowledge from external tool needs.

Towards trustworthy agentic AI: a comprehensive survey of safety, robustness, privacy, and system security

cs.AI · 2026-05-17 · unverdicted · novelty 2.0

A survey that maps risks along the agent workflow and consolidates metrics and benchmarks for safety, robustness, privacy, and security in agentic AI.

citing papers explorer

From Knowing to Acting: Benchmarking Self-Awareness Capability of LLM Agents

cs.AI · 2026-06-09 · unverdicted · none · ref 3

Introduces KAPRO framework and KAware dataset to benchmark LLM agents' self-awareness in distinguishing internal knowledge from external tool needs.

Towards trustworthy agentic AI: a comprehensive survey of safety, robustness, privacy, and system security

cs.AI · 2026-05-17 · unverdicted · none · ref 34

A survey that maps risks along the agent workflow and consolidates metrics and benchmarks for safety, robustness, privacy, and security in agentic AI.
