This function helps to navigate all relations in the KB connected to the variable, so you can decide which relation is the most useful to find the answer to the question

1 Pith paper cites this work. Polarity classification is still indexing.

fields

cs.AI 1

years

2023 1

verdicts

UNVERDICTED 1

representative citing papers

AgentBench: Evaluating LLMs as Agents

cs.AI · 2023-08-07 · unverdicted · novelty 8.0

AgentBench is a new multi-environment benchmark showing commercial LLMs outperform open-source models up to 70B parameters in agent tasks mainly due to better long-term reasoning and instruction following.

citing papers explorer

  • AgentBench: Evaluating LLMs as Agents · cs.AI · 2023-08-07 · unverdicted · none · ref 8
