Alfred: A benchmark for interpreting grounded instructions for everyday tasks

· 2020

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

MirrorBench: Evaluating Self-centric Intelligence in MLLMs by Introducing a Mirror

cs.AI · 2026-04-16 · unverdicted · novelty 7.0

MirrorBench reveals that leading MLLMs perform far below humans on tasks requiring self-referential perception and representation, even at the simplest level.

AnyUser: Translating Sketched User Intent into Domestic Robots

cs.RO · 2026-04-06 · unverdicted · novelty 5.0

AnyUser translates free-form sketches on images plus optional language into executable robot actions for domestic tasks using multimodal fusion and a hierarchical policy.

citing papers explorer

Showing 2 of 2 citing papers.

MirrorBench: Evaluating Self-centric Intelligence in MLLMs by Introducing a Mirror

cs.AI · 2026-04-16 · unverdicted · none · ref 4

MirrorBench reveals that leading MLLMs perform far below humans on tasks requiring self-referential perception and representation, even at the simplest level.

AnyUser: Translating Sketched User Intent into Domestic Robots

cs.RO · 2026-04-06 · unverdicted · none · ref 4

AnyUser translates free-form sketches on images plus optional language into executable robot actions for domestic tasks using multimodal fusion and a hierarchical policy.
