ImagenWorld: Stress-testing image generation models with explainable human evaluation on open-ended real-world tasks

Tao Sun et al. · 2026 · arXiv 2603.27862

2 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

TASTE: A Designer-Annotated Multi-Dimensional Preference Dataset for AI-Generated Graphic Design

cs.CV · 2026-05-20 · unverdicted · novelty 7.0

TASTE supplies designer multi-dimensional rankings of T2I graphic outputs with statistical validation showing moderate agreement and benchmarks where a TASTE-trained MLP outperforms off-the-shelf VLMs.

RewardHarness: Self-Evolving Agentic Post-Training

cs.AI · 2026-05-09 · unverdicted · novelty 7.0

RewardHarness self-evolves a tool-and-skill library from 100 preference examples to reach 47.4% accuracy on image-edit evaluation, beating GPT-5, and yields stronger RL-tuned models.

citing papers explorer

Showing 2 of 2 citing papers.

TASTE: A Designer-Annotated Multi-Dimensional Preference Dataset for AI-Generated Graphic Design

cs.CV · 2026-05-20 · unverdicted · none · ref 34

TASTE supplies designer multi-dimensional rankings of T2I graphic outputs with statistical validation showing moderate agreement and benchmarks where a TASTE-trained MLP outperforms off-the-shelf VLMs.

RewardHarness: Self-Evolving Agentic Post-Training

cs.AI · 2026-05-09 · unverdicted · none · ref 22

RewardHarness self-evolves a tool-and-skill library from 100 preference examples to reach 47.4% accuracy on image-edit evaluation, beating GPT-5, and yields stronger RL-tuned models.
