pith. sign in

Deepshop: A benchmark for deep research shopping agents.ArXiv, abs/2506.02839

8 Pith papers cite this work. Polarity classification is still indexing.

8 Pith papers citing it

citation-role summary

background 1 baseline 1 dataset 1

citation-polarity summary

years

2026 5 2025 3

representative citing papers

MolmoWeb: Open Visual Web Agent and Open Data for the Open Web

cs.CV · 2026-04-09 · unverdicted · novelty 7.0

Open 4B and 8B visual web agents achieve state-of-the-art results on browser benchmarks by predicting actions from screenshots and instructions, outperforming similar open models and some closed larger-model agents, with full release of data and code planned.

WebMall -- A Multi-Shop Benchmark for Evaluating Web Agents

cs.CL · 2025-08-18 · conditional · novelty 7.0

WebMall is the first offline multi-shop benchmark for evaluating LLM web agents on complex comparison shopping tasks across heterogeneous product data from multiple simulated e-shops.

citing papers explorer

Showing 8 of 8 citing papers.