pith. machine review for the scientific record.

SeeClick (Cheng et al., 2024) focused on finetuning an LMM to solely leverage screenshots as inputs to interact […] Imagine you are a robot browsing the web, just like humans

1 Pith paper cites this work. Polarity classification is still indexing.

fields: cs.CL (1)

years: 2024 (1)

verdicts: unverdicted (1)

representative citing papers

WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models

cs.CL · 2024-01-25 · unverdicted · novelty 6.0

WebVoyager uses a large multimodal model to complete real-world web tasks end-to-end and reaches 59.1 percent success on a new benchmark of 15 live sites, with an automatic GPT-4V evaluator that matches human judgments 85 percent of the time.

citing papers explorer

  • WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models cs.CL · 2024-01-25 · unverdicted · none · ref 5