Mapping Natural Language Commands to Web Elements
classification
cs.CL
keywords
article, language, capture, commands, dataset, environment, natural, phenomena
abstract
The web provides a rich, open-domain environment with textual, structural, and spatial properties. We propose a new task for grounding language in this environment: given a natural language command (e.g., "click on the second article"), choose the correct element on the web page (e.g., a hyperlink or text box). We collected a dataset of over 50,000 commands that capture various phenomena such as functional references (e.g., "find who made this site"), relational reasoning (e.g., "article by john"), and visual reasoning (e.g., "top-most article"). We also implemented and analyzed three baseline models that capture different phenomena present in the dataset.
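To make the task's input and output concrete, here is a minimal sketch in Python. It is not one of the paper's three baselines: the `Element` class, the `choose_element` function, and the token-overlap scoring are all assumptions introduced for illustration, and the toy page is invented. The sketch only shows the shape of the problem: a command and a list of candidate elements go in, and the index of one chosen element comes out.

```python
import re
from dataclasses import dataclass


@dataclass
class Element:
    """Hypothetical stand-in for a web page element (not the dataset's schema)."""
    tag: str
    text: str


def tokenize(s: str) -> set[str]:
    # Lowercase and split into alphanumeric tokens.
    return set(re.findall(r"[a-z0-9]+", s.lower()))


def choose_element(command: str, elements: list[Element]) -> int:
    """Return the index of the element sharing the most tokens with the command.

    Assumption: plain lexical overlap, chosen only for illustration.
    Ties are broken in favor of the earliest element.
    """
    cmd = tokenize(command)
    scores = [len(cmd & tokenize(e.text)) for e in elements]
    return max(range(len(elements)), key=scores.__getitem__)


# Invented toy page with three candidate elements.
page = [
    Element("a", "Home"),
    Element("a", "Weather report by John"),
    Element("input", "Search"),
]

print(choose_element("article by john", page))  # → 1
```

A scorer like this can only match words that appear in an element's text. Commands such as "find who made this site" or "top-most article" share few or no tokens with the intended element, which is why the functional and visual phenomena named in the abstract would need structural or spatial information as well.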