Cogagent: A visual language model for gui agents

11 Published as a conference paper at ICLR · 2025

1 Pith paper cite this work. Polarity classification is still indexing.

1 Pith paper citing it

browse 1 citing papers

representative citing papers

cs.IR · 2024-10-14 · conditional · novelty 7.0

VisRAG achieves 20-40% better end-to-end performance than text-based RAG by directly embedding and retrieving document images with VLMs.

Showing 1 of 1 citing paper.

VisRAG: Vision-based Retrieval-augmented Generation on Multi-modality Documents cs.IR · 2024-10-14 · conditional · none · ref 6
VisRAG achieves 20-40% better end-to-end performance than text-based RAG by directly embedding and retrieving document images with VLMs.