It is built upon SigLIP-400M and Qwen2-7B (Yang et al.,

is an upgrade of MiniCPM-V 2 · 2024

1 Pith paper cites this work. Polarity classification is still indexing.



representative citing papers

cs.IR · 2024-10-14 · conditional · novelty 7.0

VisRAG achieves 20-40% better end-to-end performance than text-based RAG by directly embedding and retrieving document images with VLMs.


VisRAG: Vision-based Retrieval-augmented Generation on Multi-modality Documents · cs.IR · 2024-10-14 · conditional · none · ref 38