Alpha-CLIP: A CLIP model focusing on wherever you want

Sun, Z · 2023 · arXiv 2312.03818

2 Pith papers cite this work. Polarity classification is still being indexed.

representative citing papers

RAR: Retrieving And Ranking Augmented MLLMs for Visual Recognition

cs.CV · 2024-03-20 · unverdicted · novelty 6.0

RAR combines CLIP retrieval with MLLM ranking to improve few-shot and zero-shot fine-grained visual recognition on 5 benchmarks, 11 few-shot datasets, and 2 detection tasks.

InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Model

cs.CV · 2024-01-29 · unverdicted · novelty 5.0

InternLM-XComposer2 introduces Partial LoRA on InternLM2-7B to enable high-quality free-form text-image composition while matching or exceeding GPT-4V on select vision-language benchmarks.

citing papers explorer

Showing 2 of 2 citing papers.

RAR: Retrieving And Ranking Augmented MLLMs for Visual Recognition

cs.CV · 2024-03-20 · unverdicted · none · ref 46

InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Model

cs.CV · 2024-01-29 · unverdicted · none · ref 76
