EyeCLIP: A visual-language foundation model for multi-modal ophthalmic image analysis

2024 · arXiv 2409.06644

2 Pith papers cite this work. Polarity classification is still indexing.


citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

CapCLIP: A Vision-Language Representation Alignment Approach for Wireless Capsule Endoscopy Analysis

cs.CV · 2026-05-08 · unverdicted · novelty 5.0

CapCLIP uses pathology-aware text captions to align WCE images in a vision-language space, outperforming standard models in zero-shot classification and retrieval on unseen data.

Representation learning from OCT images

cs.CV · 2026-05-04 · unverdicted · novelty 3.0

A structured survey of representation learning methods for retinal OCT image analysis, covering supervised, self-supervised, generative, multimodal, and foundation model approaches along with datasets and open problems.

citing papers explorer

Showing 2 of 2 citing papers.

CapCLIP: A Vision-Language Representation Alignment Approach for Wireless Capsule Endoscopy Analysis · cs.CV · 2026-05-08 · unverdicted · none · ref 21
Representation learning from OCT images · cs.CV · 2026-05-04 · unverdicted · none · ref 153
