arXiv preprint arXiv:2307.03254

Clayton Fields, Casey Kennington · 2023 · arXiv 2307.03254

4 Pith papers cite this work. Polarity classification is still indexing.



citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

ESsEN: Training Compact Discriminative Vision-Language Transformers in a Low-Resource Setting

cs.CV · 2026-04-20 · unverdicted · novelty 5.0

ESsEN is a parameter-efficient two-tower vision-language transformer that matches larger models on discriminative tasks after training end-to-end with limited data and resources.

Edge Deep Learning in Computer Vision and Medical Diagnostics: A Comprehensive Survey

cs.CV · 2026-05-07 · unverdicted · novelty 4.0

A comprehensive survey of edge deep learning in computer vision and medical diagnostics that presents a novel categorization of hardware platforms by performance and usage scenarios.

Multilingual Vision-Language Models, A Survey

cs.CL · 2025-09-26 · accept · novelty 3.0

The survey identifies a key tension in multilingual vision-language models between language neutrality via contrastive learning and cultural awareness via diverse data, with most benchmarks relying on translation-based evaluation.

VIDA: A dataset for Visually Dependent Ambiguity in Multimodal Machine Translation

cs.CL · 2026-05-03

citing papers explorer


ESsEN: Training Compact Discriminative Vision-Language Transformers in a Low-Resource Setting cs.CV · 2026-04-20 · unverdicted · none · ref 24
Edge Deep Learning in Computer Vision and Medical Diagnostics: A Comprehensive Survey cs.CV · 2026-05-07 · unverdicted · none · ref 85
Multilingual Vision-Language Models, A Survey cs.CL · 2025-09-26 · accept · none · ref 50
VIDA: A dataset for Visually Dependent Ambiguity in Multimodal Machine Translation cs.CL · 2026-05-03 · unreviewed · ref 95
