Title resolution pending

Learning Transferable Visual Models From Natural Language Supervision · 2021

12 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver will retry the DOI and OpenAlex lookups on its next pass.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Sparse Autoencoders as Plug-and-Play Firewalls for Adversarial Attack Detection in VLMs

cs.CV · 2026-05-08 · unverdicted · novelty 8.0

Sparse autoencoders inserted into VLMs and trained only for reconstruction can reliably detect adversarial attacks on images, including unseen domains and attack types.

Online Learning-to-Defer with Varying Experts

stat.ML · 2026-05-12 · unverdicted · novelty 7.0 · 2 refs

Presents the first online Learning-to-Defer algorithm achieving regret O((n + n_e) T^{2/3}) generally and O((n + n_e) sqrt(T)) under low noise for multiclass classification with varying experts.

NeuralBench: A Unifying Framework to Benchmark NeuroAI Models

cs.LG · 2026-05-08 · conditional · novelty 7.0

NeuralBench is a new benchmarking framework for neuroAI models on EEG data that finds foundation models only marginally outperform task-specific ones while many tasks like cognitive decoding stay highly challenging.

Spherical Flows for Sampling Categorical Data

stat.ML · 2026-05-07 · unverdicted · novelty 7.0

Spherical vMF flows reduce the continuity equation on the sphere to a scalar ODE in cosine similarity, enabling posterior-weighted sampling of categorical sequences via cross-entropy trained posteriors.

Probing Visual Planning in Image Editing Models

cs.CV · 2026-04-23 · unverdicted · novelty 7.0

Image editing models fail zero-shot visual planning on abstract mazes and queen puzzles but generalize after finetuning, yet still cannot match human zero-shot efficiency.

Weasel: Out-of-Domain Generalization for Web Agents via Importance-Diversity Data Selection

cs.LG · 2026-05-19 · unverdicted · novelty 6.0

Weasel is a trajectory selection method that optimizes importance-diversity for offline web-agent training, improving out-of-domain generalization and delivering 9.7-12.5x speedups on AgentTrek, NNetNav, WebArena, WorkArena, and MiniWob with Qwen and Gemma models.

Prefix-Adaptive Block Diffusion for Efficient Document Recognition

cs.CV · 2026-05-16 · unverdicted · novelty 6.0

PA-BDM adapts block diffusion by switching to causal intra-block denoising and dynamically committing reliable prefixes to KV cache, yielding higher accuracy and 71.6% higher throughput than a comparable baseline on document benchmarks.

Scaling Rectified Flow Transformers for High-Resolution Image Synthesis

cs.CV · 2024-03-05 · conditional · novelty 6.0

Biased noise sampling for rectified flows combined with a bidirectional text-image transformer architecture yields state-of-the-art high-resolution text-to-image results that scale predictably with model size.

ALLaVA: Harnessing GPT4V-Synthesized Data for Lite Vision-Language Models

cs.CL · 2024-02-18 · unverdicted · novelty 6.0

ALLaVA creates 1.3M GPT4V-synthesized samples enabling 4B VLMs to achieve competitive results on 17 benchmarks and match 7B/13B models on some tasks.

NeuralSet: A High-Performing Python Package for Neuro-AI

q-bio.NC · 2026-05-04 · unverdicted · novelty 5.0

NeuralSet is a scalable Python framework that unifies diverse neural recordings and stimuli with deep learning embeddings via metadata decoupling and lazy data extraction.

From Codebooks to VLMs: Evaluating Automated Visual Discourse Analysis for Climate Change on Social Media

cs.CV · 2026-04-23 · unverdicted · novelty 5.0

VLMs recover reliable population-level trends in climate change visual discourse on social media even when per-image accuracy is only moderate.

Unified Pix Token And Word Token Generative Language Model

cs.CV · 2026-05-13 · unverdicted · novelty 4.0

A new model unifies per-pixel and word tokens in a generative language model with per-pixel embeddings, color folding, and unsupervised image pretraining, reporting good performance on small models with limited data.

citing papers explorer

Showing 12 of 12 citing papers.

Sparse Autoencoders as Plug-and-Play Firewalls for Adversarial Attack Detection in VLMs
cs.CV · 2026-05-08 · unverdicted · none · ref 34

Online Learning-to-Defer with Varying Experts
stat.ML · 2026-05-12 · unverdicted · none · ref 105 · 2 links

NeuralBench: A Unifying Framework to Benchmark NeuroAI Models
cs.LG · 2026-05-08 · conditional · none · ref 287

Spherical Flows for Sampling Categorical Data
stat.ML · 2026-05-07 · unverdicted · none · ref 30

Probing Visual Planning in Image Editing Models
cs.CV · 2026-04-23 · unverdicted · none · ref 18

Weasel: Out-of-Domain Generalization for Web Agents via Importance-Diversity Data Selection
cs.LG · 2026-05-19 · unverdicted · none · ref 2

Prefix-Adaptive Block Diffusion for Efficient Document Recognition
cs.CV · 2026-05-16 · unverdicted · none · ref 40

Scaling Rectified Flow Transformers for High-Resolution Image Synthesis
cs.CV · 2024-03-05 · conditional · none · ref 67

ALLaVA: Harnessing GPT4V-Synthesized Data for Lite Vision-Language Models
cs.CL · 2024-02-18 · unverdicted · none · ref 68

NeuralSet: A High-Performing Python Package for Neuro-AI
q-bio.NC · 2026-05-04 · unverdicted · none · ref 11

From Codebooks to VLMs: Evaluating Automated Visual Discourse Analysis for Climate Change on Social Media
cs.CV · 2026-04-23 · unverdicted · none · ref 3

Unified Pix Token And Word Token Generative Language Model
cs.CV · 2026-05-13 · unverdicted · none · ref 3
