Learning transferable visual models from natural language supervision

Mixed citation behavior. The most common role is background (57% of classified citations).

22 Pith papers cite it.

citation-role summary

background 4 · method 3

representative citing papers

OP2GS: Object-Aware 3D Gaussian Splatting with Dual-Opacity Primitives

cs.CV · 2026-05-19 · unverdicted · novelty 7.0

OP2GS adds instance identities and dual opacities to 3D Gaussians so that visual rendering and object-mask rendering are handled by separate opacity channels, reducing label contamination while attaching semantics at the object level.

Adaptive Subspace Projection for Generative Personalization

cs.CV · 2026-05-08 · unverdicted · novelty 7.0

A training-free adaptive subspace projection method mitigates semantic collapsing in generative personalization by isolating and adjusting drift in a low-dimensional subspace, using the stable pre-trained embedding as an anchor.

The Indra Representation Hypothesis for Multimodal Alignment

cs.CV · 2026-04-06 · unverdicted · novelty 7.0

Unimodal model representations converge to a relational structure captured by the Indra representation via a V-enriched Yoneda embedding. The representation is unique and structure-preserving, and it improves cross-model and cross-modal robustness when instantiated with angular distance.

Improved Baselines with Representation Autoencoders

cs.CV · 2026-05-18 · conditional · novelty 6.0

RAE v2 reaches gFID 1.06 on ImageNet-256 in 80 epochs by combining multi-layer encoder sums, complementary REPA targets, and free guidance via output reparameterization.

Uncertainty-Aware Foundation Models for Clinical Data

cs.LG · 2026-04-05 · unverdicted · novelty 6.0

The work introduces uncertainty-aware foundation models for clinical data by learning set-valued patient representations that enforce consistency across partial observations and integrate multimodal self-supervised objectives.

Grounded Reinforcement Learning for Visual Reasoning

cs.CV · 2025-05-29 · unverdicted · novelty 6.0

ViGoRL introduces visually grounded RL that anchors reasoning steps to image coordinates and uses multi-turn zooming to outperform standard RL and supervised baselines on spatial and GUI reasoning benchmarks.

Memory-Efficient Continual Learning with CLIP Models

cs.LG · 2026-05-05 · unverdicted · novelty 5.0

A per-class loss reweighting scheme based on distributional robustness allows CLIP models to perform class-incremental and domain-incremental learning with minimal memory while limiting forgetting on CIFAR-100, ImageNet1K, and DomainNet.

The Amazing Stability of Flow Matching

cs.CV · 2026-04-17 · unverdicted · novelty 5.0

Flow matching generative models preserve sample quality, diversity, and latent representations despite pruning 50% of the CelebA-HQ dataset or altering architecture and training configurations.
