hub

arXiv preprint arXiv:2104.10972 · 2021

12 Pith papers cite this work. Polarity classification is still indexing.


citation-role summary

dataset: 2

citation-polarity summary

use dataset: 2

representative citing papers

CanViT: Toward Active-Vision Foundation Models

cs.CV · 2026-03-23 · conditional · novelty 8.0

CanViT is the first task- and policy-agnostic AVFM pretrained via passive-to-active dense latent distillation on 13.2M scenes and 1B random glimpses, achieving 38.5% ADE20K mIoU in one glimpse and 84.5% ImageNet-1k top-1 after fine-tuning.

Weierstrass Positional Encoding for Vision Transformers

cs.CV · 2026-05-20 · unverdicted · novelty 6.0

WePE encodes 2D patch positions in Vision Transformers via Weierstrass elliptic functions on the complex plane to exploit double periodicity and derive relative positions algebraically.

Birds of a Feather Flock Together: Background-Invariant Representations via Linear Structure in VLMs

cs.CV · 2026-05-11 · unverdicted · novelty 6.0

Exploiting linear structure in VLM embeddings, a synthetic-data pre-training method yields background-invariant representations that exceed 90% worst-group accuracy on Waterbirds even under 100% spurious correlation with no minority examples in training.

Parameter-Efficient Adaptation of Pre-Trained Vision Foundation Models for Active and Passive Seismic Data Denoising

physics.geo-ph · 2026-04-30 · conditional · novelty 6.0

Adapting vision foundation models with LoRA and kurtosis-guided unsupervised test-time adaptation matches or exceeds domain-specific models for seismic denoising across multiple sites and unseen data.

Contrastive Semantic Projection: Faithful Neuron Labeling with Contrastive Examples

cs.CV · 2026-04-24 · unverdicted · novelty 6.0

Using contrastive examples with vision-language models and a new CLIP-based scoring method called CSP produces more faithful and granular neuron labels than prior activation-only approaches.

StableTTA: Improving Vision Model Performance by Training-free Test-Time Adaptation Methods

cs.CV · 2026-04-06 · unverdicted · novelty 6.0

StableTTA improves ImageNet-1K accuracy across 71 vision models by stabilizing logit aggregation under coherent-batch inference and enabling efficient single-forward-pass adaptation.

MePo: Meta Post-Refinement for Rehearsal-Free General Continual Learning

cs.AI · 2026-02-08 · unverdicted · novelty 6.0

MePo refines pretrained backbones via meta-learning on constructed pseudo tasks and initializes a meta covariance matrix to enable robust second-order alignment, yielding 12-15% gains on CIFAR-100, ImageNet-R and CUB-200 in rehearsal-free GCL settings.

LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment

cs.CV · 2023-10-03 · unverdicted · novelty 6.0

LanguageBind aligns video, infrared, depth, and audio to a frozen language encoder via contrastive learning on the new VIDAL-10M dataset, extending video-language pretraining to N modalities.

MobileVLM : A Fast, Strong and Open Vision Language Assistant for Mobile Devices

cs.CV · 2023-12-28 · unverdicted · novelty 5.0

MobileVLM achieves on-par performance with much larger vision-language models on standard benchmarks while delivering state-of-the-art inference speeds of 21.5 tokens per second on Snapdragon 888 CPU and 65.3 on Jetson Orin GPU.

HandyLabel: Towards Post-Processing to Real-Time Annotation Using Skeleton Based Hand Gesture Recognition

cs.HC · 2025-11-27 · unverdicted · novelty 4.0

HandyLabel enables real-time data annotation by mapping hand gestures to labels, with ResNet50 on skeleton-preprocessed HaGRID data reaching 0.923 F1-score and 88.9% of 46 study participants preferring it to traditional post-processing tools.

4th Workshop on Maritime Computer Vision (MaCVi): Challenge Overview

cs.CV · 2026-04-14 · unverdicted · novelty 2.0

The report overviews five maritime computer vision benchmark challenges, their datasets, protocols, quantitative results, and top team approaches from the MaCVi 2026 workshop.

Page image classification for content-specific data processing

cs.IR · 2025-07-11

citing papers explorer

Showing 12 of 12 citing papers.

CanViT: Toward Active-Vision Foundation Models · cs.CV · 2026-03-23 · conditional · none · ref 58
Weierstrass Positional Encoding for Vision Transformers · cs.CV · 2026-05-20 · unverdicted · none · ref 32
Birds of a Feather Flock Together: Background-Invariant Representations via Linear Structure in VLMs · cs.CV · 2026-05-11 · unverdicted · none · ref 33
Parameter-Efficient Adaptation of Pre-Trained Vision Foundation Models for Active and Passive Seismic Data Denoising · physics.geo-ph · 2026-04-30 · conditional · none · ref 46
Contrastive Semantic Projection: Faithful Neuron Labeling with Contrastive Examples · cs.CV · 2026-04-24 · unverdicted · none · ref 24
StableTTA: Improving Vision Model Performance by Training-free Test-Time Adaptation Methods · cs.CV · 2026-04-06 · unverdicted · none · ref 27
MePo: Meta Post-Refinement for Rehearsal-Free General Continual Learning · cs.AI · 2026-02-08 · unverdicted · none · ref 5
LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment · cs.CV · 2023-10-03 · unverdicted · none · ref 208
MobileVLM : A Fast, Strong and Open Vision Language Assistant for Mobile Devices · cs.CV · 2023-12-28 · unverdicted · none · ref 101
HandyLabel: Towards Post-Processing to Real-Time Annotation Using Skeleton Based Hand Gesture Recognition · cs.HC · 2025-11-27 · unverdicted · none · ref 26
4th Workshop on Maritime Computer Vision (MaCVi): Challenge Overview · cs.CV · 2026-04-14 · unverdicted · none · ref 45
Page image classification for content-specific data processing · cs.IR · 2025-07-11 · unreviewed · ref 16
