Title resolution pending

He, K · 2016

25 Pith papers cite this work. Polarity classification is still being indexed.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

citation-role summary

method: 3 · baseline: 1

citation-polarity summary

use method: 3 · baseline: 1

representative citing papers

Can Vision Models Truly Forget? Mirage: Representation-Level Certification of Visual Unlearning

cs.CV · 2026-05-19 · unverdicted · novelty 7.0

Mirage auditing reveals that VFL unlearning methods passing output-level checks still retain substantial class structure in representations across multiple datasets and baselines.

How to Evaluate and Refine your CAM

cs.CV · 2026-05-14 · unverdicted · novelty 7.0

Introduces synthetic ground-truth dataset for CAM evaluation, proposes ARCC composite metric, and RefineCAM method that aggregates layers for higher-resolution maps outperforming baselines.

PromptDx: Differentiable Prompt Tuning for Multimodal In-Context Alzheimer's Diagnosis

cs.CV · 2026-05-09 · unverdicted · novelty 7.0

PromptDx adds a differentiable adapter to align multimodal data with a pre-trained TabPFN-style ICL engine, achieving strong Alzheimer's diagnosis performance with only 1% context samples.

A High-Resolution Landscape Dataset for Concept-Based XAI With Application to Species Distribution Models

cs.CV · 2026-04-14 · unverdicted · novelty 7.0

A new open-access landscape concept dataset enables the first application of Robust TCAV to deep learning species distribution models, validating predictions against expert knowledge and uncovering novel ecological associations for two aquatic insect groups.

MoCA3D: Monocular 3D Bounding Box Prediction in the Image Plane

cs.CV · 2026-03-20 · unverdicted · novelty 7.0

MoCA3D formulates monocular 3D box prediction as dense pixel-space tasks using corner heatmaps and depth maps, with a new PAG metric for image-plane evaluation.

Echo4DIR: 4D Implicit Heart Reconstruction from 2D Echocardiography Videos

cs.CV · 2026-05-21 · unverdicted · novelty 6.0

Echo4DIR reconstructs continuous 4D cardiac geometry from sparse 2D echocardiography videos using implicit representations, epipolar feature fusion, self-supervised domain adaptation, and radial SDF alignment to achieve up to 98.35% Dice overlap.

The Learnability Gap in Medical Latent Diffusion

cs.CV · 2026-05-16 · unverdicted · novelty 6.0

Pretrained autoencoders in medical latent diffusion encode discriminative features well for reconstruction but structure their latent spaces in ways that hinder classifier learning, a gap that persists across architectures and is not closed by domain fine-tuning.

Towards Fine-Grained and Verifiable Concept Bottleneck Models

cs.LG · 2026-05-14 · unverdicted · novelty 6.0

A verifiable CBM framework grounds concepts in localized image patches, achieving comparable accuracy to standard CBMs on medical benchmarks while enabling direct inspection of concept correctness.

From Image Hashing to Scene Change Detection

cs.CV · 2026-05-12 · unverdicted · novelty 6.0

HashSCD is a patch-wise hashing method for unsupervised scene change detection and localization that operates directly in Hamming space with competitive performance and lower computational cost.

Counterfactual Stress Testing for Image Classification Models

cs.CV · 2026-05-11 · unverdicted · novelty 6.0

Counterfactual stress testing with causal generative models offers a more accurate proxy than simple perturbations for predicting medical image model performance under distribution shifts.

Efficient Logic Gate Networks for Video Copy Detection

cs.CV · 2026-04-23 · unverdicted · novelty 6.0

Logic Gate Networks produce compact Boolean-circuit descriptors for video copy detection that match or exceed prior accuracy at over 11k inferences per second and orders-of-magnitude smaller size.

LLM as a Tool, Not an Agent: Code-Mined Tree Transformations for Neural Architecture Search

cs.LG · 2026-04-17 · unverdicted · novelty 6.0

LLMasTool improves neural architecture search by evolving code-mined hierarchical trees with diversity-guided Bayesian planning and targeted LLM assistance, reporting gains of 0.69, 1.83, and 2.68 points on CIFAR-10, CIFAR-100, and ImageNet16-120.

ESCAPE: Episodic Spatial Memory and Adaptive Execution Policy for Long-Horizon Mobile Manipulation

cs.CV · 2026-04-15 · unverdicted · novelty 6.0

ESCAPE combines spatio-temporal fusion mapping for depth-free 3D memory with a memory-driven grounding module and adaptive execution policy to reach 65.09% success on ALFRED test-seen long-horizon mobile manipulation tasks.

GroundingAnomaly: Spatially-Grounded Diffusion for Few-Shot Anomaly Synthesis

cs.CV · 2026-04-09 · unverdicted · novelty 6.0

GroundingAnomaly uses a Spatial Conditioning Module and Gated Self-Attention in a frozen diffusion U-Net to synthesize spatially accurate few-shot anomalies, reaching SOTA on MVTec AD and VisA for detection, segmentation, and instance detection.

PRISM-CTG: A Foundation Model for Cardiotocography Analysis with Multi-View SSL

cs.LG · 2026-04-09 · unverdicted · novelty 6.0

PRISM-CTG is the first large-scale foundation model for cardiotocography that uses multi-view self-supervised learning on unlabeled data to learn transferable representations, outperforming baselines on seven downstream tasks with external validation.

Worst-Group Equalized Odds Regularization for Multi-Attribute Fair Medical Image Classification

cs.LG · 2026-05-19 · unverdicted · novelty 5.0

A worst-group equalized odds regularizer targets extreme subgroup deviations in true and false positive rates to improve multi-attribute fairness in medical imaging while preserving AUC.

Hybrid Quantum-MambaVision: A Quantum-enhanced State Space Model for Calibrated Mixed-type Wafer Defect Detection

cs.CV · 2026-05-13 · unverdicted · novelty 5.0

Hybrid Quantum-MambaVision combines Mamba SSM with a parameterized quantum adapter for improved multi-label classification and calibration on the imbalanced MixedWM38 wafer defect dataset.

FedSurrogate: Backdoor Defense in Federated Learning via Layer Criticality and Surrogate Replacement

cs.CR · 2026-05-11 · unverdicted · novelty 5.0

FedSurrogate defends federated learning against backdoors by clustering on security-critical layers and substituting malicious updates with benign surrogates, reporting false-positive rates below 10% and attack success below 2.1% under non-IID conditions.

Sketch and Text Synergy: Fusing Structural Contours and Descriptive Attributes for Fine-Grained Image Retrieval

cs.CV · 2026-04-17 · unverdicted · novelty 5.0

STBIR fuses sketches and text via curriculum robustness, category optimization, and staged alignment to outperform prior methods on a new fine-grained benchmark dataset.

QShield: Securing Neural Networks Against Adversarial Attacks using Quantum Circuits

cs.CR · 2026-04-13 · unverdicted · novelty 5.0

Hybrid quantum-classical models using structured entanglement keep high accuracy on MNIST, OrganAMNIST and CIFAR-10 while lowering adversarial attack success rates and raising the computational cost of generating attacks.

DBMF: A Dual-Branch Multimodal Framework for Out-of-Distribution Detection

cs.CV · 2026-04-09 · unverdicted · novelty 5.0

DBMF integrates scores from text-image and vision branches to improve out-of-distribution detection on endoscopic datasets by up to 24.84% over prior methods.

Clinical Cognition Alignment for Gastrointestinal Diagnosis with Multimodal LLMs

cs.CV · 2026-03-21 · unverdicted · novelty 5.0

CogAlign uses hierarchical supervised fine-tuning on clinical cognition data plus counterfactual RL to align MLLMs with expert diagnostic pathways and enforce causal lesion grounding for GI endoscopy diagnosis.

Accuracy Improvement of Cell Image Segmentation Using Feedback Former

cs.CV · 2024-08-23 · unverdicted · novelty 5.0

Feedback Former improves cell image segmentation accuracy by feeding detailed feature maps back from near the output to lower transformer layers, outperforming non-feedback baselines with lower computational cost on three datasets.

MsEdF: A Multi-stream Encoder-decoder Framework for Remote Sensing Image Captioning

cs.CV · 2025-02-13 · unverdicted · novelty 4.0

MsEdF combines two complementary image encoders for feature diversity and a stacked GRU decoder with element-wise aggregation to improve remote sensing image captioning on three benchmark datasets.

citing papers explorer

Showing 25 of 25 citing papers.

Can Vision Models Truly Forget? Mirage: Representation-Level Certification of Visual Unlearning
cs.CV · 2026-05-19 · unverdicted · none · ref 16
Mirage auditing reveals that VFL unlearning methods passing output-level checks still retain substantial class structure in representations across multiple datasets and baselines.

How to Evaluate and Refine your CAM
cs.CV · 2026-05-14 · unverdicted · none · ref 9
Introduces synthetic ground-truth dataset for CAM evaluation, proposes ARCC composite metric, and RefineCAM method that aggregates layers for higher-resolution maps outperforming baselines.

PromptDx: Differentiable Prompt Tuning for Multimodal In-Context Alzheimer's Diagnosis
cs.CV · 2026-05-09 · unverdicted · none · ref 9
PromptDx adds a differentiable adapter to align multimodal data with a pre-trained TabPFN-style ICL engine, achieving strong Alzheimer's diagnosis performance with only 1% context samples.

A High-Resolution Landscape Dataset for Concept-Based XAI With Application to Species Distribution Models
cs.CV · 2026-04-14 · unverdicted · none · ref 18
A new open-access landscape concept dataset enables the first application of Robust TCAV to deep learning species distribution models, validating predictions against expert knowledge and uncovering novel ecological associations for two aquatic insect groups.

MoCA3D: Monocular 3D Bounding Box Prediction in the Image Plane
cs.CV · 2026-03-20 · unverdicted · none · ref 12
MoCA3D formulates monocular 3D box prediction as dense pixel-space tasks using corner heatmaps and depth maps, with a new PAG metric for image-plane evaluation.

Echo4DIR: 4D Implicit Heart Reconstruction from 2D Echocardiography Videos
cs.CV · 2026-05-21 · unverdicted · none · ref 5
Echo4DIR reconstructs continuous 4D cardiac geometry from sparse 2D echocardiography videos using implicit representations, epipolar feature fusion, self-supervised domain adaptation, and radial SDF alignment to achieve up to 98.35% Dice overlap.

The Learnability Gap in Medical Latent Diffusion
cs.CV · 2026-05-16 · unverdicted · none · ref 10
Pretrained autoencoders in medical latent diffusion encode discriminative features well for reconstruction but structure their latent spaces in ways that hinder classifier learning, a gap that persists across architectures and is not closed by domain fine-tuning.

Towards Fine-Grained and Verifiable Concept Bottleneck Models
cs.LG · 2026-05-14 · unverdicted · none · ref 8
A verifiable CBM framework grounds concepts in localized image patches, achieving comparable accuracy to standard CBMs on medical benchmarks while enabling direct inspection of concept correctness.

From Image Hashing to Scene Change Detection
cs.CV · 2026-05-12 · unverdicted · none · ref 6
HashSCD is a patch-wise hashing method for unsupervised scene change detection and localization that operates directly in Hamming space with competitive performance and lower computational cost.

Counterfactual Stress Testing for Image Classification Models
cs.CV · 2026-05-11 · unverdicted · none · ref 10
Counterfactual stress testing with causal generative models offers a more accurate proxy than simple perturbations for predicting medical image model performance under distribution shifts.

Efficient Logic Gate Networks for Video Copy Detection
cs.CV · 2026-04-23 · unverdicted · none · ref 9
Logic Gate Networks produce compact Boolean-circuit descriptors for video copy detection that match or exceed prior accuracy at over 11k inferences per second and orders-of-magnitude smaller size.

LLM as a Tool, Not an Agent: Code-Mined Tree Transformations for Neural Architecture Search
cs.LG · 2026-04-17 · unverdicted · none · ref 22
LLMasTool improves neural architecture search by evolving code-mined hierarchical trees with diversity-guided Bayesian planning and targeted LLM assistance, reporting gains of 0.69, 1.83, and 2.68 points on CIFAR-10, CIFAR-100, and ImageNet16-120.

ESCAPE: Episodic Spatial Memory and Adaptive Execution Policy for Long-Horizon Mobile Manipulation
cs.CV · 2026-04-15 · unverdicted · none · ref 9
ESCAPE combines spatio-temporal fusion mapping for depth-free 3D memory with a memory-driven grounding module and adaptive execution policy to reach 65.09% success on ALFRED test-seen long-horizon mobile manipulation tasks.

GroundingAnomaly: Spatially-Grounded Diffusion for Few-Shot Anomaly Synthesis
cs.CV · 2026-04-09 · unverdicted · none · ref 12
GroundingAnomaly uses a Spatial Conditioning Module and Gated Self-Attention in a frozen diffusion U-Net to synthesize spatially accurate few-shot anomalies, reaching SOTA on MVTec AD and VisA for detection, segmentation, and instance detection.

PRISM-CTG: A Foundation Model for Cardiotocography Analysis with Multi-View SSL
cs.LG · 2026-04-09 · unverdicted · none · ref 14
PRISM-CTG is the first large-scale foundation model for cardiotocography that uses multi-view self-supervised learning on unlabeled data to learn transferable representations, outperforming baselines on seven downstream tasks with external validation.

Worst-Group Equalized Odds Regularization for Multi-Attribute Fair Medical Image Classification
cs.LG · 2026-05-19 · unverdicted · none · ref 8
A worst-group equalized odds regularizer targets extreme subgroup deviations in true and false positive rates to improve multi-attribute fairness in medical imaging while preserving AUC.

Hybrid Quantum-MambaVision: A Quantum-enhanced State Space Model for Calibrated Mixed-type Wafer Defect Detection
cs.CV · 2026-05-13 · unverdicted · none · ref 11
Hybrid Quantum-MambaVision combines Mamba SSM with a parameterized quantum adapter for improved multi-label classification and calibration on the imbalanced MixedWM38 wafer defect dataset.

FedSurrogate: Backdoor Defense in Federated Learning via Layer Criticality and Surrogate Replacement
cs.CR · 2026-05-11 · unverdicted · none · ref 10
FedSurrogate defends federated learning against backdoors by clustering on security-critical layers and substituting malicious updates with benign surrogates, reporting false-positive rates below 10% and attack success below 2.1% under non-IID conditions.

Sketch and Text Synergy: Fusing Structural Contours and Descriptive Attributes for Fine-Grained Image Retrieval
cs.CV · 2026-04-17 · unverdicted · none · ref 13
STBIR fuses sketches and text via curriculum robustness, category optimization, and staged alignment to outperform prior methods on a new fine-grained benchmark dataset.

QShield: Securing Neural Networks Against Adversarial Attacks using Quantum Circuits
cs.CR · 2026-04-13 · unverdicted · none · ref 18
Hybrid quantum-classical models using structured entanglement keep high accuracy on MNIST, OrganAMNIST and CIFAR-10 while lowering adversarial attack success rates and raising the computational cost of generating attacks.

DBMF: A Dual-Branch Multimodal Framework for Out-of-Distribution Detection
cs.CV · 2026-04-09 · unverdicted · none · ref 9
DBMF integrates scores from text-image and vision branches to improve out-of-distribution detection on endoscopic datasets by up to 24.84% over prior methods.

Clinical Cognition Alignment for Gastrointestinal Diagnosis with Multimodal LLMs
cs.CV · 2026-03-21 · unverdicted · none · ref 13
CogAlign uses hierarchical supervised fine-tuning on clinical cognition data plus counterfactual RL to align MLLMs with expert diagnostic pathways and enforce causal lesion grounding for GI endoscopy diagnosis.

Accuracy Improvement of Cell Image Segmentation Using Feedback Former
cs.CV · 2024-08-23 · unverdicted · none · ref 14
Feedback Former improves cell image segmentation accuracy by feeding detailed feature maps back from near the output to lower transformer layers, outperforming non-feedback baselines with lower computational cost on three datasets.

MsEdF: A Multi-stream Encoder-decoder Framework for Remote Sensing Image Captioning
cs.CV · 2025-02-13 · unverdicted · none · ref 6
MsEdF combines two complementary image encoders for feature diversity and a stacked GRU decoder with element-wise aggregation to improve remote sensing image captioning on three benchmark datasets.

Generalized SAM: Efficient Fine-Tuning of SAM for Variable Input Image Sizes
cs.CV · 2024-08-22 · unverdicted · none · ref 14
GSAM applies random cropping to enable variable input sizes for efficient SAM fine-tuning, claiming lower compute with comparable or higher accuracy on varied datasets.
