25 Pith papers cite this work. Polarity classification is still indexing.
citing papers explorer
- Can Vision Models Truly Forget? Mirage: Representation-Level Certification of Visual Unlearning
Mirage auditing reveals that VFL unlearning methods passing output-level checks still retain substantial class structure in representations across multiple datasets and baselines.
- How to Evaluate and Refine your CAM
Introduces synthetic ground-truth dataset for CAM evaluation, proposes ARCC composite metric, and RefineCAM method that aggregates layers for higher-resolution maps outperforming baselines.
- PromptDx: Differentiable Prompt Tuning for Multimodal In-Context Alzheimer's Diagnosis
PromptDx adds a differentiable adapter to align multimodal data with a pre-trained TabPFN-style ICL engine, achieving strong Alzheimer's diagnosis performance with only 1% context samples.
- A High-Resolution Landscape Dataset for Concept-Based XAI With Application to Species Distribution Models
A new open-access landscape concept dataset enables the first application of Robust TCAV to deep learning species distribution models, validating predictions against expert knowledge and uncovering novel ecological associations for two aquatic insect groups.
- MoCA3D: Monocular 3D Bounding Box Prediction in the Image Plane
MoCA3D formulates monocular 3D box prediction as dense pixel-space tasks using corner heatmaps and depth maps, with a new PAG metric for image-plane evaluation.
- Echo4DIR: 4D Implicit Heart Reconstruction from 2D Echocardiography Videos
Echo4DIR reconstructs continuous 4D cardiac geometry from sparse 2D echocardiography videos using implicit representations, epipolar feature fusion, self-supervised domain adaptation, and radial SDF alignment to achieve up to 98.35% Dice overlap.
- The Learnability Gap in Medical Latent Diffusion
Pretrained autoencoders in medical latent diffusion encode discriminative features well for reconstruction but structure their latent spaces in ways that hinder classifier learning, a gap that persists across architectures and is not closed by domain fine-tuning.
- Towards Fine-Grained and Verifiable Concept Bottleneck Models
A verifiable CBM framework grounds concepts in localized image patches, achieving comparable accuracy to standard CBMs on medical benchmarks while enabling direct inspection of concept correctness.
- From Image Hashing to Scene Change Detection
HashSCD is a patch-wise hashing method for unsupervised scene change detection and localization that operates directly in Hamming space with competitive performance and lower computational cost.
- Counterfactual Stress Testing for Image Classification Models
Counterfactual stress testing with causal generative models offers a more accurate proxy than simple perturbations for predicting medical image model performance under distribution shifts.
- Efficient Logic Gate Networks for Video Copy Detection
Logic Gate Networks produce compact Boolean-circuit descriptors for video copy detection that match or exceed prior accuracy at over 11k inferences per second and orders-of-magnitude smaller size.
- LLM as a Tool, Not an Agent: Code-Mined Tree Transformations for Neural Architecture Search
LLMasTool improves neural architecture search by evolving code-mined hierarchical trees with diversity-guided Bayesian planning and targeted LLM assistance, reporting gains of 0.69, 1.83, and 2.68 points on CIFAR-10, CIFAR-100, and ImageNet16-120.
- ESCAPE: Episodic Spatial Memory and Adaptive Execution Policy for Long-Horizon Mobile Manipulation
ESCAPE combines spatio-temporal fusion mapping for depth-free 3D memory with a memory-driven grounding module and adaptive execution policy to reach 65.09% success on ALFRED test-seen long-horizon mobile manipulation tasks.
- GroundingAnomaly: Spatially-Grounded Diffusion for Few-Shot Anomaly Synthesis
GroundingAnomaly uses a Spatial Conditioning Module and Gated Self-Attention in a frozen diffusion U-Net to synthesize spatially accurate few-shot anomalies, reaching SOTA on MVTec AD and VisA for detection, segmentation, and instance detection.
- PRISM-CTG: A Foundation Model for Cardiotocography Analysis with Multi-View SSL
PRISM-CTG is the first large-scale foundation model for cardiotocography that uses multi-view self-supervised learning on unlabeled data to learn transferable representations, outperforming baselines on seven downstream tasks with external validation.
- Worst-Group Equalized Odds Regularization for Multi-Attribute Fair Medical Image Classification
A worst-group equalized odds regularizer targets extreme subgroup deviations in true and false positive rates to improve multi-attribute fairness in medical imaging while preserving AUC.
- Hybrid Quantum-MambaVision: A Quantum-enhanced State Space Model for Calibrated Mixed-type Wafer Defect Detection
Hybrid Quantum-MambaVision combines Mamba SSM with a parameterized quantum adapter for improved multi-label classification and calibration on the imbalanced MixedWM38 wafer defect dataset.
- FedSurrogate: Backdoor Defense in Federated Learning via Layer Criticality and Surrogate Replacement
FedSurrogate defends federated learning against backdoors by clustering on security-critical layers and substituting malicious updates with benign surrogates, reporting false-positive rates below 10% and attack success below 2.1% under non-IID conditions.
- Sketch and Text Synergy: Fusing Structural Contours and Descriptive Attributes for Fine-Grained Image Retrieval
STBIR fuses sketches and text via curriculum robustness, category optimization, and staged alignment to outperform prior methods on a new fine-grained benchmark dataset.
- QShield: Securing Neural Networks Against Adversarial Attacks using Quantum Circuits
Hybrid quantum-classical models using structured entanglement keep high accuracy on MNIST, OrganAMNIST and CIFAR-10 while lowering adversarial attack success rates and raising the computational cost of generating attacks.
- DBMF: A Dual-Branch Multimodal Framework for Out-of-Distribution Detection
DBMF integrates scores from text-image and vision branches to improve out-of-distribution detection on endoscopic datasets by up to 24.84% over prior methods.
- Clinical Cognition Alignment for Gastrointestinal Diagnosis with Multimodal LLMs
CogAlign uses hierarchical supervised fine-tuning on clinical cognition data plus counterfactual RL to align MLLMs with expert diagnostic pathways and enforce causal lesion grounding for GI endoscopy diagnosis.
- Accuracy Improvement of Cell Image Segmentation Using Feedback Former
Feedback Former improves cell image segmentation accuracy by feeding detailed feature maps back from near the output to lower transformer layers, outperforming non-feedback baselines with lower computational cost on three datasets.
- MsEdF: A Multi-stream Encoder-decoder Framework for Remote Sensing Image Captioning
MsEdF combines two complementary image encoders for feature diversity and a stacked GRU decoder with element-wise aggregation to improve remote sensing image captioning on three benchmark datasets.
- Generalized SAM: Efficient Fine-Tuning of SAM for Variable Input Image Sizes
GSAM applies random cropping to enable variable input sizes for efficient SAM fine-tuning, claiming lower compute with comparable or higher accuracy on varied datasets.