Introduces a synthetic ground-truth dataset for CAM evaluation, proposes the ARCC composite metric, and presents RefineCAM, a method that aggregates layers to produce higher-resolution maps that outperform baselines.
32 Pith papers cite this work.
representative citing papers
PromptDx adds a differentiable adapter to align multimodal data with a pre-trained TabPFN-style ICL engine, achieving strong Alzheimer's diagnosis performance with only 1% context samples.
A new open-access landscape concept dataset enables the first application of Robust TCAV to deep learning species distribution models, validating predictions against expert knowledge and uncovering novel ecological associations for two aquatic insect groups.
MoCA3D formulates monocular 3D box prediction as dense pixel-space tasks using corner heatmaps and depth maps, with a new PAG metric for image-plane evaluation.
Echo4DIR reconstructs continuous 4D cardiac geometry from sparse 2D echocardiography videos using implicit representations, epipolar feature fusion, self-supervised domain adaptation, and radial SDF alignment to achieve up to 98.35% Dice overlap.
Pretrained autoencoders in medical latent diffusion encode discriminative features well for reconstruction but structure their latent spaces in ways that hinder classifier learning, a gap that persists across architectures and is not closed by domain fine-tuning.
A verifiable CBM framework grounds concepts in localized image patches, achieving comparable accuracy to standard CBMs on medical benchmarks while enabling direct inspection of concept correctness.
HashSCD is a patch-wise hashing method for unsupervised scene change detection and localization that operates directly in Hamming space with competitive performance and lower computational cost.
Counterfactual stress testing with causal generative models offers a more accurate proxy than simple perturbations for predicting medical image model performance under distribution shifts.
Hyp2Former learns hierarchical semantic similarities in hyperbolic space among known categories so that unknown objects remain close to higher-level concepts and can be detected reliably.
An asymmetric multi-level distillation framework lets a student ViT approximate clean-image representations from distorted inputs alone, outperforming prior methods on classification under distortions.
Logic Gate Networks produce compact Boolean-circuit descriptors for video copy detection that match or exceed prior accuracy at over 11k inferences per second and orders-of-magnitude smaller size.
LLMasTool improves neural architecture search by evolving code-mined hierarchical trees with diversity-guided Bayesian planning and targeted LLM assistance, reporting gains of 0.69, 1.83, and 2.68 points on CIFAR-10, CIFAR-100, and ImageNet16-120.
ESCAPE combines spatio-temporal fusion mapping for depth-free 3D memory with a memory-driven grounding module and adaptive execution policy to reach 65.09% success on ALFRED test-seen long-horizon mobile manipulation tasks.
GroundingAnomaly uses a Spatial Conditioning Module and Gated Self-Attention in a frozen diffusion U-Net to synthesize spatially accurate few-shot anomalies, reaching SOTA on MVTec AD and VisA for detection, segmentation, and instance detection.
PRISM-CTG is the first large-scale foundation model for cardiotocography that uses multi-view self-supervised learning on unlabeled data to learn transferable representations, outperforming baselines on seven downstream tasks with external validation.
A worst-group equalized odds regularizer targets extreme subgroup deviations in true and false positive rates to improve multi-attribute fairness in medical imaging while preserving AUC.
Hybrid Quantum-MambaVision combines Mamba SSM with a parameterized quantum adapter for improved multi-label classification and calibration on the imbalanced MixedWM38 wafer defect dataset.
FedSurrogate defends federated learning against backdoors by clustering on security-critical layers and substituting malicious updates with benign surrogates, reporting false-positive rates below 10% and attack success below 2.1% under non-IID conditions.
STBIR fuses sketches and text via curriculum robustness, category optimization, and staged alignment to outperform prior methods on a new fine-grained benchmark dataset.
Hybrid quantum-classical models using structured entanglement keep high accuracy on MNIST, OrganAMNIST and CIFAR-10 while lowering adversarial attack success rates and raising the computational cost of generating attacks.
DBMF integrates scores from text-image and vision branches to improve out-of-distribution detection on endoscopic datasets by up to 24.84% over prior methods.
CogAlign uses hierarchical supervised fine-tuning on clinical cognition data plus counterfactual RL to align MLLMs with expert diagnostic pathways and enforce causal lesion grounding for GI endoscopy diagnosis.
Feedback Former improves cell image segmentation accuracy by feeding detailed feature maps back from near the output to lower transformer layers, outperforming non-feedback baselines with lower computational cost on three datasets.
citing papers explorer
- From Image Hashing to Scene Change Detection
  HashSCD is a patch-wise hashing method for unsupervised scene change detection and localization that operates directly in Hamming space with competitive performance and lower computational cost.
- Counterfactual Stress Testing for Image Classification Models
  Counterfactual stress testing with causal generative models offers a more accurate proxy than simple perturbations for predicting medical image model performance under distribution shifts.
- FedSurrogate: Backdoor Defense in Federated Learning via Layer Criticality and Surrogate Replacement
  FedSurrogate defends federated learning against backdoors by clustering on security-critical layers and substituting malicious updates with benign surrogates, reporting false-positive rates below 10% and attack success below 2.1% under non-IID conditions.