AuraMask produces 40 aesthetic anti-facial recognition filters that match or exceed prior adversarial effectiveness and achieve significantly higher user acceptance in a 630-person study.
Attention U-Net: Learning Where to Look for the Pancreas
Mixed citation behavior. Most common role is background (56%).
abstract
We propose a novel attention gate (AG) model for medical imaging that automatically learns to focus on target structures of varying shapes and sizes. Models trained with AGs implicitly learn to suppress irrelevant regions in an input image while highlighting salient features useful for a specific task. This enables us to eliminate the necessity of using explicit external tissue/organ localisation modules of cascaded convolutional neural networks (CNNs). AGs can be easily integrated into standard CNN architectures such as the U-Net model with minimal computational overhead while increasing the model sensitivity and prediction accuracy. The proposed Attention U-Net architecture is evaluated on two large CT abdominal datasets for multi-class image segmentation. Experimental results show that AGs consistently improve the prediction performance of U-Net across different datasets and training sizes while preserving computational efficiency. The code for the proposed architecture is publicly available.
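The gating idea in the abstract can be illustrated with a minimal sketch in plain Python. The abstract does not state the formula, so the additive form used here (a ReLU over summed linear projections of the skip feature and the gating signal, followed by a sigmoid) is an assumption, as is the simplification that the gating signal already has the same resolution as the skip feature. The names `attention_gate`, `W_x`, `W_g`, `psi` and all weight values are hypothetical and hand-picked for illustration, not learned.

```python
import math


def sigmoid(q):
    """Numerically stable logistic function."""
    if q >= 0:
        return 1.0 / (1.0 + math.exp(-q))
    e = math.exp(q)
    return e / (1.0 + e)


def linear(weights, vec):
    """Apply a matrix (list of rows) to a vector: a 1x1 convolution at one pixel."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def attention_gate(x, g, W_x, W_g, b, psi, b_psi):
    """Additive attention gate over a flat list of pixels.

    x: skip-connection features, one list of channels per pixel
    g: gating features, one list of channels per pixel (same pixel count as x)
    Returns (gated features, attention coefficients in (0, 1)).
    """
    gated, alphas = [], []
    for x_i, g_i in zip(x, g):
        # Project both inputs to a shared intermediate space and add them.
        summed = [a + c + bias
                  for a, c, bias in zip(linear(W_x, x_i), linear(W_g, g_i), b)]
        hidden = [max(0.0, h) for h in summed]  # ReLU
        # Collapse to one scalar per pixel, then squash it to (0, 1).
        alpha = sigmoid(sum(p * h for p, h in zip(psi, hidden)) + b_psi)
        alphas.append(alpha)
        # Scale every channel of the skip feature by the same coefficient.
        gated.append([alpha * v for v in x_i])
    return gated, alphas


# Hypothetical hand-picked weights: identity projections, psi sums the hidden units.
identity = [[1.0, 0.0], [0.0, 1.0]]
x = [[1.0, 1.0], [1.0, 1.0]]    # two pixels with identical skip features
g = [[1.0, 1.0], [-3.0, -3.0]]  # gating signal supports pixel 0, opposes pixel 1
gated, alphas = attention_gate(x, g, identity, identity, [0.0, 0.0], [4.0, 4.0], -4.0)
print(alphas)
```

In this toy input both pixels carry the same skip feature, so any difference in the output comes from the gating signal alone: the pixel it supports passes almost unchanged, while the other is scaled close to zero. In a network the projections would be learned 1x1 convolutions and the gating signal would typically be resampled to the skip feature's resolution first.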
representative citing papers
TopoU-Net is a rank-path U-Net for combinatorial complexes that encodes by lifting cochains upward along incidences, decodes by transporting downward, and merges via skip connections at matched ranks.
XAttnRes introduces cross-stage attention residuals that maintain a global feature history and selectively aggregate prior representations, improving medical image segmentation and performing on par with baselines even without skip connections.
GDLA delivers state-of-the-art accuracy on CT, MRI, ultrasound and dermoscopy segmentation benchmarks while keeping linear O(N) complexity in a PVT encoder-decoder.
Variational Regularization imposes an adaptive information bottleneck on noisy intermediate features in DP3-UNet and DP3-DiT policies, consistently raising task success rates on RoboTwin2.0, Adroit, and MetaWorld while achieving new state-of-the-art results.
S2M-Net achieves state-of-the-art Dice scores on 16 medical datasets across 8 modalities using a 4.7M-parameter spectral-spatial mixer and morphology-aware adaptive loss, outperforming transformers with 3.5-6x fewer parameters.
StruMPL is a multi-task dense regression model that jointly addresses disjoint partial supervision, MNAR labels, and inter-task physical constraints for improved forest biomass estimation from Earth observation.
A spectral vision transformer achieves comparable or superior performance with fewer parameters than standard ViTs, CNNs, and other models by using spectral projections for tokenization in limited-data medical imaging.
A frequency-enhanced Vision Transformer with FDSA, FGMLP, WAFF, and FCSB modules delivers superior volumetric medical image segmentation performance and efficiency over prior state-of-the-art methods.
ESICA delivers state-of-the-art accuracy on a five-modality 3D medical segmentation benchmark while offering a compact variant with far fewer parameters.
Recoverability maps use synthetic sweeps of viewing angles and artifacts to quantify the recoverable fraction of parameter space for license plate restoration; the best model succeeds on 93% of that space, and geometry rather than architecture choice sets the limit.
SPD improves SAM segmentation robustness to noisy prompts by learning anatomical saliency priors, distilling consensus prompts from adjacent slices, and enforcing pairwise slice consistency.
SemBugger achieves polymorphic backdoors in semantic communication via graded-intensity trigger poisoning and hierarchical loss, plus a noise-based defense with a theoretical efficacy bound.
CDSA-Net decouples vascular structure extraction and background restoration in coronary DSA via hierarchical geometric priors and adaptive noise modeling to eliminate artifacts while preserving tissue fidelity.
GCNV-Net achieves state-of-the-art accuracy on multiple 3D medical segmentation benchmarks while cutting FLOPs by 56% and inference latency by 68% through dynamic nonvoid voxelization and geometric attention.
The paper defines the Conformal Hallucination Estimation Metric (CHEM) that localizes hallucination-prone regions in image reconstruction models via multiscale representations and distribution-free conformal regression.
NTRM combines CNNs with tissue-level graph neural networks to model inter-tissue relationships, delivering 4.9% to 31.25% higher Dice scores than prior methods on a non-melanoma skin cancer histology segmentation benchmark.
SAMRI fine-tunes only the mask decoder of SAM on 1.1 million MRI slices from 30 datasets to reach mean DSC 0.87 on 47 targets and strong zero-shot performance.
GalCatDiff applies category embeddings and a novel Astro-RAB block inside diffusion models to produce galaxy images whose color and size distributions match observations more closely than prior generative approaches.
St-EDNet recovers sharp images from misaligned blurry intensity images and event streams by performing coarse cross-modal stereo alignment followed by fine bidirectional feature reconstruction.
M²SNet uses intra- and inter-layer multi-scale subtraction units plus a training-free LossNet to generate difference features that reduce redundancy in decoder fusion for medical segmentation.
MHMamba combines a U-Net with multi-head Mamba, channel calibration, and adaptive skip fusion to improve 3D brain tumor segmentation accuracy and small-lesion sensitivity on BraTS datasets while retaining linear complexity.
A pipeline uses Mask2Former flood masks and DEMs to compute a single water surface elevation, then derives local depths under hydrostatic equilibrium.
A masked-diffusion pretrained convolutional model outperforms ViT pathology foundation models on cell-level dense prediction tasks in histology.
citing papers explorer
- Learning Parallax for Stereo Event-based Motion Deblurring
  St-EDNet recovers sharp images from misaligned blurry intensity images and event streams by performing coarse cross-modal stereo alignment followed by fine bidirectional feature reconstruction.
- M²SNet: Multi-scale in Multi-scale Subtraction Network for Medical Image Segmentation
  M²SNet uses intra- and inter-layer multi-scale subtraction units plus a training-free LossNet to generate difference features that reduce redundancy in decoder fusion for medical segmentation.