A self-supervised method pretrains an encoder on eight PSP images per view to learn generalizable subsurface scattering representations that transfer to relighting and dense footprint reconstruction on unseen complex objects.
U-Net: Convolutional Networks for Biomedical Image Segmentation
Mixed citation behavior. Most common role is background (43%).
abstract
There is large consent that successful training of deep networks requires many thousand annotated training samples. In this paper, we present a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently. The architecture consists of a contracting path to capture context and a symmetric expanding path that enables precise localization. We show that such a network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks. Using the same network trained on transmitted light microscopy images (phase contrast and DIC) we won the ISBI cell tracking challenge 2015 in these categories by a large margin. Moreover, the network is fast. Segmentation of a 512x512 image takes less than a second on a recent GPU. The full implementation (based on Caffe) and the trained networks are available at http://lmb.informatik.uni-freiburg.de/people/ronneber/u-net .
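The contracting and expanding paths described in the abstract can be illustrated with a little size arithmetic. The sketch below is not taken from the authors' Caffe code. It assumes the configuration shown in the paper's architecture figure: four resolution levels, two unpadded 3x3 convolutions per level, 2x2 max pooling on the way down and 2x up-convolution on the way up. The function name `unet_output_size` and its parameters are hypothetical and used only for illustration.

```python
def unet_output_size(input_size, depth=4, convs_per_level=2, kernel=3):
    """Trace the spatial side length of a square tile through a U-Net.

    Assumes unpadded convolutions, 2x2 pooling and 2x up-convolution
    (an illustrative reading of the architecture, not the reference code).
    """
    # Each unpadded convolution removes (kernel - 1) pixels per side length.
    shrink = convs_per_level * (kernel - 1)
    size = input_size
    skip_sizes = []

    # Contracting path: convolve, remember the feature-map size, then pool.
    for _ in range(depth):
        size -= shrink
        if size <= 0 or size % 2:
            raise ValueError(f"tile size {input_size} does not pool evenly")
        skip_sizes.append(size)
        size //= 2

    # Bottleneck convolutions at the lowest resolution.
    size -= shrink
    if size <= 0:
        raise ValueError(f"tile size {input_size} is too small")

    # Expanding path: upsample, crop the skip connection to match, convolve.
    for skip in reversed(skip_sizes):
        size *= 2
        if skip < size:
            raise ValueError("skip feature map is smaller than upsampled map")
        size -= shrink

    return size


print(unet_output_size(572))  # 388
```

Under these assumptions a 572-pixel input tile yields a 388-pixel output map, so the skip connections have to be cropped before concatenation and larger images have to be processed in overlapping tiles.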
representative citing papers
A U-Net surrogate with multigroup attention pooling is trained on OpenMC sensitivity data and combined with gradient optimization to generate grid-based critical experiment geometries that achieve c_k values up to 0.97757 for HALEU fuel validation.
The work demonstrates that multi-tracer field-level SBI on galaxy and HI maps yields 2-7 times better constraints on Omega_m and sigma_8 than single-tracer or summary-statistic approaches, with 3D maps performing best.
LatentHDR generates structurally consistent panoramic HDR images by producing one scene latent with a diffusion backbone then deterministically mapping it to multiple exposure latents via a lightweight conditional head.
EchoXFlow is a new dataset of 37,125 beamspace echocardiography recordings with separable modalities, Doppler data, ECG, and clinical annotations that enables acquisition-aware learning not possible with standard scan-converted videos.
Influpaint uses generative diffusion models on image-encoded influenza data to produce realistic and diverse epidemic trajectories that match leading ensemble methods in accuracy.
VitaminP uses paired H&E-mIF data to train a model that transfers molecular boundary information, enabling accurate whole-cell segmentation directly from routine H&E histology across 34 cancer types.
A modified DCGAN with an auxiliary discriminator using the membrane factor generates stable, previously unseen funicular shells optimized for pure compression in three dimensions.
A U-Net-based ML pipeline reconstructs the complete phase field and quantized vortex charges in 2D Bose-Einstein condensates from density snapshots alone, using synthetic training data from projected Gross-Pitaevskii simulations.
Dual Triangle Attention achieves effective bidirectional attention with built-in positional inductive bias via dual triangular masks, outperforming standard bidirectional attention on position-sensitive tasks and showing strong masked language modeling results with or without positional embeddings.
Defines diffusion processes on implicit data manifolds via proximity-graph approximations to the infinitesimal generator and carré-du-champ operator, proves convergence in law to the continuous manifold process, and provides an Euler-Maruyama integrator validated on synthetic and MNIST manifolds.
A CNN-based discrete diffusion method refines sparse contours from segmentation masks using simplified denoising steps and minimal post-processing, outperforming baselines on small medical and environmental datasets while running 3.5 times faster.
A diffusion model trained on real radio galaxy images reconstructs high-fidelity interferometric observations from VLA, EHT, and ALMA simulations and outperforms CLEAN on gridded visibilities.
SemanticBridge provides a new 3D dataset for bridge component segmentation and quantifies sensor-induced domain gaps that drop model performance by up to 11.4% mIoU.
Standard visual diffusion models operating in pixel space can approximate solutions to the inscribed square, Steiner tree, and simple polygon problems.
A U-Net GAN reconstructs CMB T and E maps from Planck-like simulations with foregrounds and systematics, achieving under 1% error outside the Galactic region and demonstrating first-time correction for non-circular beams and asymmetric scans.
SinkSAM-Net uses topographic priors and SAM with coordinate-wise bounding box jittering to create pseudo-labels for iterative self-supervised training of an EfficientNetV2-UNet, reaching about 95% of fully supervised performance on sinkhole datasets.
X-Mind proposes an efficient internal visual chain-of-thought using compressed BEV sketches and recurrent block diffusion to embed predictive world models into end-to-end driving policies.
21cmEMUv3 emulates the cylindrical 21cm power spectrum via score-based diffusion and six other 21cmFAST observables via LSTM networks at sub-percent accuracy, then uses the emulator to infer a lower limit on soft-band X-ray luminosity from HERA data.
A multimodal 3D foundation model pretrained on LSM volumes via masked reconstruction and image-text alignment enables improved few-shot segmentation, classification, and deblurring.
Mask R-CNN with ResNet-50 pre-trained on MethaneAIR and fine-tuned on MethaneSAT, plus physics-informed postprocessing, yields instance-level precision 0.60/recall 0.98 at baseline, improving to 0.71/0.94 and 0.92/0.70 in two operational modes.
Normalizing flows enable all-order QED corrections in lattice scalar QED in 2-4 dimensions with reduced variance and transferability from small to large lattices.
REPA-P aligns intermediate representations in diffusion models with physical states using first-principles PDE residuals to accelerate convergence and boost out-of-distribution robustness on PDE tasks.
SegRAG is a training-free retrieval-augmented framework that extracts class-specific point prompts from a filtered DINOv3 feature bank to boost SAM3 semantic segmentation performance on standard and agricultural benchmarks.
citing papers explorer
- Field-level multi-tracers simulation-based inference of cosmological parameters from 3D maps
  The work demonstrates that multi-tracer field-level SBI on galaxy and HI maps yields 2-7 times better constraints on Omega_m and sigma_8 than single-tracer or summary-statistic approaches, with 3D maps performing best.
- 21cmEMUv3: a hybrid diffusion-LSTM emulator of 21cmFAST summary observables
  21cmEMUv3 emulates the cylindrical 21cm power spectrum via score-based diffusion and six other 21cmFAST observables via LSTM networks at sub-percent accuracy, then uses the emulator to infer a lower limit on soft-band X-ray luminosity from HERA data.
- MG-NECOLA: A Field-Level Emulator for $f(R)$ Gravity and Massive Neutrino Cosmologies
  A field-level CNN emulator converts MG-PICOLA runs into near N-body accuracy for f(R) gravity and neutrino cosmologies, achieving sub-percent errors on power spectra and bispectra while generalizing beyond its training set.
- Machine Learning Techniques for Astrophysics and Cosmology: Lyman-$\alpha$ forest
  Review of machine learning applications for analyzing Lyman-alpha forest observations to probe cosmology, reionization, and dark matter.