deSEO: Physics-Aware Dataset Creation for High-Resolution Satellite Image Shadow Removal
Pith reviewed 2026-05-08 01:14 UTC · model grok-4.3
The pith
deSEO builds the first paired dataset for removing shadows from high-resolution satellite images by aligning clear reference acquisitions with shadowed ones through geometric and physics constraints.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
deSEO derives the first reproducible, geometry-aware paired dataset for satellite shadow removal from existing shadow detection data. It selects a minimally shadowed acquisition as a weak reference, applies Jacobian-based orientation normalisation and LoFTR-RANSAC registration, and restricts supervision to per-pixel validity masks. The dataset is paired with a DSM-aware deshadowing model that uses residual translation, perceptual objectives, and mask-constrained adversarial learning, and that converges where a directly adapted UAV-based architecture fails.
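To make "restricting supervision to per-pixel validity masks" concrete, here is a minimal sketch of a mask-restricted L1 objective. It is an illustration only: the function name `masked_l1`, the nested-list image layout, and the choice of L1 are assumptions, not the paper's stated loss.

```python
def masked_l1(pred, target, valid):
    """Mean absolute error over pixels flagged as valid.

    pred, target, valid are 2D lists of equal shape (hypothetical layout).
    Pixels where valid is falsy contribute nothing to the loss.
    """
    total = 0.0
    count = 0
    for p_row, t_row, v_row in zip(pred, target, valid):
        for p, t, v in zip(p_row, t_row, v_row):
            if v:
                total += abs(p - t)
                count += 1
    # With no valid pixels there is nothing to supervise; return zero loss.
    return total / count if count else 0.0


# Toy example: the right-hand column is badly misaligned and masked out.
pred = [[0.5, 0.9], [0.2, 0.0]]
target = [[0.4, 0.1], [0.2, 1.0]]
valid = [[1, 0], [1, 0]]
loss = masked_l1(pred, target, valid)
```

In the toy example, the large errors in the masked column do not affect the result, which is the behaviour the validity mask is meant to provide.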
What carries the argument
The deSEO pipeline that selects a minimally shadowed reference tile and registers it to shadowed acquisitions via temporal filtering, Jacobian orientation normalisation, LoFTR-RANSAC alignment, and a per-pixel validity mask to generate reliable paired supervision.
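The reference-selection step can be sketched as picking, per tile, the acquisition whose shadow mask covers the smallest fraction of pixels. This is a simplified reading: the names `shadow_fraction` and `select_reference` and the dictionary layout are hypothetical, and the temporal and geometric filtering described in the paper is omitted.

```python
def shadow_fraction(mask):
    """Fraction of pixels marked as shadow in a binary 2D mask."""
    pixels = [v for row in mask for v in row]
    return sum(1 for v in pixels if v) / len(pixels)


def select_reference(acquisitions):
    """Pick the least-shadowed acquisition as the weak reference.

    acquisitions: dict mapping an acquisition id to its binary shadow mask
    (assumed layout). Returns the reference id and (reference, shadowed)
    pairs for every other acquisition of the tile.
    """
    ref_id = min(acquisitions, key=lambda k: shadow_fraction(acquisitions[k]))
    pairs = [(ref_id, k) for k in acquisitions if k != ref_id]
    return ref_id, pairs


# Toy tile with three acquisitions; "b" has the least shadow.
tile = {
    "a": [[1, 1], [0, 0]],
    "b": [[0, 0], [0, 1]],
    "c": [[1, 1], [1, 0]],
}
ref_id, pairs = select_reference(tile)
```

A real pipeline would additionally reject candidates that are too far apart in time or viewing geometry before forming pairs.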
If this is right
- Supervised models can now be trained directly on satellite viewpoint variability instead of relying on unpaired or weakly supervised formulations.
- Downstream tasks including land classification, object detection, and 3D reconstruction receive inputs with reduced shadow artifacts.
- The same selection and registration steps can generate additional paired data from other temporal satellite collections.
- DSM integration in the model improves handling of terrain-induced shadows compared with purely image-based approaches.
Where Pith is reading between the lines
- The method could be adapted to correct other illumination variations such as seasonal or atmospheric effects in multi-date satellite stacks.
- Extending the validity mask concept might support joint training across different satellite sensors with varying off-nadir angles.
- The paired data opens the possibility of benchmarking multiple shadow removal architectures under consistent satellite geometry conditions.
Load-bearing premise
That choosing a minimally shadowed image as reference and performing the described registration steps produces alignments accurate enough for pixel-level training despite leftover parallax and scene changes over time.
What would settle it
Quantitative comparison of shadow removal results on a held-out collection of satellite images that includes independently verified shadow-free acquisitions or dense shadow boundary annotations, measuring metrics such as structural similarity and perceptual error against the current baseline.
Original abstract
Shadows cast by terrain and tall structures remain a major obstacle for high-resolution satellite image analysis, degrading classification, detection, and 3D reconstruction performance. Public resources offering geometry-consistent paired shadow/shadow-free satellite imagery are essentially missing, and most Earth-observation datasets are designed for shadow detection or 3D modelling rather than removal. Existing deep shadow-removal datasets either target ground-level or aerial scenes or rely on unpaired and weakly supervised formulations rather than explicit satellite pairs. We address this gap with deSEO, a geometry-aware and physics-informed methodology that, to the best of our knowledge, is the first to derive paired supervision for satellite shadow removal from the S-EO shadow detection dataset through a fully replicable pipeline. For each tile, deSEO selects a minimally shadowed acquisition as a weak reference and pairs it with shadowed counterparts using temporal and geometric filtering, Jacobian-based orientation normalisation, and LoFTR-RANSAC registration. A per-pixel validity mask restricts learning to reliably aligned regions, enabling supervision despite residual off-nadir parallax. In addition to this paired dataset, we develop a DSM-aware deshadowing model that combines residual translation, perceptual objectives, and mask-constrained adversarial learning. In contrast, a direct adaptation of a UAV-based SRNet/pix2pix architecture fails to converge under satellite viewpoint variability. Our model consistently reduces the visual impact of cast shadows across diverse illumination and viewing conditions, achieving improved structural and perceptual fidelity on held-out scenes. deSEO therefore provides the first reproducible, geometry-aware paired dataset and baseline for shadow removal in satellite Earth observation.
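The RANSAC half of the LoFTR-RANSAC registration can be illustrated with a deliberately reduced model. The sketch below assumes that point matches are already available (in the paper they would come from LoFTR) and fits a pure translation rather than the richer transform a real registration would likely use. The function name, threshold, and iteration count are all illustrative assumptions.

```python
import random


def ransac_translation(matches, threshold=1.5, iterations=200, seed=0):
    """Robustly estimate a 2D shift from noisy point matches.

    matches: non-empty list of ((x_src, y_src), (x_dst, y_dst)).
    Returns (dx, dy, inlier_indices). Translation-only is a simplification.
    """
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        # Hypothesise a shift from one randomly sampled match.
        (xs, ys), (xd, yd) = rng.choice(matches)
        dx, dy = xd - xs, yd - ys
        # Count matches whose residual under this shift is small.
        inliers = [
            i
            for i, ((x0, y0), (x1, y1)) in enumerate(matches)
            if ((x1 - x0 - dx) ** 2 + (y1 - y0 - dy) ** 2) ** 0.5 <= threshold
        ]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refit the shift as the mean displacement over the best inlier set.
    n = len(best_inliers)
    dx = sum(matches[i][1][0] - matches[i][0][0] for i in best_inliers) / n
    dy = sum(matches[i][1][1] - matches[i][0][1] for i in best_inliers) / n
    return dx, dy, best_inliers


# Eight matches consistent with a (3, -2) shift plus two gross outliers.
points = [(0, 0), (5, 1), (2, 7), (9, 4), (4, 4), (7, 8), (1, 3), (6, 2)]
matches = [((x, y), (x + 3, y - 2)) for x, y in points]
matches += [((0, 0), (20, 20)), ((5, 5), (-10, 1))]
dx, dy, inliers = ransac_translation(matches)
```

Matches rejected as outliers here are the kind of region a per-pixel validity mask would later exclude from supervision.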
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces deSEO, a pipeline that derives paired shadow/shadow-free supervision for high-resolution satellite imagery from the public S-EO dataset. For each tile it selects a minimally shadowed acquisition as weak reference, applies Jacobian orientation normalisation and LoFTR-RANSAC registration, and restricts supervision to a per-pixel validity mask; a DSM-aware residual network is then trained with perceptual and adversarial losses. The authors claim this yields the first reproducible geometry-aware paired dataset and a baseline model that improves structural and perceptual fidelity on held-out scenes where direct adaptation of UAV shadow-removal architectures fails.
Significance. If the masked pairs prove sufficiently accurate, deSEO would supply the first publicly replicable supervised training resource for satellite shadow removal, addressing a documented gap between existing shadow-detection/3D datasets and removal tasks. The reproducible pipeline and DSM integration are concrete strengths that could accelerate follow-on work in Earth-observation restoration.
major comments (2)
- [Abstract and §4 (evaluation)] The statements that the model 'consistently reduces the visual impact of cast shadows' and achieves 'improved structural and perceptual fidelity' are not supported by any reported quantitative metrics (PSNR/SSIM values, error bars, ablation tables) or by a failure-case analysis. Without these numbers the central claim that the DSM-aware model outperforms adapted baselines cannot be assessed.
- [§3 (dataset construction)] The per-pixel validity mask is asserted to 'enable supervision despite residual off-nadir parallax,' yet no alignment-error statistics (e.g., mean pixel displacement, fraction of valid pixels per tile, or non-shadow region consistency checks) are supplied. Because the quality of the weak-reference pairs is load-bearing for the 'first reproducible geometry-aware paired dataset' claim, the absence of these diagnostics leaves the training-signal fidelity unverified.
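As an illustration of the kind of number the first major comment asks for, the sketch below computes PSNR over validly aligned pixels only. It assumes intensities scaled to [0, 1] and a nested-list image layout; the name `masked_psnr` is hypothetical, and the paper does not state that it evaluates this way.

```python
import math


def masked_psnr(pred, target, valid, max_value=1.0):
    """PSNR in dB computed only over pixels flagged as valid."""
    squared_error = 0.0
    count = 0
    for p_row, t_row, v_row in zip(pred, target, valid):
        for p, t, v in zip(p_row, t_row, v_row):
            if v:
                squared_error += (p - t) ** 2
                count += 1
    if count == 0:
        raise ValueError("validity mask selects no pixels")
    mse = squared_error / count
    if mse == 0.0:
        return math.inf  # identical on every valid pixel
    return 10.0 * math.log10(max_value ** 2 / mse)


# One valid pixel with error 0.1; the masked pixel's error of 1.0 is ignored.
psnr = masked_psnr([[0.5, 0.0]], [[0.4, 1.0]], [[1, 0]])
```

Reporting such a metric with and without the mask would also show how much of the measured error comes from misaligned regions rather than from shadow-removal quality.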
minor comments (2)
- [§3] Notation for the Jacobian-based orientation normalisation and the exact form of the mask threshold should be defined with an equation or pseudocode for full replicability.
- [§3] The manuscript would benefit from a table listing the number of tiles, average valid-pixel fraction, and temporal separation statistics for the derived pairs.
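The diagnostics requested for §3 (valid-pixel fraction, post-registration displacement, temporal separation) could be gathered with something like the following sketch. The record fields `valid`, `reference_date`, and `shadowed_date` and all function names are assumptions made for illustration, and the example values are synthetic rather than taken from the dataset.

```python
import math
from datetime import date


def valid_fraction(valid):
    """Fraction of pixels kept by a binary validity mask."""
    pixels = [v for row in valid for v in row]
    return sum(1 for v in pixels if v) / len(pixels)


def mean_residual(matches, dx=0.0, dy=0.0):
    """Mean distance in pixels between matched points after applying a shift."""
    dists = [
        math.hypot(x1 - x0 - dx, y1 - y0 - dy)
        for (x0, y0), (x1, y1) in matches
    ]
    return sum(dists) / len(dists)


def summarise_pairs(pairs):
    """Aggregate per-pair records into dataset-level statistics."""
    n = len(pairs)
    gaps = [abs((p["shadowed_date"] - p["reference_date"]).days) for p in pairs]
    return {
        "pairs": n,
        "mean_valid_fraction": sum(valid_fraction(p["valid"]) for p in pairs) / n,
        "mean_gap_days": sum(gaps) / n,
    }


# Synthetic records, for illustration only.
example_pairs = [
    {
        "valid": [[1, 1], [1, 0]],
        "reference_date": date(2020, 1, 1),
        "shadowed_date": date(2020, 1, 11),
    },
    {
        "valid": [[1, 0], [0, 0]],
        "reference_date": date(2020, 3, 1),
        "shadowed_date": date(2020, 2, 10),
    },
]
summary = summarise_pairs(example_pairs)
residual = mean_residual([((0, 0), (3, 4))])
```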
Simulated Author's Rebuttal
We thank the referee for the constructive feedback highlighting the need for stronger quantitative support and dataset validation. We address each major comment below and will revise the manuscript to incorporate the suggested additions.
Point-by-point responses
- Referee: [Abstract and §4 (evaluation)] The statements that the model 'consistently reduces the visual impact of cast shadows' and achieves 'improved structural and perceptual fidelity' are not supported by any reported quantitative metrics (PSNR/SSIM values, error bars, ablation tables) or by a failure-case analysis. Without these numbers the central claim that the DSM-aware model outperforms adapted baselines cannot be assessed.
  Authors: We agree that the current manuscript relies primarily on qualitative visual comparisons and on textual descriptions of baseline non-convergence to support the claims of reduced shadow impact and improved fidelity. To enable direct assessment of the DSM-aware model's performance, the revised version will add a quantitative evaluation table in §4 reporting PSNR, SSIM, LPIPS, and additional perceptual metrics on held-out scenes, with error bars from repeated training runs. It will also add an ablation study isolating the DSM component and an expanded failure-case analysis that includes quantitative error maps.
  Revision: yes
- Referee: [§3 (dataset construction)] The per-pixel validity mask is asserted to 'enable supervision despite residual off-nadir parallax,' yet no alignment-error statistics (e.g., mean pixel displacement, fraction of valid pixels per tile, or non-shadow region consistency checks) are supplied. Because the quality of the weak-reference pairs is load-bearing for the 'first reproducible geometry-aware paired dataset' claim, the absence of these diagnostics leaves the training-signal fidelity unverified.
  Authors: We acknowledge that aggregate alignment statistics are necessary to verify the fidelity of the derived pairs and the effectiveness of the validity mask. While §3 describes the temporal-geometric filtering, Jacobian normalisation, LoFTR-RANSAC registration, and mask construction, it does not report specific error metrics. In the revision we will add these diagnostics to §3, including mean post-registration pixel displacement, the fraction of valid pixels per tile, and consistency checks on non-shadow regions across the dataset.
  Revision: yes
Circularity Check
No circularity: pipeline applies external tools to public data without self-referential reductions
Full rationale
The paper describes a replicable pipeline that selects minimally-shadowed tiles from the public S-EO dataset, applies temporal/geometric filtering, Jacobian orientation normalisation, LoFTR-RANSAC registration, and a per-pixel validity mask to produce paired supervision. No equations, fitted parameters, or derivations are shown that reduce the output pairs or the DSM-aware deshadowing model (residual translation + perceptual + mask-constrained adversarial objectives) to the inputs by construction. The central claim of providing the 'first reproducible geometry-aware paired dataset' rests on the novelty of applying these standard external components to satellite shadow removal, which is independent of the target results and externally falsifiable via the cited public dataset and registration methods. No self-citations, uniqueness theorems, or ansatzes are invoked in a load-bearing way. This is a normal self-contained case.
Axiom & Free-Parameter Ledger
axioms (2)
- Domain assumption: minimally shadowed acquisitions exist and can be reliably identified as weak references for each tile.
- Domain assumption: Jacobian-based orientation normalisation and LoFTR-RANSAC registration produce alignment accurate enough for per-pixel supervision after masking.