FUN: A Focal U-Net Combining Reconstruction and Object Detection for Snapshot Spectral Imaging
Pith reviewed 2026-05-07 07:14 UTC · model grok-4.3
The pith
A single network jointly reconstructs snapshot hyperspectral images and detects objects in them, using fewer parameters and less computation than prior two-stage pipelines.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that a U-shaped network built on focal modulation can serve as a shared backbone for simultaneous HSI reconstruction and object detection. Under joint training the two tasks mutually enhance each other, yielding state-of-the-art accuracy with substantially lower parameter counts and computational demands than prior methods.
What carries the argument
Focal modulation applied in a shared U-Net backbone for multi-task learning between spectral reconstruction and object detection.
If this is right
- Real-time object detection becomes feasible directly from snapshot HSIs without a separate reconstruction phase.
- The model requires 40% fewer parameters and 30% less computation than recent alternatives while maintaining top performance.
- The new dataset with 363 HSIs and 8712 annotations supports further development of joint methods.
- Edge deployment for hyperspectral sensing applications is more feasible due to the efficiency gains.
Where Pith is reading between the lines
- This joint training approach might extend to other paired tasks in imaging, such as denoising combined with segmentation.
- If focal modulation proves general, it could replace attention mechanisms in other U-Net variants for better efficiency across vision tasks.
- The architecture could support portable hyperspectral devices for real-world monitoring once optimized further for constrained hardware.
Load-bearing premise
Multi-task learning on a shared backbone with focal modulation will create positive interactions between the reconstruction and detection tasks.
What would settle it
A direct test of whether a two-stage pipeline (first reconstruct the HSI, then run a separate detector) achieves higher accuracy or lower total latency than end-to-end FUN on the new dataset; if it does, the joint-training premise fails.
Original abstract
Conventional push-broom hyperspectral imaging suffers from slow acquisition speeds, precluding real-time object detection; in contrast, snapshot spectral imaging enables instantaneous hyperspectral images (HSIs) capture, making real-time object detection feasible, yet its potential is often compromised by time-consuming post-capture reconstruction. To address this issue, we propose the Focal U-shaped Network (FUN), a novel end-to-end framework that jointly performs HSI reconstruction and object detection via multi-task learning. FUN employs a shared U-shaped backbone, where reconstruction provides underlying spectral information while detection guides semantic-aware priors learning, facilitating mutually beneficial task interaction. Crucially, we introduce focal modulation, an efficient alternative to self-attention that modulates spatial and spectral features while reducing quadratic computational complexity, enabling a self-attention-free architecture for joint reconstruction and detection. Furthermore, we contribute a new HSI object detection dataset with 8712 annotated objects across 363 HSIs to facilitate evaluation of the proposed method. Experiments demonstrate that FUN achieves state-of-the-art performance on both tasks, using 40% fewer parameters and 30% less computation than recent alternatives, making it promising for future real-time edge deployment. The code and datasets are available: https://github.com/ShawnDong98/FUN.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes FUN, a Focal U-shaped Network for joint hyperspectral image reconstruction and object detection in snapshot spectral imaging. It employs a shared U-Net backbone with focal modulation (replacing self-attention) to enable efficient multi-task learning where reconstruction supplies spectral details and detection supplies semantic priors. The authors also release a new dataset of 363 HSIs containing 8712 annotated objects and report state-of-the-art performance on both tasks together with 40% fewer parameters and 30% less computation than recent alternatives.
Significance. If the multi-task interaction and efficiency claims can be rigorously verified, the work would be significant for real-time edge deployment of snapshot hyperspectral systems, as it directly tackles the reconstruction bottleneck while maintaining detection accuracy. The public release of code and the new dataset is a clear positive that would facilitate follow-on research.
major comments (2)
- [Abstract and §4] Abstract and §4 (Experiments): The central claim that the shared backbone plus focal modulation produces 'mutually beneficial task interaction' and that 'detection guides semantic-aware priors learning' is load-bearing for the SOTA and efficiency assertions, yet no ablations are described that isolate joint training from single-task baselines, vary the loss-balancing coefficients, or test for negative transfer. Without these, the reported gains cannot be confidently attributed to the multi-task design rather than focal modulation alone or the new dataset.
- [§4] §4 (Experiments) and Table 1/2: The quantitative SOTA claims lack error bars, standard deviations across multiple runs, or statistical significance tests against baselines. This makes it impossible to determine whether the 40% parameter and 30% compute reductions are robust or sensitive to hyperparameter choices.
minor comments (2)
- [§3] §3 (Method): The description of focal modulation would benefit from a short equation or diagram showing how it modulates spatial-spectral features, as readers may not be familiar with the referenced prior work.
- [§4] §4: The new dataset is introduced without a table summarizing its statistics (e.g., number of classes, spectral bands, train/val/test splits), which would help readers assess its difficulty relative to existing HSI detection benchmarks.
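For readers unfamiliar with the mechanism the minor comment refers to, focal modulation (reference [14], Yang et al.) replaces the quadratic query-key attention map with an elementwise product q(x) * h(m(x)), where the modulator m(x) sums hierarchical context maps under learned spatial gates. The NumPy sketch below is illustrative only: average pooling stands in for the paper's depthwise-convolution context layers, and every projection is a random placeholder rather than a trained weight.

```python
import numpy as np

def box_mean(x, k):
    """Mean over a k x k neighborhood at every position (edge-padded)."""
    h, w, _ = x.shape
    p = k // 2
    xp = np.pad(x, ((p, p), (p, p), (0, 0)), mode="edge")
    out = np.empty_like(x)
    for i in range(h):
        for j in range(w):
            out[i, j] = xp[i:i + k, j:j + k].mean(axis=(0, 1))
    return out

def focal_modulation(x, num_levels=3, seed=0):
    """Sketch of focal modulation on a token grid x of shape (H, W, C):
    out = q(x) * h( sum_l gate_l(x) * context_l(x) + gate_G(x) * global_ctx ).
    Pooling stands in for the depthwise-conv context layers of the original;
    all projections are random placeholders."""
    h, w, c = x.shape
    rng = np.random.default_rng(seed)
    w_q = rng.standard_normal((c, c)) / np.sqrt(c)   # query projection
    w_h = rng.standard_normal((c, c)) / np.sqrt(c)   # modulator projection
    w_g = rng.standard_normal((c, num_levels + 1))   # per-level gate maps

    q = x @ w_q                          # (H, W, C)
    gates = x @ w_g                      # (H, W, L+1), one gate map per level

    modulator = np.zeros_like(x)
    for l in range(num_levels):          # growing receptive fields: 3, 5, 7
        ctx = box_mean(x, 2 * l + 3)
        modulator += gates[..., l:l + 1] * ctx
    modulator += gates[..., -1:] * x.mean(axis=(0, 1))   # global context level
    return q * (modulator @ w_h)         # elementwise modulation, no attention map

tokens = np.random.default_rng(1).standard_normal((8, 8, 16))
out = focal_modulation(tokens)
print(out.shape)   # (8, 8, 16): same shape as the input grid
```

Because each token only interacts with pooled summaries rather than with every other token, the cost grows linearly in the number of tokens, which is the efficiency argument the abstract makes against self-attention.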
Simulated Author's Rebuttal
We sincerely thank the referee for the constructive and detailed feedback on our manuscript. We have carefully reviewed each major comment and provide point-by-point responses below, outlining the revisions we will implement to strengthen the paper's claims regarding multi-task interaction and the robustness of the reported results.
Point-by-point responses
-
Referee: [Abstract and §4] Abstract and §4 (Experiments): The central claim that the shared backbone plus focal modulation produces 'mutually beneficial task interaction' and that 'detection guides semantic-aware priors learning' is load-bearing for the SOTA and efficiency assertions, yet no ablations are described that isolate joint training from single-task baselines, vary the loss-balancing coefficients, or test for negative transfer. Without these, the reported gains cannot be confidently attributed to the multi-task design rather than focal modulation alone or the new dataset.
Authors: We appreciate the referee's emphasis on rigorously isolating the contribution of multi-task learning. While the manuscript presents comparisons to recent single-task and multi-task baselines in Tables 1 and 2, we acknowledge that dedicated ablations isolating joint training effects were not included. In the revised manuscript, we will add a new ablation subsection in §4 that includes: (1) direct comparisons of the shared FUN backbone under joint training versus independently trained single-task reconstruction and detection models, (2) sweeps over loss-balancing coefficients (λ_rec and λ_det) to demonstrate robustness and optimal interaction, and (3) explicit checks confirming the absence of negative transfer. These additions will allow readers to attribute performance gains more confidently to the proposed multi-task design with focal modulation. revision: yes
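The promised ablation can be stated compactly. The sketch below uses the coefficient names from the rebuttal (λ_rec, λ_det) but placeholder loss values; it shows the weighted joint objective and the kind of coefficient sweep described, not the paper's training code.

```python
def joint_loss(recon_loss, det_loss, lambda_rec=1.0, lambda_det=1.0):
    """Weighted sum of the two task losses trained on the shared backbone:
    L = lambda_rec * L_rec + lambda_det * L_det. The rebuttal's ablation
    sweeps the lambdas to check robustness and look for negative transfer."""
    return lambda_rec * recon_loss + lambda_det * det_loss

# A coefficient sweep like the one promised for the revised Section 4
# (loss values here are placeholders, not measured quantities):
sweep = [(lr, ld, joint_loss(0.8, 0.3, lr, ld))
         for lr in (0.5, 1.0, 2.0) for ld in (0.5, 1.0, 2.0)]
```

Setting one coefficient to zero recovers the single-task baseline on the same backbone, which is exactly the comparison needed to isolate the multi-task effect.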
-
Referee: [§4] §4 (Experiments) and Table 1/2: The quantitative SOTA claims lack error bars, standard deviations across multiple runs, or statistical significance tests against baselines. This makes it impossible to determine whether the 40% parameter and 30% compute reductions are robust or sensitive to hyperparameter choices.
Authors: We thank the referee for this important point on statistical rigor. The reported 40% parameter reduction and 30% lower computation are deterministic architectural metrics (derived from model parameter counts and FLOPs) and are therefore insensitive to random seeds or hyperparameter variation. For the task-specific metrics (PSNR, SSIM, mAP, etc.), we will rerun all experiments using at least three different random seeds and report mean ± standard deviation in the updated Tables 1 and 2. We will also add pairwise statistical significance tests (e.g., paired t-tests with p-values) against the primary baselines. These updates will be incorporated into the revised §4 and tables. revision: yes
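The reporting protocol the authors commit to (mean ± standard deviation over seeds, plus paired significance tests) needs nothing beyond the standard library. The numbers below are illustrative placeholders, not results from the paper.

```python
import math
from statistics import mean, stdev

def mean_std(runs):
    """'mean +/- std' over per-seed metric values, as promised for Tables 1-2."""
    return mean(runs), stdev(runs)

def paired_t(a, b):
    """Paired t-statistic between matched per-seed scores of two methods."""
    d = [x - y for x, y in zip(a, b)]
    return mean(d) / (stdev(d) / math.sqrt(len(d)))

# Illustrative PSNR values over three seeds (placeholders, not paper data):
fun_psnr      = [35.1, 35.3, 35.2]
baseline_psnr = [34.6, 34.9, 34.7]
m, s = mean_std(fun_psnr)
t = paired_t(fun_psnr, baseline_psnr)
print(f"FUN: {m:.2f} +/- {s:.2f} dB, paired t = {t:.2f} (df = 2)")
```

With only three seeds the degrees of freedom are small, so the critical t-value is large; reporting the statistic alongside mean ± std lets readers judge whether a gap over a baseline is meaningful.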
Circularity Check
No circularity; empirical claims rest on architecture proposal and external benchmarks
Full rationale
The paper introduces a novel multi-task U-Net architecture with focal modulation for joint HSI reconstruction and object detection, contributes a new dataset, and reports empirical SOTA results with efficiency gains. No derivation chain exists that reduces predictions or uniqueness claims to self-defined quantities, fitted parameters renamed as outputs, or load-bearing self-citations. Performance assertions are validated via comparisons to prior external methods rather than internal equations or ansatzes that presuppose the result. The multi-task interaction is presented as a design choice evaluated experimentally, not derived by construction.
Axiom & Free-Parameter Ledger
free parameters (2)
- Network weights and hyperparameters
- Task loss balancing coefficients
axioms (2)
- Domain assumption: Focal modulation effectively captures spatial and spectral dependencies as an alternative to self-attention.
- Domain assumption: Joint multi-task optimization yields better results on both tasks than separate training.
Reference graph
Works this paper leans on
-
[1]
Snapshot multispectral endomicroscopy,
Z. Meng, M. Qiao, J. Ma, Z. Yu, K. Xu, and X. Yuan, “Snapshot multispectral endomicroscopy,” Optics Letters, vol. 45, no. 14, pp. 3897–3900, 2020
2020
-
[2]
Hyper-skin: a hyperspectral dataset for reconstructing facial skin-spectra from rgb images,
P. C. Ng, Z. Chi, Y. Verdie, J. Lu, and K. N. Plataniotis, “Hyper-skin: a hyperspectral dataset for reconstructing facial skin-spectra from rgb images,” Advances in Neural Information Processing Systems, vol. 36, 2024
2024
-
[3]
A novel spectral-spatial multi-scale network for hyperspectral image classification with the res2net block,
Z. Zhang, D. Liu, D. Gao, and G. Shi, “A novel spectral-spatial multi-scale network for hyperspectral image classification with the res2net block,” International Journal of Remote Sensing, vol. 43, no. 3, pp. 751–777, 2022
2022
-
[4]
No-reference hyperspectral image quality assessment via ranking feature learning,
Y. Li, Y. Dong, H. Li, D. Liu, F. Xue, and D. Gao, “No-reference hyperspectral image quality assessment via ranking feature learning,” Remote Sensing, vol. 16, no. 10, p. 1657, 2024
2024
-
[5]
3d imaging spectroscopy for measuring hyperspectral patterns on solid objects,
M. H. Kim, T. A. Harvey, D. S. Kittle, H. Rushmeier, J. Dorsey, R. O. Prum, and D. J. Brady, “3d imaging spectroscopy for measuring hyperspectral patterns on solid objects,” ACM Transactions on Graphics (TOG), vol. 31, no. 4, pp. 1–11, 2012
2012
-
[6]
Compressive coded aperture spectral imaging: An introduction,
G. R. Arce, D. J. Brady, L. Carin, H. Arguello, and D. S. Kittle, “Compressive coded aperture spectral imaging: An introduction,” IEEE Signal Processing Magazine, vol. 31, no. 1, pp. 105–115, 2013
2013
-
[7]
High-speed hyperspectral video acquisition with a dual-camera architecture,
L. Wang, Z. Xiong, D. Gao, G. Shi, W. Zeng, and F. Wu, “High-speed hyperspectral video acquisition with a dual-camera architecture,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 4942–4950
2015
-
[8]
U-net: Convolutional networks for biomedical image segmentation,
O. Ronneberger, P. Fischer, and T. Brox, “U-net: Convolutional networks for biomedical image segmentation,” in Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015, Part III. Springer, 2015, pp. 234–241
2015
-
[9]
λ-net: Reconstruct hyperspectral images from a snapshot measurement,
X. Miao, X. Yuan, Y. Pu, and V. Athitsos, “λ-net: Reconstruct hyperspectral images from a snapshot measurement,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, pp. 4059–4069
2019
-
[10]
End-to-end low cost compressive spectral imaging with spatial-spectral self-attention,
Z. Meng, J. Ma, and X. Yuan, “End-to-end low cost compressive spectral imaging with spatial-spectral self-attention,” in European Conference on Computer Vision. Springer, 2020, pp. 187–204
2020
-
[11]
Mask-guided spectral-wise transformer for efficient hyperspectral image reconstruction,
Y. Cai, J. Lin, X. Hu, H. Wang, X. Yuan, Y. Zhang, R. Timofte, and L. Van Gool, “Mask-guided spectral-wise transformer for efficient hyperspectral image reconstruction,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 17502–17511
2022
-
[12]
Coarse-to-fine sparse transformer for hyperspectral image reconstruction,
Y. Cai, J. Lin, X. Hu, H. Wang, X. Yuan, Y. Zhang, R. Timofte, and L. Van Gool, “Coarse-to-fine sparse transformer for hyperspectral image reconstruction,” in European Conference on Computer Vision. Springer, 2022, pp. 686–704
2022
-
[13]
Feature pyramid networks for object detection,
T.-Y. Lin, P. Dollár, R. Girshick, K. He, B. Hariharan, and S. Belongie, “Feature pyramid networks for object detection,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 2117–2125
2017
-
[14]
Focal modulation networks,
J. Yang, C. Li, X. Dai, and J. Gao, “Focal modulation networks,” Advances in Neural Information Processing Systems, vol. 35, pp. 4203–4217, 2022
2022
-
[15]
Degradation estimation recurrent neural network with local and non-local priors for compressive spectral imaging,
Y. Dong, D. Gao, Y. Li, G. Shi, and D. Liu, “Degradation estimation recurrent neural network with local and non-local priors for compressive spectral imaging,” IEEE Transactions on Geoscience and Remote Sensing, 2024
2024
-
[16]
Degradation-aware unfolding half-shuffle transformer for spectral compressive imaging,
Y. Cai, J. Lin, H. Wang, X. Yuan, H. Ding, Y. Zhang, R. Timofte, and L. V. Gool, “Degradation-aware unfolding half-shuffle transformer for spectral compressive imaging,” Advances in Neural Information Processing Systems, vol. 35, pp. 37749–37761, 2022
2022
-
[17]
Computational hyperspectral imaging based on dimension-discriminative low-rank tensor recovery,
S. Zhang, L. Wang, Y. Fu, X. Zhong, and H. Huang, “Computational hyperspectral imaging based on dimension-discriminative low-rank tensor recovery,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, pp. 10183–10192
2019
-
[18]
Hyperspectral compressive snapshot reconstruction via coupled low-rank subspace representation and self-supervised deep network,
Y. Chen, W. Lai, W. He, X.-L. Zhao, and J. Zeng, “Hyperspectral compressive snapshot reconstruction via coupled low-rank subspace representation and self-supervised deep network,” IEEE Transactions on Image Processing, 2024
2024
-
[19]
Spectral enhanced rectangle transformer for hyperspectral image denoising,
M. Li, J. Liu, Y. Fu, Y. Zhang, and D. Dou, “Spectral enhanced rectangle transformer for hyperspectral image denoising,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 5805–5814
2023
-
[20]
Faster nonconvex low-rank matrix learning for image low-level and high-level vision: A unified framework,
H. Zhang, J. Yang, J. Qian, C. Gong, X. Ning, Z. Zha, and B. Wen, “Faster nonconvex low-rank matrix learning for image low-level and high-level vision: A unified framework,” Information Fusion, vol. 108, p. 102347, 2024
2024
-
[21]
Efficient image classification via structured low-rank matrix factorization regression,
H. Zhang, J. Yang, J. Qian, G. Gao, X. Lan, Z. Zha, and B. Wen, “Efficient image classification via structured low-rank matrix factorization regression,” IEEE Transactions on Information Forensics and Security, vol. 19, pp. 1496–1509, 2023
2023
-
[22]
Accelerated palm for nonconvex low-rank matrix recovery with theoretical analysis,
H. Zhang, B. Wen, Z. Zha, B. Zhang, Y. Tang, G. Yu, and W. Du, “Accelerated palm for nonconvex low-rank matrix recovery with theoretical analysis,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 34, no. 4, pp. 2304–2317, 2023
2023
-
[23]
Efficient and effective nonconvex low-rank subspace clustering via svt-free operators,
H. Zhang, S. Li, J. Qiu, Y. Tang, J. Wen, Z. Zha, and B. Wen, “Efficient and effective nonconvex low-rank subspace clustering via svt-free operators,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 33, no. 12, pp. 7515–7529, 2023
2023
-
[24]
Enhanced acceleration for generalized nonconvex low-rank matrix learning,
H. Zhang, J. Yang, W. Du, B. Zhang, Z. Zha, and B. Wen, “Enhanced acceleration for generalized nonconvex low-rank matrix learning,” Chinese Journal of Electronics, vol. 34, no. 1, pp. 98–113, 2025
2025
-
[25]
Low-rank tensor meets deep prior: Coupling model-driven and data-driven methods for hyperspectral image reconstruction,
Y. Chen, F. Yuan, W. Lai, J. Zeng, W. He, and Q. Huang, “Low-rank tensor meets deep prior: Coupling model-driven and data-driven methods for hyperspectral image reconstruction,” IEEE Transactions on Circuits and Systems for Video Technology, pp. 1–1, 2025
2025
-
[26]
Dual-camera design for coded aperture snapshot spectral imaging,
L. Wang, Z. Xiong, D. Gao, G. Shi, and F. Wu, “Dual-camera design for coded aperture snapshot spectral imaging,” Applied Optics, vol. 54, no. 4, pp. 848–858, 2015
2015
-
[27]
A new twist: Two-step iterative shrinkage/thresholding algorithms for image restoration,
J. M. Bioucas-Dias and M. A. Figueiredo, “A new twist: Two-step iterative shrinkage/thresholding algorithms for image restoration,” IEEE Transactions on Image Processing, vol. 16, no. 12, pp. 2992–3004, 2007
2007
-
[28]
Generalized alternating projection based total variation minimization for compressive sensing,
X. Yuan, “Generalized alternating projection based total variation minimization for compressive sensing,” in 2016 IEEE International Conference on Image Processing (ICIP). IEEE, 2016, pp. 2539–2543
2016
-
[29]
Rank minimization for snapshot compressive imaging,
Y. Liu, X. Yuan, J. Suo, D. J. Brady, and Q. Dai, “Rank minimization for snapshot compressive imaging,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 41, no. 12, pp. 2990–3006, 2018
2018
-
[30]
Combining low-rank and deep plug-and-play priors for snapshot compressive imaging,
Y. Chen, X. Gui, J. Zeng, X.-L. Zhao, and W. He, “Combining low-rank and deep plug-and-play priors for snapshot compressive imaging,” IEEE Transactions on Neural Networks and Learning Systems, 2023
2023
-
[31]
Plug-and-play algorithms for large-scale snapshot compressive imaging,
X. Yuan, Y. Liu, J. Suo, and Q. Dai, “Plug-and-play algorithms for large-scale snapshot compressive imaging,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 1447–1457
2020
-
[32]
Prior images guided generative autoencoder model for dual-camera compressive spectral imaging,
Y. Chen, Y. Wang, and H. Zhang, “Prior images guided generative autoencoder model for dual-camera compressive spectral imaging,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 34, no. 9, pp. 8629–8643, 2024
2024
-
[33]
Deep gaussian scale mixture prior for spectral compressive imaging,
T. Huang, W. Dong, X. Yuan, J. Wu, and G. Shi, “Deep gaussian scale mixture prior for spectral compressive imaging,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 16216–16225
2021
-
[34]
Residual degradation learning unfolding framework with mixing priors across spectral and spatial for compressive spectral imaging,
Y. Dong, D. Gao, T. Qiu, Y. Li, M. Yang, and G. Shi, “Residual degradation learning unfolding framework with mixing priors across spectral and spatial for compressive spectral imaging,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 22262–22271
2023
-
[35]
Alternating direction unfolding with a cross spectral attention prior for dual-camera compressive hyperspectral imaging,
Y. Dong, D. Gao, D. Liu, Y. Liu, and G. Shi, “Alternating direction unfolding with a cross spectral attention prior for dual-camera compressive hyperspectral imaging,” IEEE Transactions on Image Processing, vol. 34, pp. 5325–5340, 2025
2025
-
[36]
Progressive content-aware coded hyperspectral snapshot compressive imaging,
X. Zhang, B. Chen, W. Zou, S. Liu, Y. Zhang, R. Xiong, and J. Zhang, “Progressive content-aware coded hyperspectral snapshot compressive imaging,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 34, no. 11, pp. 10817–10830, 2024
2024
-
[37]
Dual-domain feature fusion and multi-level memory-enhanced network for spectral compressive imaging,
Y. Ying, J. Wang, Y. Shi, N. Ling, and B. Yin, “Dual-domain feature fusion and multi-level memory-enhanced network for spectral compressive imaging,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 34, no. 10, pp. 9562–9577, 2024
2024
-
[38]
Adaptive nonlocal sparse representation for dual-camera compressive hyperspectral imaging,
L. Wang, Z. Xiong, G. Shi, F. Wu, and W. Zeng, “Adaptive nonlocal sparse representation for dual-camera compressive hyperspectral imaging,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 10, pp. 2104–2111, 2016
2016
-
[39]
Exploring nonlocal group sparsity under transform learning for hyperspectral image denoising,
Y. Chen, W. He, X.-L. Zhao, T.-Z. Huang, J. Zeng, and H. Lin, “Exploring nonlocal group sparsity under transform learning for hyperspectral image denoising,” IEEE Transactions on Geoscience and Remote Sensing, vol. 60, pp. 1–18, 2022
2022
-
[40]
Fast large-scale hyperspectral image denoising via non-iterative low-rank subspace representation,
Y. Chen, J. Zeng, W. He, X.-L. Zhao, T.-X. Jiang, and Q. Huang, “Fast large-scale hyperspectral image denoising via non-iterative low-rank subspace representation,” IEEE Transactions on Geoscience and Remote Sensing, 2024
2024
-
[41]
Non-local means denoising,
A. Buades, B. Coll, and J.-M. Morel, “Non-local means denoising,” Image Processing On Line, vol. 1, pp. 208–212, 2011
2011
-
[42]
Thick cloud removal in multitemporal remote sensing images via low-rank regularized self-supervised network,
Y. Chen, M. Chen, W. He, J. Zeng, M. Huang, and Y.-B. Zheng, “Thick cloud removal in multitemporal remote sensing images via low-rank regularized self-supervised network,” IEEE Transactions on Geoscience and Remote Sensing, 2024
2024
-
[43]
You only look once: Unified, real-time object detection,
J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, “You only look once: Unified, real-time object detection,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 779–788
2016
-
[44]
Focal loss for dense object detection,
T.-Y. Ross and G. Dollár, “Focal loss for dense object detection,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 2980–2988
2017
-
[45]
Faster r-cnn: Towards real-time object detection with region proposal networks,
S. Ren, K. He, R. Girshick, and J. Sun, “Faster r-cnn: Towards real-time object detection with region proposal networks,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 6, pp. 1137–1149, 2016
2016
-
[46]
Fcos: A simple and strong anchor-free object detector,
Z. Tian, C. Shen, H. Chen, and T. He, “Fcos: A simple and strong anchor-free object detector,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 44, no. 4, pp. 1922–1933, 2020
2020
-
[47]
YOLOX: Exceeding YOLO Series in 2021,
Z. Ge, “Yolox: Exceeding yolo series in 2021,” arXiv preprint arXiv:2107.08430, 2021
2021
-
[48]
End-to-end object detection with transformers,
N. Carion, F. Massa, G. Synnaeve, N. Usunier, A. Kirillov, and S. Zagoruyko, “End-to-end object detection with transformers,” in European Conference on Computer Vision. Springer, 2020, pp. 213–229
2020
-
[49]
Deformable DETR: Deformable Transformers for End-to-End Object Detection,
X. Zhu, W. Su, L. Lu, B. Li, X. Wang, and J. Dai, “Deformable detr: Deformable transformers for end-to-end object detection,” arXiv preprint arXiv:2010.04159, 2020
2020
-
[50]
Image-adaptive yolo for object detection in adverse weather conditions,
W. Liu, G. Ren, R. Yu, S. Guo, J. Zhu, and L. Zhang, “Image-adaptive yolo for object detection in adverse weather conditions,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, no. 2, 2022, pp. 1792–1800
2022
-
[51]
Togethernet: Bridging image restoration and object detection together via dynamic enhancement learning,
Y. Wang, X. Yan, K. Zhang, L. Gong, H. Xie, F. L. Wang, and M. Wei, “Togethernet: Bridging image restoration and object detection together via dynamic enhancement learning,” in Computer Graphics Forum, vol. 41, no. 7. Wiley Online Library, 2022, pp. 465–476
2022
-
[52]
Image enhancement guided object detection in visually degraded scenes,
H. Liu, F. Jin, H. Zeng, H. Pu, and B. Fan, “Image enhancement guided object detection in visually degraded scenes,” IEEE Transactions on Neural Networks and Learning Systems, 2023
2023
-
[53]
Dsnet: Joint semantic learning for object detection in inclement weather conditions,
S.-C. Huang, T.-H. Le, and D.-W. Jaw, “Dsnet: Joint semantic learning for object detection in inclement weather conditions,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 43, no. 8, pp. 2623–2633, 2020
2020
-
[54]
Denet: detection-driven enhancement network for object detection under adverse weather conditions,
Q. Qin, K. Chang, M. Huang, and G. Li, “Denet: detection-driven enhancement network for object detection under adverse weather conditions,” in Proceedings of the Asian Conference on Computer Vision, 2022, pp. 2813–2829
2022
-
[55]
Fa-yolo: An improved yolo model for infrared occlusion object detection under confusing background,
S. Du, B. Zhang, P. Zhang, P. Xiang, and H. Xue, “Fa-yolo: An improved yolo model for infrared occlusion object detection under confusing background,” Wireless Communications and Mobile Computing, vol. 2021, no. 1, p. 1896029, 2021
2021
-
[56]
Joint-sparse-blocks and low-rank representation for hyperspectral unmixing,
J. Huang, T.-Z. Huang, L.-J. Deng, and X.-L. Zhao, “Joint-sparse-blocks and low-rank representation for hyperspectral unmixing,” IEEE Transactions on Geoscience and Remote Sensing, vol. 57, no. 4, pp. 2419–2438, 2018
2018