Dual-Integrated Low-Latency Single-Lens Infrared Computational Imaging for Object Detection
Pith reviewed 2026-05-22 06:57 UTC · model grok-4.3
The pith
PDI-Net merges reconstruction and detection in a single pipeline for low-latency infrared object detection.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
PDI-Net integrates infrared reconstruction with object detection by using a supervised U-Net only during training and a semi-U-Net encoder that shares multiscale features directly with a YOLO-based detector at inference time. A physics-aware large-small bridge (PALS-Bridge) uses field-dependent point spread function priors to adaptively modulate convolutional branches and bridge fidelity-oriented features with detection semantics. A physics-informed optical degradation simulation pipeline generates training data. On the M3FD benchmark under low-SNR conditions, the approach reduces inference time by 84.06 percent versus a pruned Rec+Det baseline while improving mAP@0.5:0.95 by 5.07 percent.
What carries the argument
The physics-aware large-small bridge (PALS-Bridge), which modulates multiscale convolutional branches with field-dependent point spread function priors to adapt reconstruction features for detection without full image output.
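The page does not say how the modulation is computed, so the following is only a minimal one-dimensional sketch of the general idea: a prior on PSF width, assumed here to grow linearly from field center to field edge, sets a per-position gate that blends a small-kernel branch with a large-kernel branch. The names `psf_sigma_at`, `gate_from_sigma`, and `bridge_sketch`, the sigmoid gate, and the fixed box filters standing in for learned kernels are all illustrative assumptions, not the authors' design.

```python
import math

def conv1d_same(signal, kernel):
    """Zero-padded 1D correlation that keeps the input length."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + j - half
            if 0 <= idx < len(signal):
                acc += w * signal[idx]
        out.append(acc)
    return out

def psf_sigma_at(position, length, sigma_center=0.8, sigma_edge=2.5):
    """Assumed prior: PSF width grows linearly from field center to field edge."""
    center = (length - 1) / 2.0
    radius = abs(position - center) / center if center > 0 else 0.0
    return sigma_center + (sigma_edge - sigma_center) * radius

def gate_from_sigma(sigma, midpoint=1.5, sharpness=2.0):
    """Hypothetical sigmoid gate: a wider PSF favors the large-kernel branch."""
    return 1.0 / (1.0 + math.exp(-sharpness * (sigma - midpoint)))

def bridge_sketch(features):
    """Blend small- and large-kernel responses with a per-position gate."""
    n = len(features)
    small = conv1d_same(features, [1.0 / 3.0] * 3)  # stand-in for a learned 3-tap branch
    large = conv1d_same(features, [1.0 / 7.0] * 7)  # stand-in for a learned 7-tap branch
    out = []
    for i in range(n):
        g = gate_from_sigma(psf_sigma_at(i, n))
        out.append(g * large[i] + (1.0 - g) * small[i])
    return out

impulse = [0.0] * 10 + [1.0] + [0.0] * 10
response = bridge_sketch(impulse)
print(round(response[10], 4))
```

In this toy setup the field center, where the assumed PSF is narrow, leans on the small kernel, and the field edge leans on the large one. A real module would operate on 2D feature maps with learned weights.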
If this is right
- Single-lens infrared cameras can weigh about 50 percent less than traditional multi-lens designs while supporting real-time detection.
- Inference latency drops enough for deployment on resource-constrained platforms without separate reconstruction steps.
- Detection accuracy holds or improves under low signal-to-noise conditions compared to pruned reconstruction-plus-detection pipelines.
- The method avoids reconstructing full images at test time by sharing encoder features directly with the detector.
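To put the latency figure in perspective, a relative reduction r in inference time corresponds to a speedup factor of 1 / (1 - r). The snippet below is only a worked arithmetic check. The 100 ms baseline is a hypothetical placeholder, since the page reports no absolute timings.

```python
def speedup_from_reduction(reduction_percent: float) -> float:
    """Convert a percentage reduction in inference time into a speedup factor."""
    remaining = 1.0 - reduction_percent / 100.0
    if remaining <= 0.0:
        raise ValueError("reduction must be below 100 percent")
    return 1.0 / remaining

reported_reduction = 84.06  # percent, as stated in the abstract
speedup = speedup_from_reduction(reported_reduction)

# Hypothetical baseline latency, chosen only to illustrate the arithmetic.
baseline_ms = 100.0
integrated_ms = baseline_ms * (1.0 - reported_reduction / 100.0)

print(f"speedup: {speedup:.2f}x")
print(f"{baseline_ms:.0f} ms baseline -> {integrated_ms:.2f} ms")
```

Under these assumptions the reported reduction corresponds to roughly a 6.27x speedup over the pruned Rec+Det baseline.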
Where Pith is reading between the lines
- The same dual-integration pattern could apply to other wavelength ranges or sensor types where reconstruction latency limits real-time use.
- Further tests with varied lens aberrations would show how far the PALS-Bridge priors generalize beyond the simulation pipeline.
- Hardware prototypes could reveal whether the reported weight savings translate to improved battery life or portability in field deployments.
Load-bearing premise
The physics-informed optical degradation simulation and field-dependent PSF priors in PALS-Bridge accurately represent real single-lens infrared degradations and preserve detection-critical information during feature adaptation.
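As a rough illustration of what such a premise involves, the sketch below degrades a clean one-dimensional signal with a spatially varying blur and additive noise at a target SNR. The Gaussian PSF shape, the linear growth of PSF width from field center to edge, and the function names `gaussian_psf` and `degrade` are assumptions made for this example. The paper's actual pipeline is not described on this page.

```python
import math
import random

def gaussian_psf(sigma, radius=4):
    """Normalized Gaussian taps; a stand-in for a ray-traced or measured PSF."""
    taps = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(taps)
    return [t / total for t in taps]

def degrade(clean, sigma_center=0.6, sigma_edge=2.0, snr_db=10.0, seed=0):
    """Apply field-dependent blur, then add Gaussian noise at the requested SNR."""
    n = len(clean)
    center = (n - 1) / 2.0
    blurred = []
    for i in range(n):
        # Assumed field dependence: PSF width grows linearly toward the edge.
        radius = abs(i - center) / center if center > 0 else 0.0
        psf = gaussian_psf(sigma_center + (sigma_edge - sigma_center) * radius)
        half = len(psf) // 2
        acc = 0.0
        for j, w in enumerate(psf):
            idx = min(max(i + j - half, 0), n - 1)  # replicate border samples
            acc += w * clean[idx]
        blurred.append(acc)
    signal_power = sum(v * v for v in blurred) / n
    noise_std = math.sqrt(signal_power / (10.0 ** (snr_db / 10.0)))
    rng = random.Random(seed)
    return [v + rng.gauss(0.0, noise_std) for v in blurred]

edge = [0.0] * 16 + [1.0] * 16
noisy = degrade(edge, snr_db=10.0)
print(len(noisy))
```

Whether a model of this kind matches a physical lens is exactly what the premise leaves open. The sketch only shows where the PSF prior and the noise level enter the data generation.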
What would settle it
Measure whether PDI-Net maintains its reported mAP gain and latency reduction when run on raw images from an actual single-lens infrared prototype camera under the same low-SNR conditions used in the M3FD tests.
Original abstract
Computational imaging enables compact infrared systems, but deep-learning pipelines that combine image reconstruction and object detection often introduce substantial inference latency. Most existing acceleration strategies compress the reconstruction network while overlooking physical priors from the optical path, leaving a trade-off between accuracy and speed. We present Physics-aware Dual-Integrated Network (PDI-Net), a low-latency framework that integrates infrared reconstruction with object detection and further embeds optical priors into the learning process. PDI-Net uses a supervised U-Net during training, while a semi-U-Net encoder shares features directly with a YOLO-based detector during inference, avoiding full image reconstruction. To bridge the gap between fidelity-oriented reconstruction features and detection-oriented semantics, we introduce a physics-aware large-small bridge (PALS-Bridge), which uses field-dependent point spread function priors to adaptively modulate multiscale convolutional branches. A physics-informed optical degradation simulation pipeline is also developed for training and validation. The method is deployed on a single-lens infrared camera, reducing system weight by about 50% compared with traditional multi-lens designs. On the M3FD benchmark under low-SNR conditions, PDI-Net reduces inference time by 84.06% compared with the Rec+Det with pruning strategy while improving mAP@0.5:0.95 by 5.07%. These results demonstrate compact, low-latency computational infrared imaging for real-time object detection on resource-constrained platforms.
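The training-versus-inference split described in the abstract can be pictured with a toy sketch: a shared encoder produces multiscale features, a decoder contributes a reconstruction loss during training only, and inference hands the encoder features straight to a detection head. Everything below (average pooling as the encoder, nearest-neighbour upsampling as the decoder, a threshold as the detector, and all function names) is a hypothetical stand-in chosen to keep the example runnable, not the PDI-Net architecture.

```python
def avg_pool2(x):
    """Halve the length by averaging neighbouring pairs."""
    return [(x[i] + x[i + 1]) / 2.0 for i in range(0, len(x) - 1, 2)]

def encoder(x, levels=3):
    """Stand-in encoder: features at progressively coarser scales.

    Assumes len(x) is a multiple of 2 ** (levels - 1).
    """
    feats = [list(x)]
    for _ in range(levels - 1):
        feats.append(avg_pool2(feats[-1]))
    return feats

def decoder(feats):
    """Stand-in reconstruction branch, needed for the training loss only."""
    coarse = feats[-1]
    factor = len(feats[0]) // len(coarse)
    return [v for v in coarse for _ in range(factor)]  # nearest-neighbour upsample

def detection_head(feats, threshold=0.5):
    """Stand-in detector: flags a target if any scale responds strongly."""
    return any(max(f) > threshold for f in feats)

def training_losses(degraded, clean, label):
    """Training uses both branches on the same encoder features."""
    feats = encoder(degraded)
    recon = decoder(feats)
    rec_loss = sum((r - c) ** 2 for r, c in zip(recon, clean)) / len(clean)
    det_loss = 0.0 if detection_head(feats) == label else 1.0
    return rec_loss, det_loss

def infer(degraded):
    """Inference never calls the decoder: features go straight to the detector."""
    return detection_head(encoder(degraded))

sample = [0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0]
print(training_losses(sample, sample, True), infer(sample))
```

The point of the design, as the abstract frames it, is that the cost of `decoder` is paid only while training, which is where the latency saving at inference would come from.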
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents PDI-Net, a physics-aware dual-integrated network for low-latency single-lens infrared computational imaging for object detection. It employs a supervised U-Net during training but switches to a semi-U-Net encoder that shares multiscale features directly with a YOLO-based detector at inference, bypassing full image reconstruction. Optical priors are embedded via the PALS-Bridge module, which uses field-dependent PSF priors to modulate large-small convolutional branches, supported by a physics-informed optical degradation simulation pipeline for training and validation. The system is deployed on a single-lens IR camera (claiming ~50% weight reduction vs. multi-lens designs). On the M3FD benchmark under low-SNR conditions, it reports an 84.06% inference-time reduction relative to a pruned Rec+Det baseline while improving mAP@0.5:0.95 by 5.07%.
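For readers unfamiliar with the metric, mAP@0.5:0.95 follows the COCO convention of averaging average precision over ten IoU thresholds from 0.50 to 0.95 in steps of 0.05. The sketch below shows only the IoU computation and the threshold grid, with hypothetical boxes. It does not reproduce a full AP evaluation or any number from the paper.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Ten thresholds: 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [0.50 + 0.05 * k for k in range(10)]

def thresholds_passed(pred, truth):
    """Count how many of the ten IoU thresholds one prediction satisfies."""
    overlap = iou(pred, truth)
    return sum(1 for t in IOU_THRESHOLDS if overlap >= t)

truth_box = (0.0, 0.0, 10.0, 10.0)  # hypothetical ground truth
pred_box = (1.0, 0.0, 11.0, 10.0)   # hypothetical prediction, shifted by one unit
print(round(iou(pred_box, truth_box), 3), thresholds_passed(pred_box, truth_box))
```

Because the stricter thresholds reward tight localization, a gain on this metric is more sensitive to blur-induced box errors than a gain on mAP@0.5 alone.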
Significance. If the simulation pipeline and PALS-Bridge successfully preserve detection-critical high-frequency cues under realistic single-lens IR degradations, the work would offer a practical route to compact, real-time computational IR systems on resource-constrained platforms. The explicit integration of optical priors into the feature-adaptation stage and the training-to-inference architectural split are strengths that could reduce latency without sacrificing accuracy in reconstruction-detection pipelines.
major comments (1)
- Abstract and Deployment Statement: The central quantitative claims (84.06% latency reduction and 5.07% mAP@0.5:0.95 gain on M3FD low-SNR) rest on the premise that the physics-informed optical degradation simulation pipeline plus field-dependent PSF priors in PALS-Bridge produce detection-ready features without full reconstruction. The text provides no evidence that real captured PSFs, image pairs, or hardware measurements from the deployed single-lens camera were used to calibrate or validate the simulation; if the modeled aberrations deviate from physical optics, the multiscale modulation may discard cues that the reported mAP improvement assumes are retained.
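One way the calibration gap raised in this comment could be probed is to compare a simulated PSF against a measured one with simple shape statistics. The sketch below uses cosine similarity and a ratio of RMS widths on one-dimensional profiles. Both PSFs here are synthetic Gaussians standing in for real data, and the function names and the choice of statistics are assumptions for illustration rather than anything the paper or the referee specifies.

```python
import math

def normalize(psf):
    """Scale a non-negative PSF profile so that it sums to one."""
    total = sum(psf)
    return [v / total for v in psf]

def rms_width(psf):
    """Standard deviation of the normalized profile, in sample units."""
    p = normalize(psf)
    mean = sum(i * v for i, v in enumerate(p))
    return math.sqrt(sum(((i - mean) ** 2) * v for i, v in enumerate(p)))

def cosine_similarity(a, b):
    """Shape agreement between two profiles of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def psf_mismatch(simulated, measured):
    """Summarize how far a measured PSF departs from the simulated prior."""
    return {
        "cosine": cosine_similarity(simulated, measured),
        "width_ratio": rms_width(measured) / rms_width(simulated),
    }

def gaussian_profile(sigma, radius=6):
    return [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]

simulated_psf = gaussian_profile(1.0)  # placeholder for the design-based prior
measured_psf = gaussian_profile(1.4)   # placeholder for a hardware measurement
report = psf_mismatch(simulated_psf, measured_psf)
print(report)
```

A width ratio well above one would indicate that the physical lens blurs more than the prior assumes, which is the scenario in which the modulation could discard detection-relevant detail.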
Simulated Author's Rebuttal
We thank the referee for their thorough review and constructive feedback on our manuscript. We address the major comment point by point below, providing clarifications and committing to revisions where appropriate to strengthen the presentation of our simulation pipeline and deployment claims.
Referee: Abstract and Deployment Statement: The central quantitative claims (84.06% latency reduction and 5.07% mAP@0.5:0.95 gain on M3FD low-SNR) rest on the premise that the physics-informed optical degradation simulation pipeline plus field-dependent PSF priors in PALS-Bridge produce detection-ready features without full reconstruction. The text provides no evidence that real captured PSFs, image pairs, or hardware measurements from the deployed single-lens camera were used to calibrate or validate the simulation; if the modeled aberrations deviate from physical optics, the multiscale modulation may discard cues that the reported mAP improvement assumes are retained.
Authors: We appreciate the referee highlighting this important clarification needed in our presentation. The physics-informed optical degradation simulation pipeline relies on modeled field-dependent PSF priors generated from the optical design parameters of the single-lens infrared camera (using standard ray-tracing and diffraction models), rather than direct calibration against real captured PSFs or paired hardware measurements. This simulation-based approach is employed because obtaining large-scale, precisely registered real IR image pairs under varying low-SNR conditions with the exact single-lens setup is practically challenging and not always feasible for training deep networks. The PALS-Bridge uses these priors to adaptively modulate multiscale features, and all reported mAP and latency results are obtained by applying the simulated degradations to the M3FD benchmark for controlled, reproducible evaluation.
The deployment claim in the abstract refers specifically to the physical single-lens camera hardware achieving the ~50% weight reduction, which was verified through system integration and weight measurements, independent of the end-to-end detection metrics. We agree that the manuscript should more explicitly distinguish between simulation for algorithmic validation and hardware for system-level benefits to avoid any implication of direct real-PSF calibration.
In the revised manuscript, we will update the abstract, add a new subsection detailing the PSF prior generation process (including optical parameters used), and explicitly state the simulation-based nature of the performance evaluation. This revision will directly address concerns about potential deviation from physical optics and the retention of detection-critical cues.
Revision: yes
Circularity Check
No significant circularity detected
The PDI-Net framework integrates a supervised U-Net for training with a semi-U-Net encoder that shares features directly with a YOLO-based detector at inference. It is augmented by the PALS-Bridge module, which applies field-dependent PSF priors to modulate multiscale branches, and by a separate physics-informed optical degradation simulation pipeline. These components rely on standard, externally established architectures (U-Net, YOLO) and on optical priors motivated outside the present work rather than on any internal fitting or self-referential definition. Reported gains (84.06% latency reduction and 5.07% mAP improvement on M3FD low-SNR) are framed as empirical deployment results on a single-lens camera, not as quantities forced by construction from the equations or prior self-citations. The derivation chain therefore remains self-contained, with independent content from the network topology and physics priors.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Field-dependent point spread functions from the single-lens optical path can be used to adaptively modulate multiscale features and bridge reconstruction-oriented and detection-oriented representations.
- domain assumption: A physics-informed optical degradation simulation pipeline produces training data sufficiently representative of real single-lens infrared camera behavior.
invented entities (3)
- PDI-Net: no independent evidence
- PALS-Bridge: no independent evidence
- physics-informed optical degradation simulation pipeline: no independent evidence
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: "PALS-Bridge … uses field-dependent point spread function priors to adaptively modulate multiscale convolutional branches … physics-informed optical degradation simulation pipeline"
- IndisputableMonolith/Foundation/AlexanderDuality.lean: alexander_duality_circle_linking
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: "On the M3FD benchmark … reduces inference time by 84.06% … improving mAP@0.5:0.95 by 5.07%"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.