
arxiv: 2606.02603 · v1 · pith:2VLL7DIJnew · submitted 2026-05-23 · 💻 cs.CV · cs.LG

COD10K-C: Benchmarking Robustness of Camouflaged Object Detection Under Natural Image Corruptions

Pith reviewed 2026-06-30 13:37 UTC · model grok-4.3

classification 💻 cs.CV cs.LG
keywords camouflaged object detection · robustness benchmark · image corruptions · COD10K-C · RobustCODLite · Dice score · motion blur

The pith

Camouflaged object detectors lose accuracy under image corruptions, but RobustCODLite retains 92.3 percent of its clean Dice score.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes COD10K-C, a benchmark extending the COD10K dataset with eight corruption types at five severity levels each to create forty test conditions and 81,040 evaluation pairs. It measures performance drops in three existing models and introduces RobustCODLite, a lightweight detector trained with corruption augmentation, a frequency-prior branch, and an uncertainty-consistency loss. RobustCODLite retains 92.3 percent of its clean Dice score under corruption while SINet-v2, ZoomNet, and PFNet retain 87.7, 84.8, and 84.1 percent respectively. This matters because real camera feeds contain blur, noise, weather, and compression effects absent from clean benchmarks, so robustness directly affects practical reliability.

Core claim

COD10K-C supplies 81,040 evaluation pairs across eight corruption types and five severity levels. All tested models exhibit clear performance drops, with motion blur and Gaussian blur producing the largest losses. RobustCODLite retains 92.3 percent of its clean Dice score under corruption, compared with 87.7 percent for SINet-v2, 84.8 percent for ZoomNet, and 84.1 percent for PFNet. On the hardest corruptions, RobustCODLite matches or outperforms models that score higher on clean data.
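The retention figures above are ratios of corrupted to clean Dice. As a minimal sketch, assuming retention means the mean Dice over corrupted conditions divided by the clean Dice (the summary here does not give the paper's exact formula), the computation could look like the following. The function names `dice_score` and `retention` are illustrative, not taken from the paper.

```python
import numpy as np

def dice_score(pred, gt, eps=1e-8):
    """Dice coefficient between two binary masks (0/1 or bool arrays)."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    intersection = np.logical_and(pred, gt).sum()
    # eps keeps the ratio defined when both masks are empty.
    return (2.0 * intersection + eps) / (pred.sum() + gt.sum() + eps)

def retention(clean_dice, corrupted_dices):
    """Assumed definition: mean corrupted Dice as a fraction of clean Dice."""
    return float(np.mean(corrupted_dices)) / clean_dice

# Toy masks: one overlapping pixel, mask sizes 2 and 1, so Dice = 2*1 / (2+1).
pred = [[1, 1], [0, 0]]
gt = [[1, 0], [0, 0]]
toy_dice = dice_score(pred, gt)

# Toy retention: clean Dice 0.80, two corrupted conditions at 0.72 and 0.76.
toy_retention = retention(0.80, [0.72, 0.76])
```

Under this definition a model can have a lower clean Dice and still a higher retention, which is why the referee's request for absolute clean scores matters when reading the percentages.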

What carries the argument

RobustCODLite, which incorporates corruption augmentation during training, a frequency-prior branch, and an uncertainty-consistency loss.
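The paper summary gives no equation for the uncertainty-consistency loss, so the following is only one plausible reading, not the authors' formulation: penalize disagreement between predictions on a clean image and its corrupted copy, and down-weight pixels where the clean prediction is itself uncertain. The entropy-based weighting and the name `uncertainty_consistency_loss` are assumptions made for illustration.

```python
import numpy as np

def binary_entropy(p, eps=1e-7):
    """Per-pixel entropy (in nats) of a foreground probability map."""
    p = np.clip(p, eps, 1.0 - eps)
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))

def uncertainty_consistency_loss(p_clean, p_corrupt):
    """Hypothetical loss: confidence-weighted squared disagreement.

    p_clean and p_corrupt are foreground probabilities in [0, 1] predicted
    for the same image before and after corruption.
    """
    p_clean = np.asarray(p_clean, dtype=float)
    p_corrupt = np.asarray(p_corrupt, dtype=float)
    # Confidence is 1 for a certain prediction and 0 at p = 0.5.
    confidence = 1.0 - binary_entropy(p_clean) / np.log(2.0)
    return float(np.mean(confidence * (p_clean - p_corrupt) ** 2))
```

In this sketch, identical predictions give zero loss, and a clean prediction of exactly 0.5 contributes nothing regardless of what the corrupted prediction says.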

If this is right

  • Motion blur and Gaussian blur produce the largest performance drops across models.
  • Brightness and fog corruptions cause smaller accuracy losses than blur or noise.
  • Training with corruption augmentation improves retention of Dice score on corrupted test images.
  • On the most severe corruptions, the lightweight RobustCODLite can equal or exceed heavier models that lead on clean data.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar augmentation and frequency-prior techniques could transfer to other detection or segmentation tasks to improve tolerance to camera artifacts.
  • Future robustness benchmarks for object detection should include corruption suites as standard practice rather than optional add-ons.
  • The frequency-prior branch may offer a general way to emphasize structural cues that survive certain corruptions better than spatial features alone.
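To make the frequency-prior idea concrete, here is a minimal sketch of one way to extract a high-frequency map from a grayscale image with a Fourier high-pass filter. This is an assumption about what such a prior might compute; the paper summary does not describe the branch's actual design, and `high_frequency_map` and its `cutoff` parameter are hypothetical.

```python
import numpy as np

def high_frequency_map(gray, cutoff=0.1):
    """Magnitude of the image content above `cutoff` cycles per pixel."""
    gray = np.asarray(gray, dtype=np.float64)
    h, w = gray.shape
    # Centered 2D spectrum and the matching frequency grids.
    spectrum = np.fft.fftshift(np.fft.fft2(gray))
    fy = np.fft.fftshift(np.fft.fftfreq(h))[:, None]
    fx = np.fft.fftshift(np.fft.fftfreq(w))[None, :]
    # Keep only components whose radial frequency exceeds the cutoff.
    keep = np.sqrt(fx ** 2 + fy ** 2) > cutoff
    filtered = np.fft.ifft2(np.fft.ifftshift(spectrum * keep))
    return np.abs(filtered)
```

A flat image maps to zeros because only its DC component is nonzero, while a pixel-level checkerboard passes through unchanged in magnitude. Blur suppresses exactly this high-frequency band, which is consistent with blur being the most damaging corruption in the reported results.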

Load-bearing premise

The eight corruption types and five severity levels in COD10K-C are representative of the natural image corruptions that occur in real-world camouflaged object detection scenarios.

What would settle it

If a model trained only on clean images achieves Dice-score retention rates on COD10K-C comparable to RobustCODLite, the contribution of the proposed augmentation, frequency branch, and consistency loss would be called into question.

Figures

Figures reproduced from arXiv: 2606.02603 by Arafat Hossain Sayem.

Figure 1: Qualitative predictions of RobustCODLite on COD10K test images. Each row shows (left to right): input image, …

Original abstract

Camouflaged object detection has improved substantially, but most standard benchmarks evaluate models only on clean images. This is not realistic because real cameras often capture blur, sensor noise, weather effects, and compression artifacts. We present COD10K-C, a corruption robustness benchmark based on COD10K. It includes 8 corruption types and 5 severity levels, giving 40 conditions and 81,040 evaluation pairs in total. We evaluate three popular camouflaged object detection models, SINet-v2, PFNet, and ZoomNet, as well as a lightweight model called RobustCODLite. All models show clear performance drops on corrupted images. Motion blur and Gaussian blur cause the largest drops, with SINet-v2 losing 18.5 Dice points under motion blur. Brightness and fog are less harmful. RobustCODLite uses corruption augmentation, a frequency-prior branch, and an uncertainty-consistency loss. It retains 92.3% of its clean Dice score under corruption, compared with 87.7% for SINet-v2, 84.8% for ZoomNet, and 84.1% for PFNet. On the hardest corruptions, RobustCODLite matches or outperforms models that perform better on clean data. We will release the COD10K-C GitHub repository to support future research in robust camouflaged object detection.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces COD10K-C, a corruption robustness benchmark derived from COD10K by applying 8 corruption types (including motion blur, Gaussian blur, brightness, and fog) at 5 severity levels each, yielding 40 conditions and 81,040 image pairs. It evaluates three existing camouflaged object detection models (SINet-v2, PFNet, ZoomNet) and proposes RobustCODLite, which incorporates corruption augmentation during training, a frequency-prior branch, and an uncertainty-consistency loss. All models degrade under corruption, with blur types causing the largest drops (e.g., SINet-v2 loses 18.5 Dice points under motion blur). RobustCODLite retains 92.3% of its clean Dice score on average, higher than SINet-v2 (87.7%), ZoomNet (84.8%), and PFNet (84.1%), and matches or exceeds models with stronger clean-data performance on the hardest corruptions.

Significance. If the benchmark construction and evaluation protocol are sound, the work fills a clear gap in camouflaged object detection by quantifying robustness to realistic camera artifacts and environmental effects. The empirical retention metrics and the proposed RobustCODLite provide a concrete baseline and a practical starting point for future robust COD research. Releasing the GitHub repository supports reproducibility and community follow-up. The direct measurement approach (no circular derivations) strengthens the contribution relative to purely theoretical robustness claims.

major comments (2)
  1. [Methods] Methods section: The manuscript provides no details on the exact implementation of the 8 corruptions (e.g., parameters, libraries such as imagecorruptions or custom code, or how severity levels map to pixel-level changes). This is load-bearing for the central benchmark claim, as it prevents verification that the reported Dice drops (e.g., 18.5 points for SINet-v2 under motion blur) arise from the intended natural corruptions rather than implementation artifacts.
  2. [RobustCODLite and Experiments] RobustCODLite description and results: The frequency-prior branch and uncertainty-consistency loss are introduced without equations, architectural diagrams, or hyperparameter values. Without these, it is impossible to assess whether the reported 92.3% retention advantage is attributable to the claimed components or to other unstated training choices, undermining the comparison to SINet-v2, ZoomNet, and PFNet.
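The severity-to-parameter mapping requested in the first major comment could be documented in a form like the sketch below. It uses additive Gaussian noise purely as an example; whether COD10K-C includes this corruption, and with which parameters, is not stated in the material reviewed here. The sigma values in `NOISE_SIGMA` are hypothetical placeholders, not the benchmark's settings.

```python
import numpy as np

# Hypothetical severity table: noise standard deviation on a [0, 1] scale.
NOISE_SIGMA = {1: 0.04, 2: 0.08, 3: 0.12, 4: 0.18, 5: 0.26}

def gaussian_noise(image, severity, seed=0):
    """Add zero-mean Gaussian noise to a uint8 image at severity 1..5."""
    if severity not in NOISE_SIGMA:
        raise ValueError("severity must be an integer from 1 to 5")
    rng = np.random.default_rng(seed)  # fixed seed keeps the benchmark reproducible
    x = np.asarray(image, dtype=np.float64) / 255.0
    noisy = x + rng.normal(0.0, NOISE_SIGMA[severity], size=x.shape)
    return (np.clip(noisy, 0.0, 1.0) * 255.0).round().astype(np.uint8)

# Mean absolute pixel change on a mid-gray image, one value per severity level.
gray = np.full((32, 32, 3), 128, dtype=np.uint8)
mean_change = [
    np.abs(gaussian_noise(gray, s).astype(int) - gray.astype(int)).mean()
    for s in range(1, 6)
]
```

Publishing a table like `NOISE_SIGMA` for each of the eight corruptions, together with the random seed policy, would let readers regenerate the corrupted test images exactly.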
minor comments (2)
  1. [Abstract and Results] The abstract and results text report relative retention percentages but omit the absolute clean Dice scores for each model; adding these values would allow readers to judge whether the robustness gains come at the cost of clean performance.
  2. [Results] Table or figure presenting per-corruption results should explicitly state the number of images per condition to confirm the 81,040 total pairs.
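The count requested in the second minor comment can be sanity-checked from the totals already reported. The short sketch below assumes every condition is applied to the same set of test images, which the summary implies but does not state.

```python
NUM_CORRUPTIONS = 8
NUM_SEVERITIES = 5
TOTAL_PAIRS = 81_040

conditions = NUM_CORRUPTIONS * NUM_SEVERITIES
images_per_condition, remainder = divmod(TOTAL_PAIRS, conditions)
print(conditions, images_per_condition, remainder)  # 40 2026 0
```

The total divides evenly into 2,026 images per condition. That figure matches the 2,026 camouflaged test images commonly reported for COD10K, although the split used is not confirmed in the text reviewed here.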

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We address each major comment below and will revise the manuscript to incorporate the requested details.

  1. Referee: [Methods] Methods section: The manuscript provides no details on the exact implementation of the 8 corruptions (e.g., parameters, libraries such as imagecorruptions or custom code, or how severity levels map to pixel-level changes). This is load-bearing for the central benchmark claim, as it prevents verification that the reported Dice drops (e.g., 18.5 points for SINet-v2 under motion blur) arise from the intended natural corruptions rather than implementation artifacts.

    Authors: We agree that the absence of implementation details for the corruptions limits reproducibility. In the revised manuscript we will add a dedicated subsection describing the exact libraries (including whether imagecorruptions or custom code was used), all parameter settings, and the precise mapping from severity level to pixel-level transformations for each of the eight corruption types. revision: yes

  2. Referee: [RobustCODLite and Experiments] RobustCODLite description and results: The frequency-prior branch and uncertainty-consistency loss are introduced without equations, architectural diagrams, or hyperparameter values. Without these, it is impossible to assess whether the reported 92.3% retention advantage is attributable to the claimed components or to other unstated training choices, undermining the comparison to SINet-v2, ZoomNet, and PFNet.

    Authors: We acknowledge that the current description of RobustCODLite is insufficiently detailed. The revised manuscript will include the mathematical formulations of the frequency-prior branch and uncertainty-consistency loss, an architectural diagram, and the complete set of hyperparameter values employed during training so that the contribution of each component can be evaluated. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected


The paper is an empirical benchmarking study that constructs COD10K-C from existing COD10K images by applying 8 corruption types at 5 severity levels and reports direct Dice-score measurements for baseline models and RobustCODLite. No derivation chain, equations, first-principles predictions, or fitted parameters are present; the central claim (relative retention of clean performance under corruption) is a straightforward empirical comparison on the new test set. The techniques used for RobustCODLite (augmentation, frequency branch, consistency loss) are described as standard and do not rely on self-citation chains or self-definitional reductions. This is the expected non-finding for a pure evaluation paper.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the assumption that the selected corruptions adequately represent real-world conditions for COD and that the new model components provide the observed robustness gains; no free parameters or invented physical entities are introduced.

axioms (1)
  • domain assumption: Standard image corruption models apply directly to camouflaged object images without domain-specific adjustments.
    The benchmark applies 8 corruption types and 5 severity levels to COD10K without additional justification for their suitability to camouflaged scenes.

