SeamCam: Quantifying Seamless Camouflage via Multi-Cue Visual Detectability

Abolfazl Meyarian; Amin Karimi Monsefi; Anuj Karpatne; Cheng Zhang; Mridul Khurana; Pouyan Navard; Rajiv Ramnath; Shuheng Wang; Wei-Lun Chao

arxiv: 2605.16515 · v1 · pith:ZN6DH75Pnew · submitted 2026-05-15 · 💻 cs.CV · cs.LG

Pith reviewed 2026-05-20 18:22 UTC · model grok-4.3

keywords: seamless camouflage · camouflage quantification · visual detectability · object detection proposals · segmentation masks · diffusion model optimization · human judgment agreement

The pith

SeamCam quantifies seamless camouflage as one minus the highest IoU recoverable from category-conditioned detection proposals and their segmentation masks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes a quantitative metric for seamless camouflage by treating it as a localization task where the animal must remain hard to find even when its category is known. SeamCam generates detection proposals tuned to the target species, extracts their segmentation masks, and computes the union that overlaps the ground-truth mask most strongly. The score is defined as one minus this maximum overlap, so higher values indicate better blending. A sympathetic reader would care because existing evaluations of camouflage rely on subjective ratings or uncontrolled images, making it difficult to compare methods or train generators systematically. The approach was tested against human choices and applied to optimize a diffusion model for creating camouflage.
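The scoring step described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the proposal masks have already been produced upstream, represents each mask as a set of pixel coordinates, and uses the hypothetical names `iou` and `seamcam_score`. The convention that an image with no proposals scores 1.0 is also an assumption.

```python
from itertools import combinations

def iou(a, b):
    """Intersection-over-union of two pixel sets; 0.0 if both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def seamcam_score(proposal_masks, gt_mask):
    """Return 1 - max IoU over all non-empty subsets of proposal masks.

    proposal_masks: list of sets of (row, col) pixels (assumed given).
    gt_mask: set of (row, col) pixels for the ground-truth animal mask.
    """
    best = 0.0  # assumption: no proposals means nothing was localized
    for k in range(1, len(proposal_masks) + 1):
        for subset in combinations(proposal_masks, k):
            merged = frozenset().union(*subset)  # union of the subset's masks
            best = max(best, iou(merged, gt_mask))
    return 1.0 - best

# Toy example: two partial proposals together recover most of the animal.
gt = {(0, 0), (0, 1), (1, 0), (1, 1)}
proposals = [{(0, 0), (0, 1)}, {(1, 0), (1, 1), (2, 0)}, {(5, 5)}]
score = seamcam_score(proposals, gt)
```

The exhaustive search visits 2^K − 1 subsets for K proposals, so a sketch like this is only practical when the proposal pool is small.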

Core claim

The central claim is that seamless camouflage strength equals one minus the maximum intersection-over-union between the true animal mask and the union of segmentation masks obtained from a pool of category-conditioned object detection proposals. This produces a score that aligned with human judgments of detection difficulty in 78.82 percent of 2,390 two-alternative forced-choice trials involving 94 participants, exceeding prior methods by roughly 25 percent. The same score is then used as a preference signal for direct preference optimization to fine-tune a diffusion-based inpainting model, and the work introduces the CamFG-1.5k dataset of 1,521 fully visible animal images to enable clean evaluation that controls for occlusion artifacts.
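To make the agreement figure concrete, here is one way a two-alternative forced-choice agreement rate could be computed. It is a sketch under stated assumptions: the trial layout, the function name `agreement_rate`, and the decision to skip ties are all hypothetical, and the toy trials are invented for illustration, not taken from the study.

```python
def agreement_rate(trials):
    """Fraction of trials where the metric picks the same image as the human.

    trials: iterable of (score_a, score_b, human_choice), where human_choice
    is 'a' or 'b' for the image the participant judged harder to find.
    Ties in the metric are skipped (an assumption, not the paper's rule).
    """
    agree = 0
    total = 0
    for score_a, score_b, human_choice in trials:
        if score_a == score_b:
            continue  # metric expresses no preference on this pair
        metric_choice = 'a' if score_a > score_b else 'b'
        agree += metric_choice == human_choice
        total += 1
    if total == 0:
        raise ValueError("no scorable trials")
    return agree / total

# Invented toy trials: the metric matches the human in 3 of 4 comparisons.
toy_trials = [(0.9, 0.2, 'a'), (0.1, 0.6, 'b'), (0.7, 0.3, 'b'), (0.4, 0.8, 'b')]
rate = agreement_rate(toy_trials)
```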

What carries the argument

SeamCam, the metric computed as one minus the maximum IoU between the ground-truth mask and the union of segmentation masks from category-conditioned detection proposals.
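Written out, the definition might look as follows. The symbol D for detectability follows the Figure 3 caption; the remaining notation is assumed for illustration: M* for the ground-truth mask, P = {m_1, ..., m_K} for the masks retained after gating, and D = 0 when P is empty.

```latex
\[
\mathrm{IoU}(A, B) = \frac{|A \cap B|}{|A \cup B|},
\qquad
D = \max_{\emptyset \neq S \subseteq \mathcal{P}}
    \mathrm{IoU}\Bigl(\,\bigcup_{m \in S} m,\; M^{*}\Bigr),
\qquad
\mathrm{SeamCam} = 1 - D .
\]
```

Since IoU lies in [0, 1], so does the score, with 1 meaning no subset of proposals overlaps the animal at all.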

If this is right

Camouflage can be compared and ranked using an automated procedure instead of repeated human studies.
The score supplies an explicit objective for training generative models to produce stronger camouflage via preference optimization.
Benchmarking becomes possible on datasets that control for occlusion by starting from fully visible animals.
New camouflage generation techniques can be evaluated reproducibly against the same localization-based criterion.
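As an illustration of the second point, a metric score can be turned into preference pairs for DPO by ranking several generated candidates for the same input. The rule below (highest score preferred, lowest score rejected, pairs with a small gap discarded) is an assumed selection scheme, not the procedure from the paper; `select_preference_pair` and `min_gap` are hypothetical names.

```python
def select_preference_pair(candidates, min_gap=0.1):
    """Pick a (chosen_id, rejected_id) pair from scored candidates.

    candidates: list of (sample_id, seamcam_score) for one input image.
    min_gap: hypothetical threshold; pairs whose scores differ by less
    are dropped as uninformative. Returns None if no pair qualifies.
    """
    if len(candidates) < 2:
        return None
    ranked = sorted(candidates, key=lambda c: c[1])
    rejected, chosen = ranked[0], ranked[-1]  # weakest vs. strongest camouflage
    if chosen[1] - rejected[1] < min_gap:
        return None
    return chosen[0], rejected[0]

# Invented scores for three generations of the same scene.
pair = select_preference_pair([("g1", 0.35), ("g2", 0.80), ("g3", 0.55)])
```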

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same localization-shortfall idea could be tested on non-animal concealment tasks such as hiding objects in cluttered scenes.
Incorporating motion or multi-frame information might strengthen alignment with how humans actually search for camouflaged targets.
Running the metric across multiple detector families would show how much the scores depend on the underlying proposal mechanism.
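One simple way to probe the detector-dependence question would be to score the same images with two detector families and compare the rankings. The sketch below computes a Spearman rank correlation in plain Python; the function names and the toy scores are assumptions made for illustration.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of positions i..j, shifted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences of length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / (vx * vy) ** 0.5

# Invented scores for four images under two hypothetical detectors.
rho = spearman([0.9, 0.4, 0.7, 0.2], [0.8, 0.5, 0.6, 0.1])
```

A value near 1 would suggest the ranking of images is stable across detectors even if the absolute scores shift.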

Load-bearing premise

The maximum IoU recoverable from a finite set of category-conditioned detection proposals and masks acts as a faithful proxy for human visual detectability of seamless camouflage.

What would settle it

The claim that the metric tracks human perception would be falsified if, under a different detector architecture or proposal ranking, human observers rated a set of images as easy to detect while those images received high SeamCam scores (or the reverse).

Figures

Figures reproduced from arXiv: 2605.16515 by Abolfazl Meyarian, Amin Karimi Monsefi, Anuj Karpatne, Cheng Zhang, Mridul Khurana, Pouyan Navard, Rajiv Ramnath, Shuheng Wang, Wei-Lun Chao.

**Figure 1.** SeamCam vs. CamOT. SeamCam produces consistent and accurate camouflage difficulty scores across diverse scenarios, whereas CamOT exhibits notable inconsistencies. As illustrated by the polar bear (where superficial color and lighting similarity between subject and background causes CamOT to erroneously overestimate camouflage effectiveness despite the subject being plainly visible), CamOT assigns a disp…

**Figure 2.** Quality comparison between CamFG-1.5K vs. existing datasets. Existing datasets often include subjects that are cropped, or partially camouflaged, leading to biased evaluation of camouflage models. In contrast, CamFG-1.5K features clearly visible animals with minimal obstructions, enabling unbiased model assessment. Camouflage is fundamentally a problem of perception rather than appearance alone. Biologic…

**Figure 3.** Overview of SeamCam framework. Given an image and species name, we generate category-conditioned detection proposals via GroundingDINO, apply semantic and confidence gating, and obtain segmentation masks from SAM-2. We then evaluate all proposal subsets, computing IoU between each subset’s mask union and the ground truth. The maximum achievable IoU defines detectability D; the camouflage score is 1 − D. …

**Figure 4.** SeamCam-based sample selection for Direct Preference Optimiza…

**Figure 5.** Per-species accuracy comparison between SeamCam vs. CamOT

**Figure 6.** Camouflage image generation using SeamCam-based DPO

Abstract

Animals are described as effectively camouflaged when they blend seamlessly with their surroundings, yet no standardized quantitative measure of this seamlessness exists. We address this gap by framing camouflage evaluation as a visual localization problem: a well-camouflaged animal is one that remains difficult to detect even when its category is known. We introduce SeamCam (Seamless Camouflage), a metric that quantifies how detectable an animal is from the available visual evidence. Given an image and a target species, SeamCam generates category-conditioned detection proposals, extracts segmentation masks, and identifies the subset whose collective union yields the highest IoU with the ground-truth mask. The SeamCam score is one minus this maximum recoverable localization signal, where a higher score indicates stronger camouflage (i.e., lower detectability). In a human two-alternative forced-choice study with 94 participants and 2,390 comparisons, SeamCam achieves 78.82% agreement with human camouflage difficulty judgments, outperforming the state of the art by about 25%. We then demonstrate SeamCam's utility as a preference signal for Direct Preference Optimization (DPO) to fine-tune a diffusion-based inpainting model for camouflage generation. This offers an affordable training approach with an objective explicitly suited for camouflage generation, unlike typical diffusion models. To support rigorous benchmarking, we further introduce CamFG-1.5k, a curated dataset of 1,521 high-resolution images in which animals are fully visible prior to camouflage generation, enabling unbiased evaluation by controlling for occlusion artifacts present in existing datasets. https://7amin.github.io/SeamCam/

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

SeamCam gives a localization-based score for camouflage detectability that tracks human judgments at 79 percent, but the metric's dependence on a specific detector's proposals is the part that still needs checking.

SeamCam turns the problem of measuring seamless camouflage into a localization task. Given an image and a target species, it runs category-conditioned detection proposals, pulls segmentation masks from them, finds the union that maximizes IoU with the ground-truth mask, and reports one minus that value. Higher scores mean the animal is harder to pin down. The paper shows this score agrees with human two-alternative forced-choice judgments 78.82 percent of the time across 94 participants and 2,390 comparisons, beating earlier metrics by about 25 percent. They also feed the score into Direct Preference Optimization to fine-tune a diffusion inpainting model and release CamFG-1.5k, a set of 1,521 high-resolution images where animals begin fully visible before any camouflage is added.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces SeamCam, a metric that quantifies seamless camouflage by framing detection difficulty as a localization task: given an image and target category, it generates category-conditioned detection proposals, extracts their segmentation masks, and computes the maximum IoU recoverable by their union against a provided ground-truth mask; the SeamCam score is then defined as one minus this maximum IoU. The central empirical result is a two-alternative forced-choice human study (94 participants, 2,390 comparisons) reporting 78.82% agreement between SeamCam scores and human camouflage-difficulty judgments, outperforming prior methods by roughly 25%. The authors also release the CamFG-1.5k dataset of 1,521 high-resolution images and demonstrate use of the metric as a preference signal for Direct Preference Optimization (DPO) of a diffusion inpainting model.

Significance. If the max-IoU proxy is shown to be robust rather than detector-specific, SeamCam would supply the first standardized, quantitative measure of seamless camouflage, directly addressing a long-standing gap in perceptual evaluation. The human study supplies direct empirical support for the correlation claim, the CamFG-1.5k dataset removes occlusion confounds present in prior collections, and the DPO application illustrates a concrete downstream use case. These contributions would be of clear interest to the computer-vision community working on camouflage, perceptual metrics, and generative modeling.

major comments (2)

[Human Evaluation and Results] The headline correlation (78.82% human agreement) rests on the assumption that the highest IoU recoverable from a finite set of category-conditioned detection proposals is a faithful proxy for human visual detectability. No ablation is reported that replaces the proposal generator or alters its conditioning mechanism, leaving open the possibility that the reported agreement is an artifact of the particular detector architecture and training distribution rather than a general property of visual seamlessness.
[Human Evaluation and Results] The abstract and results section report the 78.82% agreement figure without accompanying error bars, confidence intervals, or statistical significance tests against the state-of-the-art baselines. Because the central claim is that SeamCam outperforms prior metrics by approximately 25%, the absence of these controls makes it impossible to assess whether the improvement is reliable or could be explained by variance in the human study.

minor comments (2)

[Dataset] The description of the CamFG-1.5k curation process should include explicit statistics on image resolution distribution, species diversity, and how the 'fully visible prior to camouflage' criterion was enforced to allow readers to judge potential selection bias.
[Method] Notation for the union operation over masks and the precise definition of the 'maximum recoverable IoU' should be formalized with an equation in the method section to eliminate ambiguity when readers attempt to re-implement the metric.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful comments and the positive evaluation of our work. We address each of the major comments point by point below.

Referee: The headline correlation (78.82% human agreement) rests on the assumption that the highest IoU recoverable from a finite set of category-conditioned detection proposals is a faithful proxy for human visual detectability. No ablation is reported that replaces the proposal generator or alters its conditioning mechanism, leaving open the possibility that the reported agreement is an artifact of the particular detector architecture and training distribution rather than a general property of visual seamlessness.

Authors: We agree that demonstrating robustness across detectors would strengthen the claim. The SeamCam metric is intended to use modern category-conditioned localization as a proxy for detectability, and our choice reflects current SOTA performance. In the revision we will add an ablation replacing the proposal generator with an alternative architecture and conditioning scheme, reporting the resulting human agreement to show the correlation is not detector-specific. revision: yes

Referee: The abstract and results section report the 78.82% agreement figure without accompanying error bars, confidence intervals, or statistical significance tests against the state-of-the-art baselines. Because the central claim is that SeamCam outperforms prior metrics by approximately 25%, the absence of these controls makes it impossible to assess whether the improvement is reliable or could be explained by variance in the human study.

Authors: We acknowledge that statistical controls are necessary to substantiate the reported improvement. We will revise the results section and abstract to include bootstrap-derived error bars, 95% confidence intervals on the agreement rate, and paired statistical tests (e.g., McNemar or binomial) against each baseline to confirm the ~25% gain is statistically reliable rather than attributable to sampling variance in the 2,390 comparisons. revision: yes
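The statistical controls named in this response can be illustrated with the standard library alone. The sketch below shows a percentile bootstrap interval for an agreement rate and an exact two-sided McNemar test for comparing two metrics on the same trials. Function names, the seed, and the toy data are assumptions, and resampling individual trials ignores the fact that each participant contributes several comparisons.

```python
import random
from math import comb

def bootstrap_ci(outcomes, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean of 0/1 agreement outcomes."""
    rng = random.Random(seed)
    n = len(outcomes)
    means = sorted(sum(rng.choices(outcomes, k=n)) / n for _ in range(n_boot))
    lo = means[int((alpha / 2) * n_boot)]
    hi = means[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

def mcnemar_exact(b, c):
    """Exact two-sided McNemar p-value from the discordant counts.

    b: trials where only metric A agrees with the human choice.
    c: trials where only metric B agrees with the human choice.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant trials, no evidence of a difference
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Invented data: 80 agreements out of 100 trials, and 10-vs-0 discordant pairs.
lo, hi = bootstrap_ci([1] * 80 + [0] * 20)
p_value = mcnemar_exact(10, 0)
```

A participant-level (clustered) bootstrap would be the more conservative choice if per-participant trial assignments are available.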

Circularity Check

0 steps flagged

No significant circularity in SeamCam's metric definition or human validation

The paper explicitly constructs SeamCam as one minus the maximum IoU recoverable from the union of category-conditioned detection proposals and their segmentation masks against a provided ground-truth mask; this is a deliberate proxy definition for detectability rather than a derived claim that reduces to its own inputs. The 78.82% agreement result is obtained from a separate human two-alternative forced-choice study (94 participants, 2,390 comparisons) that serves as external benchmarking, not from fitting or self-referential computation on the same data. The DPO demonstration treats the metric as an independent preference signal for fine-tuning a diffusion inpainting model without any parameter fitting to the metric's outputs or self-citation chains. No self-definitional loops, fitted inputs renamed as predictions, uniqueness theorems imported from prior author work, or ansatzes smuggled via citation appear in the provided derivation chain, leaving the approach self-contained against the human study benchmark.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The metric rests on the assumption that detection proposals plus mask union can approximate human localization difficulty; no explicit free parameters are named in the abstract, but implicit choices include the number of proposals retained and the segmentation model.

axioms (1)

Domain assumption: Category-conditioned object detectors can generate proposals whose masks meaningfully reflect visual evidence available to humans. Invoked when defining the max-IoU recoverable localization signal.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

Each entry lists the source file, the theorem name, and a tag for the relation between the paper passage and the cited Recognition theorem.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · tag: unclear
Paper passage: "SeamCam generates category-conditioned detection proposals, extracts segmentation masks, and identifies the subset whose collective union yields the highest IoU with the ground-truth mask. The SeamCam score is one minus this maximum recoverable localization signal"

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · absolute_floor_iff_bare_distinguishability · tag: unclear
Paper passage: "a well-camouflaged animal is one that remains difficult to detect even when its category is known"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

55 extracted references · 55 canonical work pages · 4 internal anchors

[1]

Bulletin of Electrical Engineering and Infor- matics (BEEI)10(4) (2021)

Andang Sunarto, A.: Modified iterative method with red-black ordering for image composition using poisson equation. Bulletin of Electrical Engineering and Infor- matics (BEEI)10(4) (2021)

work page 2021
[2]

IEEE computer Graphics and Applications 23(4), 38–43 (2003)

Ashikhmin, N.: Fast texture transfer. IEEE computer Graphics and Applications 23(4), 38–43 (2003)

work page 2003
[3]

In: Proceedings of the 29th annual conference on Computer graphics and interactive techniques

Barrett, W.A., Cheney, A.S.: Object-based image editing. In: Proceedings of the 29th annual conference on Computer graphics and interactive techniques. pp. 777– 784 (2002)

work page 2002
[4]

Journal of Computer Science and Technology26(6), 1011–1016 (2011)

Bie, X.H., Huang, H.D., Wang, W.C.: Free appearance-editing with improved pois- son image cloning. Journal of Computer Science and Technology26(6), 1011–1016 (2011)

work page 2011
[5]

Demystifying MMD GANs

Bi´ nkowski, M., Sutherland, D.J., Arbel, M., Gretton, A.: Demystifying mmd gans. arXiv preprint arXiv:1801.01401 (2018)

work page internal anchor Pith review Pith/arXiv arXiv 2018
[6]

ACM Trans

Chu, H.K., Hsu, W.H., Mitra, N.J., Cohen-Or, D., Wong, T.T., Lee, T.Y.: Cam- ouflage images. ACM Trans. Graph.29(4), 51–1 (2010)

work page 2010
[7]

IEEE transactions on pattern analysis and machine intelligence45(9), 10850–10869 (2023)

Croitoru, F.A., Hondru, V., Ionescu, R.T., Shah, M.: Diffusion models in vision: A survey. IEEE transactions on pattern analysis and machine intelligence45(9), 10850–10869 (2023)

work page 2023
[8]

Advances in Neural Information Processing Systems37, 101116–101143 (2024)

Daras, G., Nie, W., Kreis, K., Dimakis, A., Mardani, M., Kovachki, N., Vahdat, A.: Warped diffusion: Solving video inverse problems with image diffusion models. Advances in Neural Information Processing Systems37, 101116–101143 (2024)

work page 2024
[9]

In: Proceedings of the Com- puter Vision and Pattern Recognition Conference

Das, B., Gopalakrishnan, V.: Camouflage anything: Learning to hide using con- trolled out-painting and representation engineering. In: Proceedings of the Com- puter Vision and Pattern Recognition Conference. pp. 3603–3613 (2025)

work page 2025
[10]

In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition

Duan, R., Ma, X., Wang, Y., Bailey, J., Qin, A.K., Yang, Y.: Adversarial cam- ouflage: Hiding physical-world attacks with natural styles. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 1000–1008 (2020)

work page 2020
[11]

In: Seminal Graphics Papers: Pushing the Boundaries, Volume 2, pp

Efros, A.A., Freeman, W.T.: Image quilting for texture synthesis and transfer. In: Seminal Graphics Papers: Pushing the Boundaries, Volume 2, pp. 571–576. Association for Computing Machinery (2023)

work page 2023
[12]

IEEE Transactions on Image Processing26(5), 2338–2351 (2017)

Elad, M., Milanfar, P.: Style transfer via texture synthesis. IEEE Transactions on Image Processing26(5), 2338–2351 (2017)

work page 2017
[13]

In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition

Fan, D.P., Ji, G.P., Sun, G., Cheng, M.M., Shen, J., Shao, L.: Camouflaged object detection. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 2777–2787 (2020)

work page 2020
[14]

ACM Transactions on graphics (TOG)28(3), 1–9 (2009) 16 A.K

Farbman, Z., Hoffer, G., Lipman, Y., Cohen-Or, D., Lischinski, D.: Coordinates for instant image cloning. ACM Transactions on graphics (TOG)28(3), 1–9 (2009) 16 A.K. Monsefi et al

work page 2009
[15]

In: European Conference on Computer Vision

Hatamizadeh, A., Song, J., Liu, G., Kautz, J., Vahdat, A.: Diffit: Diffusion vision transformers for image generation. In: European Conference on Computer Vision. pp. 37–55. Springer (2024)

work page 2024
[16]

Advances in neural information processing systems30(2017)

Heusel, M., Ramsauer, H., Unterthiner, T., Nessler, B., Hochreiter, S.: Gans trained by a two time-scale update rule converge to a local nash equilibrium. Advances in neural information processing systems30(2017)

work page 2017
[17]

Advances in neural information processing systems33, 6840–6851 (2020)

Ho, J., Jain, A., Abbeel, P.: Denoising diffusion probabilistic models. Advances in neural information processing systems33, 6840–6851 (2020)

work page 2020
[18]

arXiv preprint arXiv:2302.11797 (2023)

Huang, N., Tang, F., Dong, W., Lee, T.Y., Xu, C.: Region-aware diffusion for zero-shot text-driven image editing. arXiv preprint arXiv:2302.11797 (2023)

work page arXiv 2023
[19]

ELCVIA: electronic letters on computer vision and image analysis14(2), 45–57 (2015)

Hussain, K.F., Kamel, R.M.: Efficient poisson image editing. ELCVIA: electronic letters on computer vision and image analysis14(2), 45–57 (2015)

work page 2015
[20]

Machine Intelligence Research 20(1), 92–108 (2023)

Ji, G.P., Fan, D.P., Chou, Y.C., Dai, D., Liniger, A., Van Gool, L.: Deep gradient learning for efficient camouflaged object detection. Machine Intelligence Research 20(1), 92–108 (2023)

work page 2023
[21]

arXiv preprint arXiv:2510.12798 (2025)

Jiang, Q., Huo, J., Chen, X., Xiong, Y., Zeng, Z., Chen, Y., Ren, T., Yu, J., Zhang, L.: Detect anything via next point prediction. arXiv preprint arXiv:2510.12798 (2025)

work page arXiv 2025
[22]

In: European Conference on Computer Vision

Khurana, M., Daw, A., Maruf, M., Uyeda, J.C., Dahdul, W., Charpentier, C., Bakı¸ s, Y., Bart Jr, H.L., Mabee, P.M., Lapp, H., et al.: Hierarchical conditioning of diffusion models using tree-of-life for studying species evolution. In: European Conference on Computer Vision. pp. 137–153. Springer (2024)

work page 2024
[23]

Taxaadapter: Vision taxonomy models are key to fine-grained image generation over the tree of life.arXiv preprint arXiv:2603.26128, 2026

Khurana, M., Monsefi, A.K., Lee, J., Sawhney, M., Carlyn, D., Chae, J., Gu, J., Ramnath, R., Beery, S., Chao, W.L., et al.: Taxaadapter: Vision taxonomy mod- els are key to fine-grained image generation over the tree of life. arXiv preprint arXiv:2603.26128 (2026)

work page arXiv 2026
[24]

In: Proceedings of the 8th International Symposium on Non-Photorealistic Animation and Rendering

Lee, H., Seo, S., Ryoo, S., Yoon, K.: Directional texture transfer. In: Proceedings of the 8th International Symposium on Non-Photorealistic Animation and Rendering. pp. 43–48 (2010)

work page 2010
[25]

In: ACM SIGGRAPH 2006 Research posters, pp

Leventhal, D., Gordon, B., Sibley, P.G.: Poisson image editing extended. In: ACM SIGGRAPH 2006 Research posters, pp. 78–es. the Association for Computing Ma- chinery (ACM) (2006)

work page 2006
[26]

IEEE Transactions on Multimedia25, 5234–5247 (2022)

Li, Y., Zhai, W., Cao, Y., Zha, Z.J.: Location-free camouflage generation network. IEEE Transactions on Multimedia25, 5234–5247 (2022)

work page 2022
[27]

In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition

Li, Y., Liu, H., Wu, Q., Mu, F., Yang, J., Gao, J., Li, C., Lee, Y.J.: Gligen: Open-set grounded text-to-image generation. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 22511–22521 (2023)

work page 2023
[28]

In: European conference on computer vision

Liu, S., Zeng, Z., Ren, T., Li, F., Zhang, H., Yang, J., Jiang, Q., Li, C., Yang, J., Su, H., et al.: Grounding dino: Marrying dino with grounded pre-training for open-set object detection. In: European conference on computer vision. pp. 38–55. Springer (2024)

work page 2024
[29]

In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition

Lugmayr, A., Danelljan, M., Romero, A., Yu, F., Timofte, R., Van Gool, L.: Re- paint: Inpainting using denoising diffusion probabilistic models. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 11461– 11471 (2022)

work page 2022
[30]

Philosoph- ical Transactions of the Royal Society B: Biological Sciences372(1724) (2017)

Merilaita, S., Scott-Samuel, N.E., Cuthill, I.C.: How camouflage works. Philosoph- ical Transactions of the Royal Society B: Biological Sciences372(1724) (2017)

work page 2017
[31]

Minderer, M., Gritsenko, A., Houlsby, N.: Scaling open-vocabulary object detection (2023) SeamCam 17

work page 2023
[32]

In: Proceedings of the IEEE/CVF International Conference on Computer Vision

Monsefi, A.K., Khurana, M., Ramnath, R., Karpatne, A., Chao, W.L., Zhang, C.: Taxadiffusion: Progressively trained diffusion model for fine-grained species gener- ation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. pp. 8579–8589 (2025)

work page 2025
[33]

arXiv preprint arXiv:2409.06809 (2024)

Monsefi, A.K., Sailaja, K.P., Alilooee, A., Lim, S.N., Ramnath, R.: Detailclip: Detail-oriented clip for fine-grained tasks. arXiv preprint arXiv:2409.06809 (2024)

work page arXiv 2024
[34]

IEEE Transactions on Pattern Analysis and Ma- chine Intelligence47(2), 1161–1180 (2024)

Montesuma, E.F., Mboula, F.M.N., Souloumiac, A.: Recent advances in optimal transport for machine learning. IEEE Transactions on Pattern Analysis and Ma- chine Intelligence47(2), 1161–1180 (2024)

work page 2024
[35]

Pattern Recognition Letters33(3), 342–348 (2012)

Morel, J.M., Petro, A.B., Sbert, C.: Fourier implementation of poisson image edit- ing. Pattern Recognition Letters33(3), 342–348 (2012)

work page 2012
[36]

In: Proceedings of the Computer Vision and Pattern Recog- nition Conference

Na, S., Kim, Y., Lee, H.: Boost your human image generation model via direct pref- erence optimization. In: Proceedings of the Computer Vision and Pattern Recog- nition Conference. pp. 23551–23562 (2025)

work page 2025
[37]

Knobgen: Controlling the sophistication of artwork in sketch-based diffusion models

Navard, P., Monsefi, A.K., Zhou, M., Chao, W.L., Yilmaz, A., Ramnath, R.: Knob- gen: controlling the sophistication of artwork in sketch-based diffusion models. arXiv preprint arXiv:2410.01595 (2024)

work page arXiv 2024
[38]

IEEE Access (2024)

Nguyen, T.D., Vu, A.K.N., Nguyen, N.D., Nguyen, V.T., Ngo, T.D., Do, T.T., Tran, M.T., Nguyen, T.V.: The art of camouflage: Few-shot learning for animal detection and segmentation. IEEE Access (2024)

work page 2024
[39]

arXiv preprint arXiv:2601.09881 (2026)

Nie, W., Berner, J., Ma, N., Liu, C., Xie, S., Vahdat, A.: Transition matching distillation for fast video generation. arXiv preprint arXiv:2601.09881 (2026)

work page arXiv 2026
[40]

In: Seminal Graphics Pa- pers: Pushing the Boundaries, Volume 2, pp

P´ erez, P., Gangnet, M., Blake, A.: Poisson image editing. In: Seminal Graphics Pa- pers: Pushing the Boundaries, Volume 2, pp. 577–582. Association for Computing Machinery (2023)

work page 2023
[41]

Advances in neural information processing systems36, 53728–53741 (2023)

Rafailov, R., Sharma, A., Mitchell, E., Manning, C.D., Ermon, S., Finn, C.: Direct preference optimization: Your language model is secretly a reward model. Advances in neural information processing systems36, 53728–53741 (2023)

work page 2023
[42]

SAM 2: Segment Anything in Images and Videos

Ravi, N., Gabeur, V., Hu, Y.T., Hu, R., Ryali, C., Ma, T., Khedr, H., R¨ adle, R., Rolland, C., Gustafson, L., et al.: Sam 2: Segment anything in images and videos. arXiv preprint arXiv:2408.00714 (2024)

work page internal anchor Pith review Pith/arXiv arXiv 2024
[43]

In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition

Rombach, R., Blattmann, A., Lorenz, D., Esser, P., Ommer, B.: High-resolution image synthesis with latent diffusion models. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 10684–10695 (2022)

work page 2022
[44]

In: International Conference on Learning Representations (2020)

Song, J., Meng, C., Ermon, S.: Denoising diffusion implicit models. In: International Conference on Learning Representations (2020)

work page 2020
[45]

AudioX: A Unified Framework for Anything-to-Audio Generation

Tian, Z., Jin, Y., Liu, Z., Yuan, R., Tan, X., Chen, Q., Xue, W., Guo, Y.: Audiox: Diffusion transformer for anything-to-audio generation. arXiv preprint arXiv:2503.10522 (2025)

work page internal anchor Pith review Pith/arXiv arXiv 2025
[46]

Perception & Psychophysics66(3), 517–533 (2004)

Ulrich, R., Miller, J.: Threshold estimation in two-alternative forced-choice (2afc) tasks: The spearman-k¨ arber method. Perception & Psychophysics66(3), 517–533 (2004)

work page 2004
[47]

In: Proceedings of the IEEE conference on computer vision and pattern recognition

Van Horn, G., Mac Aodha, O., Song, Y., Cui, Y., Sun, C., Shepard, A., Adam, H., Perona, P., Belongie, S.: The inaturalist species classification and detection dataset. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 8769–8778 (2018)

work page 2018
[48]

In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition

Wallace, B., Dang, M., Rafailov, R., Zhou, L., Lou, A., Purushwalkam, S., Ermon, S., Xiong, C., Joty, S., Naik, N.: Diffusion model alignment using direct preference optimization. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 8228–8238 (2024) 18 A.K. Monsefi et al

work page 2024
[49]

arXiv preprint arXiv:2512.07076 (2024)

Wang, C.Y., Ji, G.P., Shao, S., Cheng, M.M., Fan, D.P.: Context-measure: Con- textualizing metric for camouflage. arXiv preprint arXiv:2512.07076 (2024)

work page arXiv 2024
[50]

Wang, Z., Zhao, L., Chen, H., Li, A., Zuo, Z., Xing, W., Lu, D.: Texture reformer: Towards fast and universal interactive texture transfer. In: Proceedings of the AAAI Conference on Artificial Intelligence. vol. 36, pp. 2624–2632 (2022)

[51]

Wu, X., Hao, Y., Sun, K., Chen, Y., Zhu, F., Zhao, R., Li, H.: Human preference score v2: A solid benchmark for evaluating human preferences of text-to-image synthesis. arXiv preprint arXiv:2306.09341 (2023)

[52]

Yu, T., Song, K., Miao, P., Yang, G., Yang, H., Chen, C.: Nighttime single image dehazing via pixel-wise alpha blending. IEEE Access 7, 114619–114630 (2019)

[53]

Zhang, Q., Yin, G., Nie, Y., Zheng, W.S.: Deep camouflage images. In: Proceedings of the AAAI conference on artificial intelligence. vol. 34, pp. 12845–12852 (2020)

[54]

Zhao, P., Xu, P., Qin, P., Fan, D.P., Zhang, Z., Jia, G., Zhou, B., Yang, J.: LAKE-RED: Camouflaged images generation by latent background knowledge retrieval-augmented diffusion. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 4092–4101 (2024)

[55]

Zheng, C., Cham, T.J., Cai, J., Phung, D.: Bridging global context interactions for high-fidelity image completion. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 11512–11522 (2022)

Appendix

This appendix provides supplementary details that support the main paper. Section A describes the training and ...
