arxiv: 2605.26855 · v1 · pith:5WKR3BQFnew · submitted 2026-05-26 · 💻 cs.CV

Receipt Replay OOD: A Small Benchmark for Screen Replay Detection Under Domain Shift

Pith reviewed 2026-06-29 18:20 UTC · model grok-4.3

classification 💻 cs.CV
keywords screen replay detection · out-of-domain robustness · presentation attack detection · document verification · benchmark dataset · domain shift

The pith

Receipts form a privacy-safe out-of-domain test for screen replay detectors trained on identity documents.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Receipt Replay OOD as a small benchmark to measure how screen replay detection models handle domain shift away from identity documents. Receipts match identity documents in planar shape, curved corners, wear artifacts, and printed patterns but carry no personal information. Cross-domain tests on this set show clear drops in model performance: in the ROC AUC table extracted alongside the references, all three evaluated models score lower on RR OOD than on DLC, with reported deltas from -4.96 (ViT-S/DINOv2) to -41.75 (custom CNN). This setup lets researchers probe generalization limits without the privacy barriers that block larger public ID datasets.

Core claim

Receipt Replay OOD demonstrates that document replay detection models suffer measurable performance loss when evaluated on receipts, which share planar geometry, curved corners, wear-and-tear artifacts, and text or logo patterns with identity documents yet avoid personally identifiable information constraints.

What carries the argument

The Receipt Replay OOD benchmark dataset, used as an out-of-domain test set to expose generalization failures under domain shift.

If this is right

  • Existing replay detectors trained on public ID datasets will show reduced accuracy on receipt images.
  • Domain shift between document types directly degrades presentation attack detection performance.
  • Receipts can substitute for identity documents in robustness testing while sidestepping data-privacy rules.
  • Benchmark results quantify the size of the generalization gap that future training methods must close.
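The generalization gap in the last bullet can be made concrete as the difference between out-of-domain and in-domain ROC AUC. The following is a minimal sketch, not the authors' evaluation code: the function names, the label convention (1 = replay, 0 = genuine), and the toy scores are all assumptions chosen for illustration.

```python
from typing import Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC via the Mann-Whitney U statistic.

    Assumed convention: label 1 = screen replay, label 0 = genuine capture,
    and a higher score means "more likely replay".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    # Fraction of (positive, negative) pairs ranked correctly; ties count half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def generalization_gap(
    in_domain: Tuple[Sequence[int], Sequence[float]],
    out_of_domain: Tuple[Sequence[int], Sequence[float]],
) -> float:
    """Out-of-domain AUC minus in-domain AUC; negative means degradation."""
    return roc_auc(*out_of_domain) - roc_auc(*in_domain)


# Toy scores only (illustrative, not taken from the paper).
id_set = ([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9])
ood_set = ([0, 0, 1, 1], [0.1, 0.6, 0.4, 0.9])

gap = generalization_gap(id_set, ood_set)
print(f"in-domain AUC  = {roc_auc(*id_set):.2f}")
print(f"OOD AUC        = {roc_auc(*ood_set):.2f}")
print(f"gap (OOD - ID) = {gap:+.2f}")
```

The pairwise formulation is quadratic in the number of samples, which is acceptable for a small benchmark; a rank-based implementation would be preferable for large test sets.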

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Larger receipt collections with varied lighting and camera angles could strengthen the benchmark for future OOD studies.
  • Training recipes that explicitly regularize for shape and texture invariance might reduce the observed domain gap.
  • The same receipt proxy idea could apply to other document-security tasks that currently face PII restrictions.

Load-bearing premise

Receipts share enough geometric, textural, and artifact characteristics with identity documents to make performance on receipts a meaningful signal of out-of-domain robustness.

What would settle it

A controlled experiment in which the same models are tested on both Receipt Replay OOD and a held-out set of real identity documents under matched replay conditions; if accuracy patterns diverge sharply, the proxy claim fails.
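One way to operationalize "accuracy patterns diverge" is to check whether models are ranked the same way on the receipt proxy and on the held-out identity documents. The sketch below uses Spearman rank correlation for that purpose; this choice of statistic, the function names, and the toy per-model scores are assumptions for illustration and do not come from the paper.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks in ascending order; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long sequences with >= 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("rank correlation undefined for constant input")
    return cov / (vx * vy) ** 0.5


# Toy per-model AUCs (illustrative placeholders, NOT measurements).
proxy_auc = [0.50, 0.70, 0.80]    # hypothetical scores on the receipt proxy
real_id_auc = [0.60, 0.65, 0.90]  # hypothetical scores on held-out real IDs

print(f"rank agreement = {spearman(proxy_auc, real_id_auc):+.2f}")
```

A correlation near +1 would mean the proxy orders models the same way as real identity documents; a value near zero or below would count against the proxy claim. With only a handful of models the statistic is coarse, so it is best read alongside the raw per-model gaps.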

Original abstract

Public datasets such as DLC-2021, SynID, and KID34K have significantly contributed to research on presentation attack detection for identity documents, including screen replay attacks. However, evaluation of out-of-domain (OOD) robustness remains insufficiently explored, especially under realistic domain shifts. In this work, we introduce Receipt Replay OOD, a small out-of-domain benchmark for screen replay detection. Receipts share several characteristics with identity documents, including planar geometry, curved corners, wear-and-tear artifacts, and text or logo patterns, while avoiding personally identifiable information constraints commonly associated with identity documents. We evaluate document replay detection models under cross-domain conditions and demonstrate the impact of domain shift on generalization performance. The dataset is publicly available.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The manuscript introduces Receipt Replay OOD, a small out-of-domain benchmark for screen replay detection. Receipts are proposed as a suitable proxy for identity documents because they share planar geometry, curved corners, wear-and-tear artifacts, and text or logo patterns, while avoiding PII constraints. The work evaluates document replay detection models under cross-domain conditions and demonstrates the impact of domain shift on generalization performance. The dataset is made publicly available.

Significance. If the evaluation supports the claims, this benchmark fills a gap in OOD robustness testing for presentation attack detection by providing a PII-free alternative to identity document datasets such as DLC-2021, SynID, and KID34K. The public availability of the dataset supports reproducibility and further research in the field. This is a modest but useful contribution for testing generalization of replay detectors.

minor comments (2)
  1. The abstract would be strengthened by including at least one quantitative result illustrating the domain shift impact.
  2. [Dataset] Provide more details on the number of samples, acquisition conditions, and any preprocessing steps applied to the receipt images in the dataset description section.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive assessment of Receipt Replay OOD as a useful PII-free benchmark for OOD robustness in screen replay detection, and for recommending minor revision. No specific major comments were raised in the report.

Circularity Check

0 steps flagged

No significant circularity

The paper introduces a benchmark dataset and reports cross-domain evaluation results. It contains no equations, derivations, parameter fitting, or predictive claims that could reduce to inputs by construction. The justification for receipts as a proxy for ID documents is an explicit design assumption rather than a derived result. No self-citations are load-bearing for any central claim.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central contribution rests on the domain assumption that receipts are suitable visual proxies for identity documents in replay detection tasks.

axioms (1)
  • domain assumption Receipts share planar geometry, curved corners, wear-and-tear artifacts, and text or logo patterns with identity documents.
    Invoked in abstract to justify use of receipts as OOD proxy while avoiding PII.

pith-pipeline@v0.9.1-grok · 5644 in / 1074 out tokens · 37582 ms · 2026-06-29T18:20:03.670929+00:00 · methodology

Reference graph

Works this paper leans on

9 extracted references · 6 canonical work pages

  1. [1]

    Introduction and Related Works. Multiple datasets have been introduced for presentation attack detection in the domain of identity documents. The DLC-2021 dataset (Polevoy et al., 2022), based on images from the MIDV family, contains 10 document types and multiple attack categories including color photocopies, grayscale copies, and screen replay attacks ca...

  2. [2]

    Dataset description. Receipts were selected as lightweight real-world planar objects sharing several visual characteristics with identity documents, including text regions, curved geometry, folds, and wear-and-tear artifacts. At the same time, receipts avoid legal and privacy limitations associated with identity documents, making them suitable for public O...

  3. [3]

    Experiments. To support this research, we trained three models for the screen replay detection task using the original DLC-2021 (RE) train-test split. The evaluated architectures included a custom ResNet-inspired model trained from scratch, EfficientNet-B0V2 fine-tuned from ImageNet pretraining, and ViT-Small with frozen DINOv2 backbone and fine-tuned clas...

  4. [4]

    Park, E.-J., Back, S.-Y., Kim, J., & Woo, S. S. (2023). KID34K: A dataset for online identity card fraud detection. In Proceedings of the 2023 ACM Workshop on Information Hiding and Multimedia Security (pp. 191–196). ACM. https://doi.org/10.1145/3583780.3615122

  5. [5]

    Polevoy, D. V., Sigareva, I. V., Ershova, D. M., Arlazarov, V. V., Nikolaev, D. P., Ming, Z., Luqman, M. M., & Burie, J.-C. (2022). Document liveness challenge dataset (DLC-2021). Journal of Imaging, 8(7), 181. https://doi.org/10.3390/jimaging8070181

  6. [6]

    Steinmann, D., Divo, F., Kraus, M., Wüst, A., Struppek, L., Friedrich, F., & Kersting, K. (2024). Navigating shortcuts, spurious correlations, and confounders: From origins via detection to mitigation. arXiv:2412.05152

  7. [7]

    Stehouwer, J., Jourabloo, A., Liu, Y., & Liu, X. (2020). Noise modeling, synthesis and classification for generic object anti-spoofing. arXiv:2003.13043

  8. [8]

    Tapia, J. E., Stockhardt, F., González-Soler, L. J., & Busch, C. (2025). SynID: Passport synthetic dataset for presentation attack detection. arXiv:2505.07540

  9. [9]

    Vinogradov, A. (2025). Can generative models actually forge realistic identity documents? arXiv:2601.00829

    Model               DLC ROC AUC   RR OOD ROC AUC   Δ ROC AUC
    Custom CNN          89.5          47.75            -41.75
    EfficientNet-B0V2   88.45         65.32            -23.14
    ViT-S/DINOv2        87.12         82.17            -4.96
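The ROC AUC figures that close the reference list can be sanity-checked with simple arithmetic. The sketch below recomputes each delta as RR OOD minus DLC from the printed values; the interpretation of Δ as that difference is an assumption inferred from the numbers, and the variable and function names are illustrative.

```python
# (model, DLC ROC AUC, RR OOD ROC AUC, reported delta), as printed in the source.
rows = [
    ("Custom CNN", 89.5, 47.75, -41.75),
    ("EfficientNet-B0V2", 88.45, 65.32, -23.14),
    ("ViT-S/DINOv2", 87.12, 82.17, -4.96),
]


def recompute(table):
    """Return (model, recomputed delta, |recomputed - reported|) per row."""
    out = []
    for model, dlc, ood, reported in table:
        delta = round(ood - dlc, 2)  # assumed definition: OOD minus in-domain
        out.append((model, delta, round(abs(delta - reported), 2)))
    return out


for model, delta, diff in recompute(rows):
    print(f"{model}: recomputed {delta:+.2f}, differs from reported by {diff:.2f}")
```

The first row matches exactly; the other two differ from the printed deltas by 0.01. That would be consistent with deltas computed from unrounded AUC values, although the source does not say so.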