Ninja Codes: Neurally Generated Fiducial Markers for Stealthy 6-DoF Tracking

Shunya Kato; Yuichiro Takeuchi; Yusuke Imoto

arxiv: 2510.18976 · v2 · submitted 2025-10-21 · 💻 cs.CV · cs.HC

Pith reviewed 2026-05-18 04:28 UTC · model grok-4.3

keywords: fiducial markers, neural image encoding, 6-DoF pose estimation, stealthy tracking, augmented reality, robotics vision, marker detection

The pith

Neural networks generate printable markers that blend into surroundings while supporting accurate 6-DoF tracking via RGB cameras.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows how to turn ordinary images into fiducial markers using a neural encoder that makes only small visual changes. These Ninja Codes can be printed on regular paper and then detected by standard RGB cameras running inference to give precise position and orientation in three dimensions. This matters because traditional markers are obvious and can spoil the appearance of a space, limiting their use in homes, offices, or artistic installations. If the approach works as described, tracking technology becomes practical in more everyday settings without the visual clutter. A reader would care if they want to add AR elements or robot navigation to real environments without obvious stickers or patterns.

Core claim

Ninja Codes are created by an encoder network that applies modest alterations to arbitrary images, allowing the printed results to provide stealthy 6-DoF location tracking when detected by RGB camera inference, while naturally blending into various real-world environmental textures under typical indoor lighting.

What carries the argument

The encoder network that modifies input images with subtle alterations to embed tracking information while maintaining visual similarity to the original texture.

If this is right

Tracking systems can operate in visually sensitive areas where standard fiducial markers would be too noticeable.
Applications in robotics and augmented reality gain access to reliable pose data from everyday surfaces.
Deployment requires only standard color printers and consumer devices with cameras and inference support.
Markers adapt to different backgrounds by starting from images that match the target texture.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Extending training to more lighting conditions could allow use outdoors or in dynamic environments.
This method might combine with other computer vision tasks like object recognition on the same images.
Future work could test durability of the printed codes over time or under wear.

Load-bearing premise

The alterations made by the network are small enough to blend into many different textures and lighting conditions but large enough to allow reliable extraction of 6-DoF pose information.

What would settle it

Print Ninja Codes on multiple surface types, place them in varied indoor scenes under common lighting, and check whether detection yields correct 6-DoF estimates in most cases. A significant drop in accuracy across those conditions would count against the claim.
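A check of this kind needs a concrete definition of a "correct" 6-DoF estimate. The following is a minimal sketch, assuming each trial records an estimated pose and a ground-truth pose as a rotation matrix plus a translation vector. The function names, the trial format, and the tolerance values are illustrative assumptions and are not taken from the paper.

```python
import math

def rotation_error_deg(R_est, R_gt):
    """Geodesic angle in degrees between two 3x3 rotation matrices."""
    # trace(R_est^T @ R_gt) equals the sum of elementwise products.
    trace = sum(R_est[i][j] * R_gt[i][j] for i in range(3) for j in range(3))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))

def translation_error(t_est, t_gt):
    """Euclidean distance between estimated and true positions."""
    return math.dist(t_est, t_gt)

def success_rate(trials, max_rot_deg=5.0, max_trans=0.02):
    """Fraction of trials with a detection inside both tolerances.

    Each trial is (estimate, ground_truth); a pose is (R, t) and a
    missed detection is recorded as estimate=None.
    The default tolerances are hypothetical, not taken from the paper.
    """
    if not trials:
        raise ValueError("need at least one trial")
    hits = 0
    for estimate, (R_gt, t_gt) in trials:
        if estimate is None:
            continue  # a missed detection counts as a failure
        R_est, t_est = estimate
        if (rotation_error_deg(R_est, R_gt) <= max_rot_deg
                and translation_error(t_est, t_gt) <= max_trans):
            hits += 1
    return hits / len(trials)

def rot_z(deg):
    """Rotation about the z axis, used here only to build example poses."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
```

Counting missed detections as failures keeps the success rate from looking better than it is when a marker blends in so well that the detector does not find it at all.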

Figures

Figures reproduced from arXiv: 2510.18976 by Shunya Kato, Yuichiro Takeuchi, Yusuke Imoto.

**Figure 1.** We present Ninja Codes, inconspicuous fiducial markers that can be made to blend into various real-world environ…

**Figure 2.** Ninja Codes end-to-end training architecture. A total of five modules are trained simultaneously: encoder, decoder, …

**Figure 3.** Noise functions to simulate perturbations. Perturbations owing to the printing method/material are simulated using …

**Figure 4.** We employ a two-phase training process. After the …

**Figure 5.** The 25 digital images used to evaluate code detection …

**Figure 6.** Poster boards used to evaluate 6-DoF tracking per…

**Figure 8.** Average time from image display to code detection …

**Figure 9.** Ninja Codes (top) and a simple augmented reality …

**Figure 10.** Artifacts register more strongly for plain or light …

**Figure 11.** A reverse encoder that takes a Ninja Code as input …

**Figure 12.** Faulty color calibration results in color discontinu…

Original abstract

In this paper we describe Ninja Codes, neurally generated fiducial markers that can be made to naturally blend into various real-world environments. An encoder network converts arbitrary images into Ninja Codes by applying visually modest alterations; the resulting codes, printed and pasted onto surfaces, can provide stealthy 6-DoF location tracking for a wide range of applications including robotics and augmented reality. Ninja Codes can be printed using standard color printers on regular printing paper, and can be detected using any device equipped with a modern RGB camera and capable of running inference. Through experiments, we demonstrate Ninja Codes' ability to provide reliable location tracking under common indoor lighting conditions, while successfully concealing themselves within diverse environmental textures. We expect Ninja Codes to offer particular value in scenarios where the conspicuous appearance of conventional fiducial markers makes them undesirable for aesthetic and other reasons.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Ninja Codes use a neural encoder to tweak images into printable markers that hide in scenes for 6-DoF tracking, but the evidence for reliable performance across real textures and lighting is still thin.

Hi, the main point here is a neural encoder that takes an arbitrary image and adds modest changes so the result can be printed as a fiducial marker that blends into the background while still supporting 6-DoF pose recovery from a regular RGB camera. That framing is the actual new piece compared with hand-designed markers like AprilTags. They keep the setup practical by targeting standard printers and off-the-shelf cameras, which is a clear plus for robotics or AR use cases where obvious patterns are a problem. The indoor experiments they describe sound like a reasonable starting point for showing the idea can work under controlled conditions.

The soft spot is exactly the trade-off the stress-test note flags. Small visual changes that preserve natural appearance can easily remove the structured signal a detector needs for accurate pose, especially once you add printing artifacts, camera noise, and lighting shifts. The abstract claims success on diverse textures, but without reported pose errors, detection rates, perceptual metrics, or failure cases, it is hard to tell whether the modifications stay in the sweet spot. If the full paper has those numbers and comparisons, the concern shrinks; otherwise it stays central.

This is aimed at CV and robotics groups already working on marker-based tracking who might want a less obtrusive option. A reader could get value from trying the encoder idea even if the current results need tightening. It is worth sending to peer review so referees can press on the experimental details and see whether the central claim actually holds.

Referee Report

1 major / 1 minor

Summary. The paper introduces Ninja Codes, neurally generated fiducial markers created by an encoder network that applies visually modest alterations to arbitrary images. These markers can be printed on standard paper with color printers and detected via RGB cameras on devices running inference, enabling stealthy 6-DoF pose estimation for robotics and AR. Experiments claim reliable tracking under common indoor lighting while the markers blend into diverse environmental textures.

Significance. If the central claims hold, Ninja Codes could provide a practical advance over traditional conspicuous fiducial markers by enabling aesthetically integrated tracking. This has potential value in applications where visible markers are undesirable, combining neural image synthesis with pose regression in a way that could influence future work on unobtrusive computer vision systems.

major comments (1)

[Experiments] The experimental description provides no quantitative perceptual metrics (e.g., SSIM, LPIPS, or user studies) to verify that the neural alterations remain modest enough to blend naturally, nor failure-case analysis or accuracy metrics (with error bars) for 6-DoF estimation across varied textures and lighting. This directly bears on the central claim, as modest perturbations can easily eliminate the structured signals needed for reliable pose regression once printing quantization, camera noise, and real-world lighting are introduced.

minor comments (1)

The abstract and introduction would benefit from a concise statement of the encoder and detector network architectures and training objectives to improve clarity for readers.
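To make the perceptual metrics named in the major comment concrete, here is a minimal sketch of SSIM computed over a whole grayscale image in one window. This is a simplification: standard SSIM averages the same statistic over local windows, and LPIPS needs a pretrained network, so neither is reproduced here. The function name and the nested-list image format are illustrative assumptions.

```python
def global_ssim(img_a, img_b, dynamic_range=255.0):
    """Single-window SSIM between two equally sized grayscale images.

    Images are nested lists of pixel intensities. Standard SSIM uses
    local sliding windows; this global variant is a simplified stand-in.
    """
    a = [p for row in img_a for p in row]
    b = [p for row in img_b for p in row]
    if not a or len(a) != len(b):
        raise ValueError("images must be non-empty and the same size")
    n = len(a)
    mu_a = sum(a) / n
    mu_b = sum(b) / n
    var_a = sum((x - mu_a) ** 2 for x in a) / n
    var_b = sum((y - mu_b) ** 2 for y in b) / n
    cov = sum((x - mu_a) * (y - mu_b) for x, y in zip(a, b)) / n
    # Stabilizing constants from the usual SSIM definition.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    numerator = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    denominator = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return numerator / denominator
```

A score of 1 means the two images are identical under this measure. Comparing each source texture with its encoded counterpart would give one number per marker that could be reported alongside detection results.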

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the detailed and constructive feedback on our manuscript. We address the major comment regarding the experimental section below and agree that additional quantitative analysis will strengthen the presentation of our results.

Referee: [Experiments] The experimental description provides no quantitative perceptual metrics (e.g., SSIM, LPIPS, or user studies) to verify that the neural alterations remain modest enough to blend naturally, nor failure-case analysis or accuracy metrics (with error bars) for 6-DoF estimation across varied textures and lighting. This directly bears on the central claim, as modest perturbations can easily eliminate the structured signals needed for reliable pose regression once printing quantization, camera noise, and real-world lighting are introduced.

Authors: We agree that the current experiments would be strengthened by quantitative support for both the perceptual blending and the tracking accuracy claims. In the revised manuscript we will add SSIM and LPIPS scores computed between the original textures and the generated Ninja Codes to provide an objective measure of visual modesty. We will also include results from a user study in which participants rate the naturalness of the markers when placed in diverse indoor scenes. For 6-DoF estimation we will report mean pose errors with standard error bars across multiple textures and lighting conditions, together with a dedicated failure-case analysis that examines cases where tracking degrades due to printing artifacts, camera noise, or extreme lighting. These additions will be placed in an expanded Experiments section and will directly address the concern that modest perturbations may not survive real-world imaging conditions.

Revision: yes
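Reporting mean pose errors with standard error bars per condition could look like the sketch below. It assumes the per-trial errors have already been grouped by a condition label such as a texture or lighting setup. The function names and the dictionary layout are illustrative assumptions, and no values from the paper are used.

```python
import math
import statistics

def mean_and_standard_error(values):
    """Return (mean, standard error of the mean) for a list of numbers."""
    if len(values) < 2:
        raise ValueError("need at least two values for a standard error")
    mean = statistics.fmean(values)
    # Sample standard deviation divided by sqrt(n).
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem

def summarize_by_condition(errors_by_condition):
    """Map each condition label to its (mean, standard error) pair."""
    return {
        condition: mean_and_standard_error(errors)
        for condition, errors in errors_by_condition.items()
    }
```

Keeping the summary per condition rather than pooled makes it visible when one texture or lighting setup drives most of the error.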

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained via neural training and empirical tests

The paper describes an encoder network that converts arbitrary images into Ninja Codes via modest visual alterations, with the stealthy 6-DoF tracking capability shown through experiments on printed markers under indoor lighting. No load-bearing steps reduce by construction to inputs: there are no self-definitional equations, fitted parameters renamed as predictions, or self-citation chains that justify the core claim. The abstract and description indicate a standard generative ML pipeline whose outputs are validated externally rather than assumed by definition.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Abstract-only review limits visibility into training details; the approach implicitly relies on standard neural network assumptions for image-to-marker conversion and detection without explicit free parameters or invented entities stated.

axioms (1)

Domain assumption: Neural networks can be trained to apply visually modest alterations that preserve detectability for pose estimation.
Invoked in the description of the encoder network converting arbitrary images into Ninja Codes.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

Each tag describes the relation between the paper passage and the cited Recognition theorem.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · tag: unclear

Paper passage: "An encoder network converts arbitrary images into Ninja Codes by applying visually modest alterations... jointly train a series of network modules that perform the creation and detection of Ninja Codes... differentiable noise functions... Image Loss, Regression Loss, Keypoint Loss, Message Loss, Adversary Loss"

IndisputableMonolith/Foundation/AlphaCoordinateFixation.lean · alpha_pin_under_high_calibration · tag: unclear

Paper passage: "We employ a two-phase training process... Phase 1... Phase 2... weights w_i progressively increased"
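The quoted passages name five loss terms and weights w_i that are progressively increased, but the schedule itself is elided. The sketch below shows one way such a weighted sum with ramped weights could be written. The linear ramp, the function names, and every parameter are assumptions made for illustration, not the authors' training procedure.

```python
def ramped_weight(step, start_step, ramp_steps, max_weight):
    """Weight that is 0 until start_step, then rises linearly to max_weight.

    The linear shape is an assumption; the source only says the
    weights are progressively increased.
    """
    if ramp_steps <= 0:
        raise ValueError("ramp_steps must be positive")
    if step <= start_step:
        return 0.0
    progress = min(1.0, (step - start_step) / ramp_steps)
    return max_weight * progress

def total_loss(losses, weights):
    """Weighted sum of named loss terms, i.e. the sum of w_i * L_i."""
    return sum(weights[name] * value for name, value in losses.items())
```

With loss names such as "image", "regression", "keypoint", "message", and "adversary" as dictionary keys, the same two functions cover both a fixed-weight phase and a ramped phase by changing only the weights passed in.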

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

