Low-Latency Video Anonymization for Crowd Anomaly Detection: Privacy Versus Performance

Christian Walter Omlin; Lei Jiao; Mulugeta Weldezgina Asres

arxiv: 2410.18717 · v2 · submitted 2024-10-24 · 💻 cs.CV · cs.AI

Pith reviewed 2026-05-23 18:22 UTC · model grok-4.3

classification 💻 cs.CV cs.AI

keywords: video anonymization · anomaly detection · privacy protection · real-time processing · LA3D · surveillance video

The pith

The LA3D method substantially improves privacy in video anomaly detection without severely degrading detection performance.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes LA3D, a lightweight adaptive anonymization technique for real-time video anomaly detection. It uses dynamic adjustment to strengthen full-body privacy protection in surveillance videos. Evaluations on public datasets show that LA3D raises privacy levels while keeping most of the utility for anomaly detection intact. The method outperforms both conventional anonymization techniques and deep learning approaches on the privacy-performance trade-off.

Core claim

LA3D enables substantial improvement in privacy anonymization without severely degrading VAD efficacy, outperforming conventional and deep learning approaches.

What carries the argument

LA3D, a lightweight adaptive anonymization for VAD that employs dynamic adjustment to enhance full-body privacy protection.

Load-bearing premise

That the dynamic adjustment mechanism in LA3D achieves full-body privacy protection, and that the evaluations on public datasets generalize to real-world VAD applications.

What would settle it

A real-world test showing that LA3D leaves identifiable body features visible or causes a large drop in anomaly detection accuracy compared to non-anonymized video.

Figures

Figures reproduced from arXiv: 2410.18717 by Christian Walter Omlin, Lei Jiao, Mulugeta Weldezgina Asres.

**Figure 2.** The privacy attribute class distribution of the VISPR train set. The estimated relative label distribution…

**Figure 3.** Visual comparison of different AN methods on sample images from the VISPR dataset: (left to…

**Figure 4.** Impact of the αr on images with different resolutions from the VISPR dataset. (Top to bottom) image resolution Z: [160 × 120], [320 × 240], and [1280 × 960]. (Left to right) 1:RAW_IMAGE, 2:PIXELIZED_D4_A (αb = 0.5, αr = 0.5), 3:PIXELIZED_D4_A (αb = 0.5, αr = Z/Zref), 4:PIXELIZED_A (ismax = True, Da = Zb), 5:BLURRED_A (αb = 0.5, αr = 0.5), 6:BLURRED_A (αb = 0.5, αr = Z/Zref), 7:BLURRED_A (αb = 0.5, αr = Z/…

**Figure 5.** Images consisting of persons at different depths on sample images from the VISPR dataset: (left to right)…

**Figure 6.** PD per each privacy attribute on the VISPR dataset.

**Figure 7.** Sample images with a person holding an object, ReID leakage after AN due to the induced match through…

**Figure 8.** VAD versus PD performance of the AN methods with a) the PEL4VAD, and b) the MGFN models.
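
The figure captions name pixelization and blurring as the conventional AN operations being compared. As a rough illustration of what mask-restricted pixelization involves, here is a minimal NumPy sketch. It is not the authors' implementation: the function name `pixelize_region`, the `block` parameter, and the choice to average over whole grid cells are all assumptions made for this example.

```python
import numpy as np

def pixelize_region(image: np.ndarray, mask: np.ndarray, block: int) -> np.ndarray:
    """Pixelize only the masked pixels of an image (hypothetical helper).

    image: H x W (grayscale) or H x W x C array.
    mask:  H x W array, truthy where a person should be hidden.
    block: side length of the pixelization cell, in pixels (assumed parameter).
    """
    if block < 1:
        raise ValueError("block must be >= 1")
    mask = mask.astype(bool)
    out = image.astype(np.float64).copy()
    h, w = image.shape[:2]
    for y in range(0, h, block):
        for x in range(0, w, block):
            cell_mask = mask[y:y + block, x:x + block]
            if not cell_mask.any():
                continue  # no person pixels in this cell: leave it untouched
            cell = out[y:y + block, x:x + block]  # view into `out`
            # Per-channel mean over every pixel of the cell.
            mean = cell.reshape(-1, *cell.shape[2:]).mean(axis=0)
            # Overwrite only the masked pixels with the cell mean.
            cell[cell_mask] = mean
    return out.astype(image.dtype)
```

A larger `block` removes more detail inside the mask, which is the knob an adaptive scheme would tune per person or per resolution; pixels outside the mask are returned unchanged.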

Original abstract

Recent advancements in artificial intelligence hold ample potential for monitoring applications using surveillance cameras. However, concerns about privacy and model bias have made it challenging to utilize them in public. Although de-identification approaches have been proposed in the literature, aiming to achieve a certain level of anonymization (AN), most of them employ deep learning models that are computationally demanding for real-time edge deployment. This study revisits conventional AN solutions for privacy protection and real-time video anomaly detection (VAD) applications. We propose a lightweight adaptive AN for VAD (LA3D) that employs dynamic adjustment to enhance full-body privacy protection. We have evaluated privacy protection and VAD utility retention efficacy using several publicly available datasets to examine the strengths and weaknesses of different AN methods and highlight the promising leverage of our approach. Our experiment demonstrates that the LA3D enables substantial improvement in privacy AN without severely degrading VAD efficacy, outperforming conventional and deep learning approaches. Code is available at https://github.com/muleina/LA3D.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The abstract claims LA3D gives substantial privacy gains without severely degrading VAD performance, but it supplies no metrics, baselines, or numbers against which to check that claim.

The paper introduces LA3D as a lightweight adaptive anonymization method for video anomaly detection. It uses dynamic adjustment to strengthen full-body privacy while aiming to keep detection effective, and it revisits conventional techniques rather than defaulting to heavy deep learning models for edge deployment. The authors report testing on public datasets and state that their approach improves privacy without severe loss in VAD utility and outperforms both conventional and deep learning baselines. The GitHub link for the code is a concrete positive step toward reproducibility.

Referee Report

1 major / 0 minor

Summary. The paper proposes LA3D, a lightweight adaptive anonymization method for video anomaly detection (VAD) that uses dynamic adjustment to enhance full-body privacy protection. It claims evaluations on several public datasets show substantial privacy AN gains without severely degrading VAD efficacy and that LA3D outperforms conventional and deep learning approaches. Code is made available via GitHub.

Significance. If the claimed results hold, this would be significant for practical edge deployment of privacy-preserving VAD systems, as it targets the computational barriers of deep learning anonymization while addressing privacy-utility tradeoffs in real-time surveillance, a key enabler for public monitoring applications.

major comments (1)

[Abstract] The central claim that 'Our experiment demonstrates that the LA3D enables substantial improvement in privacy AN without severely degrading VAD efficacy, outperforming conventional and deep learning approaches' is presented with no supporting evidence. No privacy metric, VAD metric (e.g., AUC), datasets, baselines, numerical deltas, statistical tests, or description of the dynamic adjustment algorithm are supplied, rendering the tradeoff and superiority assertions unverifiable and unsupported.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for highlighting the need for greater specificity in the abstract. We address the concern directly below.

Referee: [Abstract] The central claim that 'Our experiment demonstrates that the LA3D enables substantial improvement in privacy AN without severely degrading VAD efficacy, outperforming conventional and deep learning approaches' is presented with no supporting evidence. No privacy metric, VAD metric (e.g., AUC), datasets, baselines, numerical deltas, statistical tests, or description of the dynamic adjustment algorithm are supplied, rendering the tradeoff and superiority assertions unverifiable and unsupported.

Authors: We agree that the abstract as written is too high-level and does not include the requested quantitative details or method description, which limits verifiability from the abstract alone. The full manuscript contains the missing elements: privacy metrics (AN scores), VAD performance (AUC), specific public datasets, baseline comparisons (conventional and deep-learning methods), numerical deltas, and the dynamic adjustment algorithm. To strengthen the abstract, we will revise it to incorporate key results and a brief method outline while respecting length constraints.

Revision: yes

Circularity Check

0 steps flagged

No circularity; experimental claims rest on external dataset evaluations rather than self-referential definitions or fits.

The paper's central claim is an empirical assertion that LA3D improves privacy AN while retaining VAD efficacy and outperforming baselines, evaluated on public datasets. No equations, parameters fitted to target metrics, self-citations, uniqueness theorems, or ansatzes appear in the provided abstract. The derivation chain is therefore absent; the result is presented as the outcome of independent experiments rather than a quantity forced by construction from its own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract does not specify any free parameters, axioms, or invented entities underlying the proposed method.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Relation between the paper passage and the cited Recognition theorem: unclear.

Paper passage: "We propose a lightweight adaptive AN for VAD (LA3D) that employs dynamic adjustment to enhance full-body privacy protection... $r = \max\{\alpha_r \ln(100 \times \lVert m \rVert / \lVert I \rVert),\, 1\}$"

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear

Relation between the paper passage and the cited Recognition theorem: unclear.

Paper passage: "We employ the state-of-the-art WSAD VAD models, such as the prompt-enhanced learning for VAD (PEL4VAD)"
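
The first quoted passage above gives the adjustment rule r = max{αr ln(100 × ‖m‖/‖I‖), 1}. The excerpt does not define its symbols, so the sketch below rests on stated assumptions: ‖m‖ is read as the pixel area of a detected person mask, ‖I‖ as the pixel area of the whole frame, and r as a scale factor for the anonymization strength. The function name `adaptive_ratio` and the default `alpha_r = 0.5` (a value that appears in the Figure 4 caption) are illustrative choices, not the authors' code.

```python
import math

def adaptive_ratio(mask_area: float, image_area: float, alpha_r: float = 0.5) -> float:
    """Evaluate r = max(alpha_r * ln(100 * |m| / |I|), 1).

    Assumption: |m| is the person-mask area and |I| the frame area, in pixels.
    """
    if mask_area <= 0 or image_area <= 0:
        raise ValueError("areas must be positive")
    percent_of_frame = 100.0 * mask_area / image_area
    # The logarithm grows slowly with the mask's share of the frame;
    # the outer max keeps r from dropping below 1 for small (distant) persons.
    return max(alpha_r * math.log(percent_of_frame), 1.0)

# A distant person (small mask) versus a close-up person (large mask)
# in a 1280 x 960 frame.
far_person = adaptive_ratio(mask_area=50 * 100, image_area=1280 * 960)
near_person = adaptive_ratio(mask_area=600 * 900, image_area=1280 * 960)
```

Under this reading, masks covering less than about e^(1/αr) percent of the frame are clamped to r = 1, and larger masks receive a logarithmically increasing r, so close-up persons would be obfuscated more strongly than distant ones.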

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

[1] [1]

Towards a visual privacy advisor: understanding and predicting privacy risks in images,

T. Orekondy, B. Schiele, and M. Fritz, “Towards a visual privacy advisor: understanding and predicting privacy risks in images,” inInt. Conf. on Computer Vision. IEEE, 2017, pp. 3686–3695

work page 2017

[2] [2]

Balancing privacy and accuracy: exploring the impact of data anonymization on deep learning models in computer vision,

J. H. Lee and S. J. You, “Balancing privacy and accuracy: exploring the impact of data anonymization on deep learning models in computer vision,”IEEE Access, 2024

work page 2024

[3] [3]

Towards privacy-preserving visual recognition via adversarial training: A pilot study,

Z. Wu, Z. Wang, Z. Wang, and H. Jin, “Towards privacy-preserving visual recognition via adversarial training: A pilot study,” in European Conf. on Computer Vision. IEEE, 2018, pp. 606–624

work page 2018

[4] [4]

Privacy–enhancing face biometrics: a comprehensive survey,

B. Meden, P. Rot, P. Terhörst, N. Damer, A. Kuijper, W. J. Scheirer, A. Ross, P. Peer, and V . Štruc, “Privacy–enhancing face biometrics: a comprehensive survey,” IEEE Transactions on Information Forensics and Security , vol. 16, pp. 4147–4183, 2021

work page 2021

[5] [5]

TeD-SPAD: temporal distinctiveness for self-supervised privacy-preservation for video anomaly detection,

J. Fioresi, I. R. Dave, and M. Shah, “TeD-SPAD: temporal distinctiveness for self-supervised privacy-preservation for video anomaly detection,”arXiv preprint arXiv:2308.11072, 2023

work page arXiv 2023

[6] [6]

On the dangers of stochastic parrots: Can language models be too big?

E. M. Bender, T. Gebru, A. McMillan-Major, and S. Shmitchell, “On the dangers of stochastic parrots: Can language models be too big?” in Conf. on Fairness, Accountability, and Transparency. ACM, 2021, pp. 610–623

work page 2021

[7] [7]

SPAct: self-supervised privacy preservation for action recognition,

I. R. Dave, C. Chen, and M. Shah, “SPAct: self-supervised privacy preservation for action recognition,” inConf. on Computer Vision and Pattern Recognition. IEEE, 2022, pp. 20 164–20 173

work page 2022

[8] [8]

A user study on anonymization techniques for smart video surveillance,

P. Birnstill, D. Ren, and J. Beyerer, “A user study on anonymization techniques for smart video surveillance,” in 12th Int. Conf. on Advanced Video and Signal Based Surveillance. IEEE, 2015, pp. 1–6

work page 2015

[9] [9]

Learning to anonymize faces for privacy preserving action detection,

Z. Ren, Y . J. Lee, and M. S. Ryoo, “Learning to anonymize faces for privacy preserving action detection,” inEuropean Conf. on Computer Vision, 2018, pp. 620–636

work page 2018

[10] [10]

Real-time video anonymization in smart city intersections,

A. Angus, Z. Duan, G. Zussman, and Z. Kosti ´c, “Real-time video anonymization in smart city intersections,” in 19th Int. Conf. on Mobile Ad Hoc and Smart Systems. IEEE, 2022, pp. 514–522

work page 2022

[11] [11]

EgoBlur: responsible innovation in Aria,

N. Raina, G. Somasundaram, K. Zheng, S. Saarinen, J. Messiner, M. Schwesinger, L. Pesqueira, I. Prasad, E. Miller, P. Gupta et al., “EgoBlur: responsible innovation in Aria,”arXiv preprint arXiv:2308.13093, 2023

work page arXiv 2023

[12] [12]

I know that person: generative full body and face de-identification of people in images,

K. Brkic, I. Sikiric, T. Hrkac, and Z. Kalafatic, “I know that person: generative full body and face de-identification of people in images,” inConf. on Computer Vision and Pattern Recognition Workshops. IEEE, 2017, pp. 1319–1328

work page 2017

[13] [13]

Deepprivacy: a generative adversarial network for face anonymization,

H. Hukkelås, R. Mester, and F. Lindseth, “Deepprivacy: a generative adversarial network for face anonymization,” in Int. Symposium on Visual Computing. Springer, 2019, pp. 565–578

work page 2019

[14] [14]

Deep autoencoders for attribute preserving face de-identification,

P. Nousi, S. Papadopoulos, A. Tefas, and I. Pitas, “Deep autoencoders for attribute preserving face de-identification,” Signal Processing: Image Communication, vol. 81, p. 115699, 2020

work page 2020

[15] [15]

CIAGAN: conditional identity anonymization generative adversarial networks,

M. Maximov, I. Elezi, and L. Leal-Taixé, “CIAGAN: conditional identity anonymization generative adversarial networks,” in Conf. on Computer Vision and Pattern Recognition. IEEE, 2020, pp. 5447–5456

work page 2020

[16] [16]

DeepBlur: a simple and effective method for natural image obfuscation,

T. Li and M. S. Choi, “DeepBlur: a simple and effective method for natural image obfuscation,” arXiv preprint arXiv:2104.02655, vol. 1, p. 3, 2021

work page arXiv 2021

[17] [17]

A3GAN: attribute-aware anonymization networks for face de- identification,

L. Zhai, Q. Guo, X. Xie, L. Ma, Y . E. Wang, and Y . Liu, “A3GAN: attribute-aware anonymization networks for face de- identification,” in30th Int. Conf. on Multimedia. ACM, 2022, pp. 5303–5313

work page 2022

[18] [18]

MeDM: mediating image diffusion models for video-to-video translation with temporal correspondence guidance,

E. Chu, T. Huang, S.-Y . Lin, and J.-C. Chen, “MeDM: mediating image diffusion models for video-to-video translation with temporal correspondence guidance,”arXiv preprint arXiv:2308.10079, 2023

work page arXiv 2023

[19] [19]

Deepprivacy2: towards realistic full-body anonymization,

H. Hukkelås and F. Lindseth, “Deepprivacy2: towards realistic full-body anonymization,” in Conf. on Applications of Com- puter Vision. IEEE, 2023, pp. 1329–1338

work page 2023

[20] [20]

Prime: privacy-preserving video anomaly detection via motion exemplar guidance,

Y . Su, H. Zhu, Y . Tan, S. An, and M. Xing, “Prime: privacy-preserving video anomaly detection via motion exemplar guidance,”Knowledge-Based Systems, vol. 278, p. 110872, 2023

work page 2023

[21] [21]

Privacy-preserving deep action recognition: an adversarial learning frame- work and a new dataset,

Z. Wu, H. Wang, Z. Wang, H. Jin, and Z. Wang, “Privacy-preserving deep action recognition: an adversarial learning frame- work and a new dataset,” IEEE Transactions on Pattern Analysis and Machine Intelligence , vol. 44, no. 4, pp. 2126–2139, 2020

work page 2020

[22] [22]

Disguise without disruption: utility-preserving face de-identification,

Z. Cai, Z. Gao, B. Planche, M. Zheng, T. Chen, M. S. Asif, and Z. Wu, “Disguise without disruption: utility-preserving face de-identification,”arXiv preprint arXiv:2303.13269, 2023

work page arXiv 2023

[23] [23]

Personal privacy protection via irrelevant faces tracking and pixelation in video live streaming,

J. Zhou and C.-M. Pun, “Personal privacy protection via irrelevant faces tracking and pixelation in video live streaming,” IEEE Transactions on Information Forensics and Security, vol. 16, pp. 1088–1103, 2020

work page 2020

[24] [24]

Deep crowd anomaly detection: state-of-the-art, challenges, and future research directions,

M. H. Sharif, L. Jiao, and C. W. Omlin, “Deep crowd anomaly detection: state-of-the-art, challenges, and future research directions,”arXiv preprint arXiv:2210.13927, 2022

work page arXiv 2022

[25] [25]

Weakly-supervised video anomaly detection with robust temporal feature magnitude learning,

Y . Tian, G. Pang, Y . Chen, R. Singh, J. W. Verjans, and G. Carneiro, “Weakly-supervised video anomaly detection with robust temporal feature magnitude learning,” inInt. Conf. on Computer Vision. IEEE, 2021, pp. 4975–4986. 15 Low-Latency Video Anonymization for Crowd Anomaly Detection: Privacy vs. Performance

work page 2021

[26] [26]

Learning prompt-enhanced context features for weakly-supervised video anomaly detection,

Y. Pu, X. Wu, L. Yang, and S. Wang, “Learning prompt-enhanced context features for weakly-supervised video anomaly detection,” IEEE Transactions on Image Processing, 2024

work page 2024

[27] [27]

MGFN: magnitude-contrastive glance-and-focus network for weakly-supervised video anomaly detection,

Y. Chen, Z. Liu, B. Zhang, W. Fok, X. Qi, and Y.-C. Wu, “MGFN: magnitude-contrastive glance-and-focus network for weakly-supervised video anomaly detection,” in AAAI Conf. on Artificial Intelligence, vol. 37, no. 1, 2023, pp. 387–395

work page 2023

[28] [28]

CNN-ViT supported weakly-supervised video segment level anomaly detection,

M. H. Sharif, L. Jiao, and C. W. Omlin, “CNN-ViT supported weakly-supervised video segment level anomaly detection,” Sensors, vol. 23, no. 18, p. 7734, 2023

work page 2023

[29] [29]

Deep crowd anomaly detection by fusing reconstruction and prediction networks,

——, “Deep crowd anomaly detection by fusing reconstruction and prediction networks,” Electronics, vol. 12, no. 7, p. 1517, 2023

work page 2023

[30] [30]

Decouple and resolve: transformer-based models for online anomaly detection from weakly labeled videos,

T. Liu, C. Zhang, K.-M. Lam, and J. Kong, “Decouple and resolve: transformer-based models for online anomaly detection from weakly labeled videos,” IEEE Transactions on Information Forensics and Security, vol. 18, pp. 15–28, 2022

work page 2022

[31] [31]

CLIP-TSA: Clip-assisted temporal self-attention for weakly-supervised video anomaly detection,

H. K. Joo, K. Vo, K. Yamazaki, and N. Le, “CLIP-TSA: Clip-assisted temporal self-attention for weakly-supervised video anomaly detection,” in IEEE Int. Conf. on Image Processing. IEEE, 2023, pp. 3230–3234

work page 2023

[32] [32]

Learning transferable visual models from natural language supervision,

A. Radford, J. W. Kim, C. Hallacy, A. Ramesh, G. Goh, S. Agarwal, G. Sastry, A. Askell, P. Mishkin, J. Clark et al., “Learning transferable visual models from natural language supervision,” in Int. Conf. on Machine Learning. PMLR, 2021, pp. 8748–8763

work page 2021

[33] [33]

ConceptNet 5.5: an open multilingual graph of general knowledge,

R. Speer, J. Chin, and C. Havasi, “ConceptNet 5.5: an open multilingual graph of general knowledge,” in AAAI Conf. on Artificial Intelligence, vol. 31, no. 1, 2017

work page 2017

[34] [34]

You only look once: unified, real-time object detection,

J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, “You only look once: unified, real-time object detection,” in Conf. on Computer Vision and Pattern Recognition. IEEE, 2016, pp. 779–788

work page 2016

[35] [35]

Towards a visual privacy advisor: understanding and predicting privacy risks in images,

T. Orekondy, B. Schiele, and M. Fritz, “Towards a visual privacy advisor: understanding and predicting privacy risks in images,” in Int. Conf. on Computer Vision. IEEE, 2017

work page 2017

[36] [36]

Deep residual learning for image recognition,

K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Conf. on Computer Vision and Pattern Recognition. IEEE, 2016, pp. 770–778

work page 2016

[37] [37]

Delving deep into rectifiers: surpassing human-level performance on ImageNet classification,

K. He, X. Zhang, S. Ren, and J. Sun, “Delving deep into rectifiers: surpassing human-level performance on ImageNet classification,” in Int. Conf. on Computer Vision. IEEE, 2015, pp. 1026–1034

work page 2015

[38] [38]

Real-world anomaly detection in surveillance videos,

W. Sultani, C. Chen, and M. Shah, “Real-world anomaly detection in surveillance videos,” in Conf. on Computer Vision and Pattern Recognition. IEEE, 2018, pp. 6479–6488

work page 2018

[39] [39]

3D convolutional neural networks for human action recognition,

S. Ji, W. Xu, M. Yang, and K. Yu, “3D convolutional neural networks for human action recognition,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 1, pp. 221–231, 2012

work page 2012

[40] [40]

Quo vadis, action recognition? a new model and the kinetics dataset,

J. Carreira and A. Zisserman, “Quo vadis, action recognition? a new model and the kinetics dataset,” in Conf. on Computer Vision and Pattern Recognition. IEEE, 2017, pp. 6299–6308

work page 2017

[41] [41]

The Kinetics Human Action Video Dataset

W. Kay, J. Carreira, K. Simonyan, B. Zhang, C. Hillier, S. Vijayanarasimhan, F. Viola, T. Green, T. Back, P. Natsev et al., “The kinetics human action video dataset,” arXiv preprint arXiv:1705.06950, 2017

work page arXiv 2017

[42] [42]

A comprehensive study of deep video action recognition,

Y. Zhu, X. Li, C. Liu, M. Zolfaghari, Y. Xiong, C. Wu, Z. Zhang, J. Tighe, R. Manmatha, and M. Li, “A comprehensive study of deep video action recognition,” arXiv preprint arXiv:2012.06567, 2020

work page arXiv 2020

[43] [43]

A computational approach to edge detection,

J. Canny, “A computational approach to edge detection,” IEEE Transactions on Pattern Analysis and Machine Intelligence, no. 6, pp. 679–698, 1986

work page 1986

[44] [44]

Understanding visual privacy protection: A generalized framework with an instance on facial privacy,

Y. Zhang, J. Ji, W. Wen, Y. Zhu, Z. Xia, and J. Weng, “Understanding visual privacy protection: A generalized framework with an instance on facial privacy,” IEEE Transactions on Information Forensics and Security, 2024

work page 2024

[45] [45]

Scalable person re-identification: a benchmark,

L. Zheng, L. Shen, L. Tian, S. Wang, J. Wang, and Q. Tian, “Scalable person re-identification: a benchmark,” in Int. Conf. on Computer Vision. IEEE, 2015, pp. 1116–1124

work page 2015

[46] [46]

Learning generalisable omni-scale representations for person re-identification,

K. Zhou, Y. Yang, A. Cavallaro, and T. Xiang, “Learning generalisable omni-scale representations for person re-identification,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 44, no. 9, pp. 5056–5069, 2021

work page 2021