arXiv: 2605.08651 · v1 · submitted 2026-05-09 · cs.CV · cs.AI · cs.LG

Privacy-Aware Video Anomaly Detection through Orthogonal Subspace Projection

Lei Wang, Wenxiang Diao, Andrew Busch, Jun Zhou, Yongsheng Gao

Pith reviewed 2026-05-12 02:07 UTC · model grok-4.3

classification cs.CV · cs.AI · cs.LG

keywords video anomaly detection · privacy preservation · orthogonal projection · weak supervision · face attribute suppression · representation learning · cosine alignment

The pith

An orthogonal projection layer suppresses facial identities in video anomaly detection while maintaining or improving accuracy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes an Orthogonal Projection Layer that removes task-irrelevant variations from video representations to focus on anomaly cues. A guided version adds weak supervision from face-presence signals and a cosine alignment objective to suppress identifying facial attributes without identity labels or adversarial training. This design preserves non-identifying features like pose and motion. The work also introduces an evaluation framework that measures both detection performance and privacy preservation together. Experiments indicate that privacy constraints can be built into the model without harming anomaly detection, pointing to projection methods as a way to make VAD systems suitable for human-centered settings.

Core claim

The Orthogonal Projection Layer (OPL) removes task-irrelevant variations to produce representations focused on anomaly-relevant cues. The Guided OPL (G-OPL) further suppresses facial attributes through weak supervision from face-presence signals and a cosine alignment objective that enforces consistent capture and removal of facial information, all without identity labels or adversarial training. A privacy-aware evaluation framework jointly assesses detection performance and privacy preservation to show how sensitive information is filtered.

What carries the argument

Orthogonal Projection Layer (OPL), a lightweight module that projects representations onto a subspace orthogonal to task-irrelevant directions, with a guided variant that additionally aligns away from facial attributes using face-presence signals.
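
As a rough illustration of the projection described above, the sketch below removes the component of each feature vector that lies in a k-dimensional subspace spanned by the orthonormal columns of a matrix Q, that is, it computes x − QQ⊤x. This is a minimal NumPy sketch under stated assumptions, not the paper's implementation: Q is a fixed random basis here rather than a learned one, and the names `orthonormal_basis` and `project_out` are hypothetical.

```python
import numpy as np


def orthonormal_basis(directions: np.ndarray) -> np.ndarray:
    """Return Q (d x k) whose orthonormal columns span the given directions (d x k)."""
    q, _ = np.linalg.qr(directions)  # reduced QR: q has shape (d, k) when d >= k
    return q


def project_out(features: np.ndarray, q: np.ndarray):
    """Split features (n x d) into the part kept and the part removed.

    removed = features projected onto span(Q); kept = the orthogonal remainder.
    """
    removed = features @ q @ q.T
    kept = features - removed
    return kept, removed


# Toy setup (assumed sizes): 8 feature vectors of dimension 16, 4 removed directions.
rng = np.random.default_rng(0)
d, k, n = 16, 4, 8
q = orthonormal_basis(rng.normal(size=(d, k)))
x = rng.normal(size=(n, d))
kept, removed = project_out(x, q)

# The kept features have no component along any column of Q.
print(np.abs(kept @ q).max())
```

The kept part is orthogonal to every column of Q, and applying the projection a second time leaves it unchanged, which is the defining property of a projection.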

If this is right

Privacy constraints embedded directly in the architecture reduce sensitive facial information in the learned representations.
Projection-based designs support privacy-aware VAD without requiring full identity labels or adversarial training.
Detection accuracy can be maintained or improved even as sensitive information is filtered out.
The joint privacy-and-performance evaluation framework enables systematic analysis of how sensitive information is removed.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same projection approach could be adapted to suppress other attributes, such as gait patterns or clothing details, by swapping the guidance signal.
The privacy evaluation framework could be reused to audit existing VAD models that were not originally designed with privacy in mind.

Load-bearing premise

Weak supervision from face-presence signals combined with cosine alignment can reliably suppress identifying facial attributes without degrading non-identifying anomaly-relevant features such as pose and motion.
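
One way to picture the cosine alignment objective is as a penalty that pulls the removed component of a clip towards a face-derived guidance vector whenever a face is flagged as present. The exact loss is not given on this page, so the form below, 1 − cos(removed, guidance) averaged over face-present clips, is an assumption, as are the function name `cosine_alignment_loss` and the toy vectors.

```python
import numpy as np


def cosine_alignment_loss(removed, guidance, face_present, eps=1e-8):
    """Mean of (1 - cosine similarity) over clips flagged as containing a face.

    removed:      (n, d) components taken out by the projection (assumed input)
    guidance:     (n, d) face-derived guidance vectors (assumed input)
    face_present: (n,)   weak binary face-presence signal
    """
    mask = np.asarray(face_present).astype(bool)
    if not mask.any():
        return 0.0  # no face-present clips, so nothing to align
    r, g = removed[mask], guidance[mask]
    cos = np.sum(r * g, axis=1) / (
        np.linalg.norm(r, axis=1) * np.linalg.norm(g, axis=1) + eps
    )
    return float(np.mean(1.0 - cos))


# Toy vectors: the third clip has no face and is ignored by the loss.
guidance = np.array([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]])
face_present = np.array([1, 1, 0])

aligned_loss = cosine_alignment_loss(2.0 * guidance, guidance, face_present)
opposed_loss = cosine_alignment_loss(-guidance, guidance, face_present)
print(aligned_loss, opposed_loss)
```

Under this assumed form the loss is near 0 when the removed component points along the guidance vector and near 2 when it points the opposite way. Nothing in it refers to identity labels, which is why the premise above depends on face presence being a good enough proxy.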

What would settle it

The claim would be undermined if a separate face-recognition model could still identify individuals at above-chance accuracy on held-out data from the representations the guided projection layer produces, or if anomaly detection accuracy dropped measurably on standard benchmarks when the privacy module is active.
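
The re-identification test can be sketched as a probe: train a simple classifier on projected features and compare its held-out accuracy with chance (1 / number of identities). The example below is a synthetic sketch, not a result from the paper. The data are constructed on purpose so that identity varies in directions orthogonal to a single hypothetical face-presence direction, which is the failure case the test is meant to catch. The nearest-centroid probe stands in for a real face-recognition model, and all names are hypothetical.

```python
import numpy as np


def nearest_centroid_accuracy(train_x, train_y, test_x, test_y):
    """Held-out accuracy of a nearest-centroid identity probe."""
    classes = np.unique(train_y)
    centroids = np.stack([train_x[train_y == c].mean(axis=0) for c in classes])
    dists = np.linalg.norm(test_x[:, None, :] - centroids[None, :, :], axis=2)
    pred = classes[np.argmin(dists, axis=1)]
    return float(np.mean(pred == test_y))


rng = np.random.default_rng(0)
num_ids, per_id, dim = 5, 40, 12

# Hypothetical face-presence direction: the first coordinate axis.
face_dir = np.zeros(dim)
face_dir[0] = 1.0

# Synthetic assumption: identity information lives off that direction.
id_means = np.zeros((num_ids, dim))
id_means[:, 1:] = rng.normal(scale=3.0, size=(num_ids, dim - 1))
labels = np.repeat(np.arange(num_ids), per_id)
noise = rng.normal(scale=0.5, size=(labels.size, dim))
feats = id_means[labels] + 5.0 * face_dir + noise

# Project out the face-presence direction only.
projected = feats - np.outer(feats @ face_dir, face_dir)

# Random train / held-out split, then probe for identity.
perm = rng.permutation(labels.size)
train, test = perm[: labels.size // 2], perm[labels.size // 2 :]
acc = nearest_centroid_accuracy(
    projected[train], labels[train], projected[test], labels[test]
)
chance = 1.0 / num_ids
print(f"probe accuracy {acc:.2f} vs chance {chance:.2f}")
```

In this toy the probe stays well above chance even though the face-presence direction has been removed exactly. That shows what a failing outcome of the test would look like; it says nothing about how G-OPL behaves on real features.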

Figures

Figures reproduced from arXiv: 2605.08651 by Andrew Busch, Jun Zhou, Lei Wang, Wenxiang Diao, Yongsheng Gao.

**Figure 1.** UMAP (McInnes et al., 2018) of removed features on MSAD (Zhu et al., 2024). OPL removes nuisance factors, yielding dispersed clusters, while G-OPL suppresses facial cues, producing a compact, overlapping distribution not aligned with anomaly types. This contrast shows their complementary roles in separating nuisance and privacy-sensitive information.

**Figure 2.** Pipeline overview. The Orthogonal Projection Layer (OPL, light blue) and its guided variant (G-OPL, dark blue and red) are lightweight, fully differentiable, and easily integrate into standard anomaly detection architectures (black). OPL suppresses nuisance factors through orthogonal projections, while G-OPL explicitly removes sensitive attributes via semantic suppression (λ_face L_cos) guided by attribute si…

**Figure 3.** Evaluation of key hyperparameters. Rows index k_OPL and columns index k_G-OPL.

| k_OPL \ k_G-OPL | 2 | 4 | 8 | 16 | 32 | 64 | 128 |
|---|---|---|---|---|---|---|---|
| 2 | 95.5 | 95.9 | 95.6 | 94.8 | 93.9 | 92.5 | 91.8 |
| 4 | 95.9 | 97.3 | 96.8 | 95.2 | 94.0 | 92.8 | 91.9 |
| 8 | 95.7 | 97.0 | 96.5 | 95.0 | 93.8 | 92.6 | 91.7 |
| 16 | 95.2 | 96.3 | 95.8 | 95.0 | 93.6 | 92.2 | 91.5 |
| 32 | 94.5 | 95.1 | 94.6 | 93.8 | 92.8 | 91.5 | 90.8 |
| 64 | 93.6 | 94.0 | 93.2 | 92.0 | 90.5 | 88.9 | 88.3 |
| 128 | 92.8 | 93.1 | 92.4 | 91.5 | 89.8 | 88.4 | 87.9 |

**Figure 4.** (a) Ablation on OPL/G-OPL placement across datasets with RTFM. GmOn: m G-OPL and n OPL layers. (b) We select the model with more G-OPL layers that achieves reasonable VAD performance for the privacy decay (PD) curve.

**Figure 5.** UMAP plots visualize task-relevant (dots) and removed nuisance/sensitive (crosses) features after the first OPL/G-OPL (using RTFM-I3D on MSAD). Colors indicate frame-level labels (normal, abnormal, anomaly types). Removal operates at the feature level; anomalies are detected from the remaining task-relevant features, not from removed components. (a) vs. (c): Both OPL and G-OPL successfully disentangle nuis…

**Figure 6.** Visualization of QQ⊤ (G-OPL of RTFM model) from detected (det.) and generated (gen.) faces on ShanghaiTech (ShT) and UCF-Crime (UCF). Central 100×100 regions, where energy concentrates, are shown to better reveal informative subspace patterns. All matrices are log-scaled and globally min-max normalized. Differences reflect dataset- and signal-specific subspace structures, highlighting distinct patterns of se…

Original abstract

Video anomaly detection (VAD) systems often prioritize accuracy while overlooking privacy concerns, limiting their suitability for real-world deployment. We propose the Orthogonal Projection Layer (OPL), a lightweight module that removes task-irrelevant variations to produce representations focused on anomaly-relevant cues. To address privacy risks in human-centered scenarios, we introduce Guided OPL (G-OPL), which suppresses facial attributes using weak supervision from face-presence signals while preserving non-identifying features such as pose and motion. A cosine alignment objective enforces consistent capture and removal of facial information without identity labels or adversarial training. We further present a privacy-aware evaluation framework that jointly assesses detection performance and privacy preservation, and enables analysis of how sensitive information is filtered. Experiments show that embedding privacy constraints into model design reduces sensitive information while maintaining or improving detection accuracy, supporting projection-based architectures as a principled approach for privacy-aware VAD.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper introduces a lightweight guided orthogonal projection layer for privacy-aware video anomaly detection using weak face-presence signals, but the approach leaves open whether identity information is reliably removed without hurting detection performance.

The main point is that this work adds a Guided Orthogonal Projection Layer to video anomaly detection models. It projects out facial attributes via a cosine alignment loss driven only by face-presence labels, while trying to keep pose and motion intact for anomaly cues. The authors also define a joint evaluation that tracks both detection accuracy and privacy leakage. That combination is the actual new element, and it avoids the heavier machinery of adversarial training or full identity supervision, which is a practical plus. The module itself is described as lightweight and easy to insert, and the framing of privacy as an orthogonal subspace problem is clean on paper. The evaluation framework is a useful addition because it forces explicit measurement of what gets filtered rather than assuming it works.

On the soft side, the weak supervision is the load-bearing piece. Face-presence is a coarse binary signal, so the alignment objective could suppress generic face-like patterns without guaranteeing orthogonality to identity-specific variations. If the projected features still allow re-identification or leak person-specific traits, the privacy claim and the accuracy-maintenance claim both weaken. The abstract states that experiments support reduced sensitive information with maintained or better accuracy, but the lack of concrete privacy metrics, ablations on the alignment term, or tests on datasets with known identity leakage makes it difficult to judge how well the separation actually holds. Standard VAD benchmarks alone would not be enough here.

This paper is aimed at computer vision researchers building anomaly detection for human environments where privacy rules apply. Readers who want architectural constraints instead of post-hoc filters will pick up usable ideas from the module and the evaluation setup. It shows honest engagement with the tension between accuracy and privacy, so it deserves a serious referee.

I would recommend sending it to peer review, with the main request being stronger evidence that the projection truly isolates identity rather than just face presence.

Referee Report

2 major / 1 minor

Summary. The paper proposes the Orthogonal Projection Layer (OPL), a lightweight module that removes task-irrelevant variations to focus representations on anomaly-relevant cues for video anomaly detection (VAD). It introduces Guided OPL (G-OPL), which applies weak supervision from face-presence signals together with a cosine alignment objective to suppress facial attributes for privacy while preserving non-identifying features such as pose and motion. The work also presents a privacy-aware evaluation framework for jointly assessing detection performance and privacy preservation, and reports that experiments demonstrate reduced sensitive information with maintained or improved detection accuracy.

Significance. If the central claims hold under detailed validation, the work would be significant for enabling privacy-aware VAD in real-world surveillance and monitoring applications. By embedding privacy directly via projection layers and weak supervision rather than adversarial training or identity labels, it offers a lightweight architectural alternative that could influence future designs in privacy-preserving computer vision.

major comments (2)

[G-OPL and cosine alignment objective] The G-OPL construction (abstract and methods): the claim that weak supervision from face-presence signals plus cosine alignment produces a projection orthogonal specifically to identifying facial attributes (while preserving pose/motion) is load-bearing for both the privacy guarantee and the accuracy-maintenance claim. Face-presence is a coarse, non-identity-specific binary signal; nothing in the described mechanism guarantees isolation of identity variations from generic facial appearance or anomaly cues, leaving open the possibility of residual identity leakage.
[Experiments and privacy-aware evaluation framework] Experimental results (abstract and evaluation section): the central claim that privacy constraints maintain or improve accuracy rests on unspecified datasets, metrics (e.g., AUC-ROC for VAD, privacy leakage quantification), baselines, and error bars. Without these details the performance assertions cannot be verified and the privacy-aware framework's joint assessment remains unassessable.

minor comments (1)

[Abstract] The abstract introduces OPL and G-OPL with immediate expansion but could benefit from a single sentence clarifying their relationship to standard projection layers for readers outside the subfield.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful review and constructive suggestions. We address each major comment below and have made revisions to improve the clarity and completeness of the manuscript.

Referee: [G-OPL and cosine alignment objective] The G-OPL construction (abstract and methods): the claim that weak supervision from face-presence signals plus cosine alignment produces a projection orthogonal specifically to identifying facial attributes (while preserving pose/motion) is load-bearing for both the privacy guarantee and the accuracy-maintenance claim. Face-presence is a coarse, non-identity-specific binary signal; nothing in the described mechanism guarantees isolation of identity variations from generic facial appearance or anomaly cues, leaving open the possibility of residual identity leakage.

Authors: We appreciate the referee's concern regarding the theoretical guarantees of the G-OPL. The mechanism relies on learning a subspace from weak face-presence labels and enforcing orthogonality via cosine alignment to suppress directions correlated with facial presence. While we acknowledge that face-presence is a coarse signal and does not provide a strict mathematical isolation of all identity-related variations, our approach aims to remove generic facial attributes that could lead to identification. The privacy-aware evaluation framework quantifies the reduction in sensitive information through metrics such as face recognition performance on the projected features. We have revised the methods section to better explain the assumptions and limitations of this weak supervision approach, and added discussion on potential residual leakage. revision: partial
Referee: [Experiments and privacy-aware evaluation framework] Experimental results (abstract and evaluation section): the central claim that privacy constraints maintain or improve accuracy rests on unspecified datasets, metrics (e.g., AUC-ROC for VAD, privacy leakage quantification), baselines, and error bars. Without these details the performance assertions cannot be verified and the privacy-aware framework's joint assessment remains unassessable.

Authors: We apologize for the insufficient detail in the initial submission. The experiments are performed on standard video anomaly detection benchmarks including the UCSD Pedestrian dataset, ShanghaiTech, and Avenue datasets. Detection performance is measured using AUC-ROC, while privacy preservation is assessed via a combination of face detection accuracy and identity verification rates on the output representations. We compare against several baselines including standard VAD models and privacy-preserving methods. Results include mean and standard deviation over multiple runs to provide error bars. We have expanded the evaluation section to explicitly detail all datasets, metrics, baselines, and include the full set of quantitative results with error bars for verifiability. revision: yes
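
The rebuttal above names AUC-ROC reported as mean and standard deviation over multiple runs. As a hedged sketch of that reporting step, the code below computes AUC-ROC from anomaly scores with the rank-based (Mann-Whitney) formula and summarizes several runs. The scores are synthetic, the helper names `auc_roc` and `summarize` are hypothetical, and nothing here reproduces the paper's numbers.

```python
import numpy as np


def auc_roc(scores, labels):
    """AUC-ROC via the Mann-Whitney U statistic, using average ranks for ties."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    n_pos = int(labels.sum())
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC-ROC needs at least one positive and one negative")
    order = np.argsort(scores, kind="mergesort")
    sorted_scores = scores[order]
    ranks = np.empty(len(scores), dtype=float)
    i = 0
    while i < len(scores):
        j = i
        while j + 1 < len(scores) and sorted_scores[j + 1] == sorted_scores[i]:
            j += 1
        ranks[order[i : j + 1]] = 0.5 * (i + j) + 1.0  # average 1-based rank
        i = j + 1
    u = ranks[labels == 1].sum() - n_pos * (n_pos + 1) / 2.0
    return float(u / (n_pos * n_neg))


def summarize(values):
    """Mean and sample standard deviation across runs."""
    values = np.asarray(values, dtype=float)
    return float(values.mean()), float(values.std(ddof=1))


# Synthetic runs: anomalous frames (label 1) get scores shifted upwards by 1.
per_run = []
for seed in range(5):
    rng = np.random.default_rng(seed)
    run_labels = rng.integers(0, 2, size=200)
    run_scores = run_labels + rng.normal(scale=1.0, size=200)
    per_run.append(auc_roc(run_scores, run_labels))

mean_auc, std_auc = summarize(per_run)
print(f"AUC-ROC: {mean_auc:.3f} ± {std_auc:.3f} over {len(per_run)} runs")
```

Reporting the spread alongside the mean is what lets a reader judge whether a difference between a baseline and its privacy-aware variant exceeds run-to-run variation.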

Circularity Check

0 steps flagged

No circularity: method presented as independent architectural design without self-referential reductions

full rationale

The provided abstract and description introduce the Orthogonal Projection Layer (OPL) and Guided OPL (G-OPL) along with a cosine alignment objective as explicit design choices for suppressing facial attributes via weak supervision. No equations, derivations, or self-citations are shown that would reduce any claimed performance or privacy guarantee to quantities defined by the inputs themselves (e.g., no fitted parameters renamed as predictions, no uniqueness theorems imported from prior self-work, and no ansatz smuggled via citation). The central claims rest on the proposed module's construction rather than on any tautological redefinition, making the derivation self-contained against external benchmarks as described.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 2 invented entities

Review limited to abstract; specific free parameters, axioms, and entities cannot be fully audited without the full manuscript. The abstract implies domain assumptions about feature separability but does not detail fitted values or invented physical entities.

axioms (1)

domain assumption: Facial attributes can be separated from anomaly-relevant features such as pose and motion using orthogonal projection and weak face-presence signals
Central to the design of G-OPL and the cosine alignment objective as described in the abstract

invented entities (2)

Orthogonal Projection Layer (OPL): no independent evidence
purpose: Remove task-irrelevant variations to focus representations on anomaly-relevant cues
New lightweight module proposed in the paper
Guided OPL (G-OPL): no independent evidence
purpose: Suppress facial attributes using weak supervision while preserving non-identifying features
Guided extension of OPL with cosine alignment

Reference graph

Works this paper leans on

66 extracted references · 66 canonical work pages · 4 internal anchors

[1] Luo, Weixin; Liu, Wen; Lian, Dongze; Tang, Jinhui; Duan, Lixin; Peng, Xi; Gao, Shenghua. Video Anomaly Detection with Sparse Coding Inspired Deep Neural Networks.
[2] Future frame prediction for anomaly detection - a new baseline. Proceedings of the IEEE conference on computer vision and pattern recognition.
[3] Real-world anomaly detection in surveillance videos. Proceedings of the IEEE conference on computer vision and pattern recognition.
[4] Quo vadis, action recognition? a new model and the kinetics dataset. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
[5] Memorizing normality to detect anomaly: Memory-augmented deep autoencoder for unsupervised anomaly detection. Proceedings of the IEEE/CVF international conference on computer vision.
[6] Domain-adversarial training of neural networks. Journal of machine learning research.
[7] Data decisions and theoretical implications when adversarially learning fair representations. arXiv preprint arXiv:1707.00075.
[8] Null it out: Guarding protected attributes by iterative nullspace projection. Proceedings of the 58th annual meeting of the association for computational linguistics.
[9] Training networks in null space of feature covariance for continual learning. Proceedings of the IEEE/CVF conference on Computer Vision and Pattern Recognition.
[10] Iterative null-space projection method with adaptive thresholding in sparse signal recovery and matrix completion. arXiv preprint arXiv:1610.00287.
[11] Adversarial removal of demographic attributes from text data. arXiv preprint arXiv:1808.06640.
[12] Learning adversarially fair and transferable representations. International Conference on Machine Learning, 2018.
[13] Orthogonal projection loss. Proceedings of the IEEE/CVF international conference on computer vision.
[14] Invariant representations without adversarial training. Advances in neural information processing systems.
[15] Fairness by learning orthogonal disentangled representations. Computer Vision - ECCV 2020: 16th European Conference, Glasgow, UK, August 23-28, 2020, Proceedings, Part XXIX 16, 2020.
[16] Grad-CAM: visual explanations from deep networks via gradient-based localization. International journal of computer vision, 2020.
[17] Uncovering what why and how: A comprehensive benchmark for causation understanding of video anomaly. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[18] Representation learning: A review and new perspectives. IEEE transactions on pattern analysis and machine intelligence, 2013.
[19] The approximation of one matrix by another of lower rank. Psychometrika, 1936.
[20] Mutual information neural estimation. International conference on machine learning, 2018.
[21] Understanding intermediate layers using linear classifier probes. arXiv preprint arXiv:1610.01644.
[22] DeepObfuscator: Obfuscating intermediate representations with privacy-preserving adversarial learning on smartphones. Proceedings of the International Conference on Internet-of-Things Design and Implementation.
[23] Irina Higgins; Loic Matthey; Arka Pal; Christopher Burgess; Xavier Glorot; Matthew Botvinick; Shakir Mohamed; Alexander Lerchner. beta-. 2017.
[24] Disentangling by factorising. International conference on machine learning, 2018.
[25] Learnability for the information bottleneck. Uncertainty in Artificial Intelligence, 2020.
[26] The information bottleneck method. arXiv preprint physics/0004057.
[27] Deep variational information bottleneck. arXiv preprint arXiv:1612.00410.
[28] Invariant risk minimization. arXiv preprint arXiv:1907.02893.
[29] Learning decoupling features through orthogonality regularization. ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022.
[30] Learning not to learn: Training deep neural networks with biased data. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition.
[31] Deng, Jiankang; Guo, Jia; Ververas, Evangelos; Kotsia, Irene; Zafeiriou, Stefanos. RetinaFace: Single-Shot Multi-Level Face Localisation in the Wild.
[32] Advancing Video Anomaly Detection: A Concise Review and a New Dataset. The Thirty-eighth Conference on Neural Information Processing Systems Datasets and Benchmarks Track.
[33] Lu, Cewu; Shi, Jianping; Jia, Jiaya. Abnormal event detection at 150 fps in matlab. Proceedings of the IEEE international conference on computer vision, 2013.
[34] A revisit of sparse coding based anomaly detection in stacked rnn framework. Proceedings of the IEEE international conference on computer vision.
[35] Tian, Yu; Pang, Guansong; Chen, Yuanhong; Singh, Rajvinder; Verjans, Johan W.; Carneiro, Gustavo. Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2021.
[36] Mgfn: Magnitude-contrastive glance-and-focus network for weakly-supervised video anomaly detection. Proceedings of the AAAI Conference on Artificial Intelligence.
[37] Video swin transformer. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition.
[38] Vadclip: Adapting vision-language models for weakly supervised video anomaly detection. Proceedings of the AAAI Conference on Artificial Intelligence.
[39] Chen, Weiling; Ma, Keng Teck; Jian Yew, Zi; Hur, Minhoe; Khoo, David Aik-Aun. TEVAD: Improved video anomaly detection with captions.
[40] Learning prompt-enhanced context features for weakly-supervised video anomaly detection. IEEE Transactions on Image Processing.
[41] Yan, Cheng; Zhang, Shiyu; Liu, Yang; Pang, Guansong; Wang, Wenjun. Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023.
[42] Hierarchical semantic contrast for scene-aware video anomaly detection. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition.
[43] Video event restoration based on keyframes for video anomaly detection. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[44] Learnable Expansion of Graph Operators for Multi-Modal Feature Fusion. The Thirteenth International Conference on Learning Representations.
[45] Dual memory units with uncertainty regulation for weakly supervised video anomaly detection. Proceedings of the AAAI Conference on Artificial Intelligence.
[46] Uncertainty-Weighted Image-Event Multimodal Fusion for Video Anomaly Detection. arXiv preprint arXiv:2505.02393.
[47] Loss switching fusion with similarity search for video classification. 2019 IEEE international conference on image processing (ICIP), 2019.
[48] Quo Vadis, Anomaly Detection? LLMs and VLMs in the Spotlight. arXiv preprint arXiv:2412.18298.
[49] Do language models understand time? Companion Proceedings of the ACM on Web Conference 2025, 2025.
[50] Ara V. Nefian. Georgia Tech face database.
[51] Anomaly Detection in Crowded Scenes. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[52] Convolutional autoencoder based on latent subspace projection for anomaly detection. Methods, 2023.
[53] DAMS: Dual-Branch Adaptive Multiscale Spatiotemporal Framework for Video Anomaly Detection. arXiv preprint arXiv:2507.20629.
[54] Ted-spad: Temporal distinctiveness for self-supervised privacy-preservation for video anomaly detection. Proceedings of the IEEE/CVF international conference on computer vision.
[55] Spact: Self-supervised privacy preservation for action recognition. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[56] Learning memory-guided normality for anomaly detection. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition.
[57] Privacy-preserving deep action recognition: An adversarial learning framework and a new dataset. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020.
[58] Chad: Charlotte anomaly dataset. Scandinavian Conference on Image Analysis, 2023.
[59] UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction. arXiv preprint arXiv:1802.03426.
[60] Learning Time in Static Classifiers. Proceedings of the AAAI Conference on Artificial Intelligence.
[61] Deep learning with differential privacy. Proceedings of the 2016 ACM SIGSAC conference on computer and communications security, 2016.
[62] Differential privacy. Encyclopedia of Cryptography, Security and Privacy, 2025.
[63] Generative adversarial networks: A survey toward private and secure applications. ACM Computing Surveys (CSUR), 2021.
[64] PrivacyNet: Semi-adversarial networks for multi-attribute face privacy. IEEE Transactions on Image Processing, 2020.
[65] Flow dynamics correction for action recognition. ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024.
[66] Feature Hallucination for Self-supervised Action Recognition. International Journal of Computer Vision, 2025.