{"total":114,"items":[{"citing_arxiv_id":"2606.25239","ref_index":28,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Tensor-Based Batch Fuzzing with Adaptive Perturbation Scaling for Deep Neural Networks","primary_cat":"cs.SE","submitted_at":"2026-06-23T23:56:21+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"A tensor-based batch fuzzing framework with adaptive perturbation scaling from specification ranges achieves up to 40X higher throughput and 4X more detected violations than sequential baselines on DNN benchmarks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.25045","ref_index":78,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Machine Learning Approaches for Improved Scalability of Metallic Magnetic Calorimeters","primary_cat":"physics.ins-det","submitted_at":"2026-06-23T18:02:25+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":3.0,"formal_verification":"none","one_line_summary":"Machine learning methods are explored for pulse classification, artifact rejection, and shape analysis in metallic magnetic calorimeters to improve scalability over traditional signal processing.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.00828","ref_index":21,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"RoboStressBench: Benchmarking VLM Robustness to Physical Visual Stress in Embodied Scenes","primary_cat":"cs.CV","submitted_at":"2026-05-30T17:55:10+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"RoboStressBench decomposes visual stress into four physically grounded dimensions to benchmark VLM robustness in embodied scenes and proposes a stress-aware 
solver.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.00738","ref_index":4,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"SORA: Free Second-Order Attacks in Fast Adversarial Training","primary_cat":"cs.LG","submitted_at":"2026-05-30T14:10:32+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"SORA is an adaptive step-size adversarial training algorithm that formalizes epsilon overfitting, introduces the PertAlign metric to predict catastrophic overfitting, and dynamically adjusts perturbations to achieve state-of-the-art robustness and clean accuracy with fixed hyperparameters.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.30531","ref_index":46,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Benchmarking Bilevel Derivative-Free Optimization Algorithms","primary_cat":"math.OC","submitted_at":"2026-05-28T20:08:13+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Introduces a refereeing procedure and full computational cost accounting to improve benchmarking fairness for bilevel derivative-free optimization algorithms.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.27836","ref_index":8,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Symmetry Defeats Auditing","primary_cat":"cs.CR","submitted_at":"2026-05-27T01:47:18+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"Symmetry enables an attack that defeats introspection adapters for auditing AI 
systems.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.22644","ref_index":101,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Why SGD is not Brownian Motion: A New Perspective on Stochastic Dynamics","primary_cat":"cs.LG","submitted_at":"2026-05-21T15:50:40+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"SGD is reformulated via a master equation from discrete updates, producing a discrete Fokker-Planck equation that predicts non-stationary variance growth proportional to learning rate in flat Hessian directions.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.20519","ref_index":46,"ref_count":2,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Codec-Robust Attacks on Audio LLMs","primary_cat":"cs.SD","submitted_at":"2026-05-19T21:39:52+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"CodecAttack perturbs audio in codec latent space with multi-bitrate EoT to achieve 85.5% average ASR on Opus-compressed Audio LLMs versus under 26% for waveform baselines, with transfer to MP3 and AAC.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.19678","ref_index":44,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"RoVLA: Multi-Consistency Constraints for Robust Vision-Language-Action Models","primary_cat":"cs.RO","submitted_at":"2026-05-19T11:10:20+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"RoVLA enforces instructional, evolutionary, and observational consistency to improve robustness of VLA policies on manipulation benchmarks and real 
robots.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.19392","ref_index":176,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Understanding Dynamics of Adam in Zero-Sum Games: An ODE Approach","primary_cat":"cs.LG","submitted_at":"2026-05-19T05:38:47+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Derives ODE limits of Adam-DA showing that first- and second-order momentum parameters reverse their convergence roles in zero-sum games compared to minimization, validated on GAN experiments.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.19035","ref_index":38,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Trustworthy Agent Network: Trust in Agent Networks Must Be Baked In, Not Bolted On","primary_cat":"cs.AI","submitted_at":"2026-05-18T18:57:54+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"Argues that trustworthiness in Agent-to-Agent networks requires a new conceptual framework with four design pillars baked in from the beginning, as retrofitting existing single-agent methods is insufficient.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.19032","ref_index":39,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Personalized Face Privacy Protection From a Single Image","primary_cat":"cs.CV","submitted_at":"2026-05-18T18:56:56+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"FaceCloak learns a lightweight identity-specific cloaking mask from a single image via synthetic face generation and iterative embedding perturbation to evade multiple recognition 
models.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.18666","ref_index":3,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"A No-Defense Defense Against Gradient-Based Adversarial Attacks on ML-NIDS: Is Less More?","primary_cat":"cs.LG","submitted_at":"2026-05-18T17:10:08+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"Experiments with around 2200 variations show that shallower networks with reduced features and ReLU activation reduce adversarial vulnerability in ML-NIDS and outperform deeper adversarially trained models while keeping high clean-data performance.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.18058","ref_index":17,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Threats to Arabic Handwriting Recognition: Investigating Black-Box Adversarial Attacks on embedded ConvNet models","primary_cat":"cs.CV","submitted_at":"2026-05-18T08:45:16+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Black-box attacks, especially Pixle, reach 99-100% success on Arabic handwriting ConvNet models across two benchmark datasets while preserving character structure.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.17310","ref_index":26,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Attention Hijacking: Response Manipulation Across Queries in Vision-Language Models","primary_cat":"cs.CV","submitted_at":"2026-05-17T08:02:27+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Attention Hijacking is a new attack that improves cross-query transferability in VLMs by 
explicitly steering internal attention to a persistent image-dominant pattern.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.17153","ref_index":18,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Stress-Testing Neural Network Verifiers with Provably Robust Instances","primary_cat":"cs.LG","submitted_at":"2026-05-16T20:56:52+00:00","verdict":"CONDITIONAL","verdict_confidence":"MODERATE","novelty_score":7.0,"formal_verification":"none","one_line_summary":"A reusable framework generates verification instances with provably known robustness labels, revealing numeric tolerance issues and bugs in five verifiers while introducing difficulty profiles to diagnose failure modes.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.16905","ref_index":41,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"AIM: Adversarial Information Masking for Faithfulness Evaluation of Saliency Maps","primary_cat":"cs.LG","submitted_at":"2026-05-16T09:36:58+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"AIM is a new saliency-guided adversarial feature replacement method to evaluate faithfulness of saliency maps and reliability of masking operators on image, audio, and EEG tasks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.16720","ref_index":2,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Compositional Adversarial Training for Robust Visual Watermarking","primary_cat":"cs.CV","submitted_at":"2026-05-16T00:07:49+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"CAT trains watermark detectors against adaptive compositional adversaries using differentiable attack 
selection, yielding up to 63.5% capacity gains on hard attacks versus random-augmentation baselines.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.16651","ref_index":7,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Right Predictions, Misleading Explanations: On the Vulnerability of Vision-Language Model Explanations","primary_cat":"cs.CV","submitted_at":"2026-05-15T21:44:16+00:00","verdict":null,"verdict_confidence":null,"novelty_score":null,"formal_verification":null,"one_line_summary":null,"context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.18868","ref_index":40,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"DarkLLM: Learning Language-Driven Adversarial Attacks with Large Language Models","primary_cat":"cs.CR","submitted_at":"2026-05-15T12:28:16+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"DarkLLM trains an LLM to generate language-driven adversarial perturbations that unify targeted, untargeted, segmentation, and multi-model attacks on foundation models.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.15416","ref_index":60,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Margin-Adaptive Confidence Ranking for Reliable LLM Judgement","primary_cat":"cs.LG","submitted_at":"2026-05-14T21:01:05+00:00","verdict":null,"verdict_confidence":null,"novelty_score":null,"formal_verification":null,"one_line_summary":null,"context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.15249","ref_index":23,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Enabling Adversarial Robustness in AI Models through Kubeflow 
MLOps","primary_cat":"cs.CR","submitted_at":"2026-05-14T12:45:36+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":3.0,"formal_verification":"none","one_line_summary":"A Kubeflow-based MLOps architecture detects FGSM adversarial attacks on deployed AI models and automatically applies PGD-based adversarial training to recover accuracy.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.12937","ref_index":66,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"AuraMask: An Extensible Pipeline for Developing Aesthetic Anti-Facial Recognition Image Filters","primary_cat":"cs.CV","submitted_at":"2026-05-13T03:16:12+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"AuraMask produces 40 aesthetic anti-facial recognition filters that match or exceed prior adversarial effectiveness and achieve significantly higher user acceptance in a 630-person study.","context_count":1,"top_context_role":"method","top_context_polarity":"use_method","context_text":"With the two objectives ($L_{FEAT}$ and $L_{AES}$) we thus seek to optimize the defense ($g$) such that: $\\forall x \\in X \\quad \\arg\\min_{x} L_{FEAT}(x, g(x)) + L_{AES}(x, g(x))$ (6) 3.4 Technical Infrastructure and Definitions Many prior AML-based obfuscations use iterative methods to generate effective outputs through multiple forward and backward passes over target models (e.g., Projected Gradient Descent (PGD) [66] or the Fast Gradient Sign Method (FGSM) [3]). We designed the AuraMask pipeline to instead take advantage of the Adversarial Transformation Network (ATN) [8, 49]. Unlike FGSM and PGD, the ATN learns to predict an effective adversarial perturbation. 
As such, it can apply a defense with only a single forward pass and does not need to access the targeted model(s) when applying this"},{"citing_arxiv_id":"2605.12813","ref_index":155,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"REALISTA: Realistic Latent Adversarial Attacks that Elicit LLM Hallucinations","primary_cat":"cs.CL","submitted_at":"2026-05-12T23:13:50+00:00","verdict":null,"verdict_confidence":null,"novelty_score":null,"formal_verification":null,"one_line_summary":null,"context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.12792","ref_index":63,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"SoK: A Comprehensive Analysis of the Current Status of Neural Tangent Generalization Attacks with Research Directions","primary_cat":"cs.LG","submitted_at":"2026-05-12T22:10:01+00:00","verdict":"ACCEPT","verdict_confidence":"LOW","novelty_score":3.0,"formal_verification":"none","one_line_summary":"NTGA is the first clean-label generalization attack under black-box settings but is vulnerable to adversarial training and image transformations, with newer attacks outperforming it.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"Attackers add imperceptible perturbation to the test data so that pre-trained models misclassify them with high confidentiality during test time [79][36]. Adversarial attacks are also known as evasive attacks [60]. Popular adversarial attacks include Projected Gradient Descent (PGD) attack [61], Fast Gradient Sign Method (FGSM) attack [41], DeepFool attack [63] and Carlini and Wagner's attack (C&W) [10]. By definition, an adversarial attack is a mapping $\\alpha : R^n \\to R^n$ such that the adversarial example $\\alpha(x) = x'$ is misclassified to a class other than its original class $y$ by the model $f$. The difference between $x'$ and $x$ is trivial, i.e., $\\|x - x'\\|_p \\le \\epsilon$ for some small value $\\epsilon$ [51]. 
For a better understanding of adversarial attacks, as"},{"citing_arxiv_id":"2605.12431","ref_index":37,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"GaitProtector: Impersonation-Driven Gait De-Identification via Training-Free Diffusion Latent Optimization","primary_cat":"cs.CV","submitted_at":"2026-05-12T17:27:56+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"GaitProtector optimizes diffusion model latents to impersonate target identities in gait sequences, dropping Rank-1 identification accuracy from 89.6% to 15.0% on CASIA-B while keeping scoliosis diagnostic accuracy at 74.2%.","context_count":1,"top_context_role":"method","top_context_polarity":"use_method","context_text":"oriented non-diffusion baseline and three variants of our framework. Specifically, we implement a contour-based projected gradient descent (PGD) baseline method that explicitly edits silhouette geometry (rather than pixel-level noise) to remain effective under hard re-binarization. Inspired by momentum-based projected adversarial optimization [7], [37], it performs projected gradient updates on contour-localized degrees of freedom under an $\\ell_\\infty$ constraint and keeps the output strictly binary after each iteration. 
We further report Ours (VAE only), which optimizes only in the pretrained VAE latent space without 3D U-Net denoising; Ours (Obfuscation only), which removes the target-anchored term by setting $L_{imp} = 0$; and Ours (Full), i."},{"citing_arxiv_id":"2605.18821","ref_index":62,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Quantum Adversarial Machine Learning: From Classical Adaptations to Quantum-Native Methods","primary_cat":"cs.LG","submitted_at":"2026-05-12T14:41:20+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":1.0,"formal_verification":"none","one_line_summary":"A survey of quantum adversarial machine learning covering attacks, countermeasures, theoretical underpinnings, trends, and challenges.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"QuanTest: Entanglement-guided testing of quantum neural network systems (2024) [60] X. Liu, L. Xie, Y. Wang, J. Zou, J. Xiong, Z. Ying, A. Vasilakos. Privacy and security issues in deep learning: a survey. IEEE Access 9: 4566-4593 (2021) [61] I.J. Goodfellow, J. Shlens, C. Szegedy, Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572 (2014) [62] A. Madry, A. Makelov, L. Schmidt, D. Tsipras, A. Vladu, Towards deep learning models resistant to adversarial attacks. arXiv preprint arXiv:1706.06083 (2017) [63] W. El Maouaki, A. Marchisio, T. Said, M. Shafique, M. Bennai. RobQuNNs: A methodology for robust quanvolutional neural networks against adversarial attacks (2024) [64] M.T. West, S.M. 
Erfani, C."},{"citing_arxiv_id":"2605.12195","ref_index":6,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Fair Conformal Classification via Learning Representation-Based Groups","primary_cat":"cs.LG","submitted_at":"2026-05-12T14:37:17+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"A fair conformal classification method guarantees conditional coverage on adaptively identified subgroups defined via learned representations.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"Then, we perform $T$ sampling of the vector $s_t$ ($t \\in [T]$) from the joint Bernoulli distribution $B$ learned by models in Eq. 3. Each $s_t$ defines a group $\\hat{G}_{s_t}$, and such group is used as a calibration set to build a prediction set $C_m(X_{N+1}, \\hat{G}_{s_t})$ as mentioned in Section 2. The final prediction set for $Y_{N+1}$ is given by the union of all these sets: $C(X_{N+1}) = C_m(X_{N+1}, D) \\cup \\bigcup_{t=1}^{T} C_m(X_{N+1}, \\hat{G}_{s_t})$. (6) Our approach FAREG is summarized in Algorithm 1. To analyze its time complexity, assume we have $M$ test instances and the complexity of conducting classic conformal prediction is $O(N+M)$. Then, training the model to select groups is $O(EN(|\\theta| + |\\phi| + |\\varphi|))$, where $E$ is the number of epochs. 
For all $M$ test instances, the time of selecting groups and constructing prediction sets is"},{"citing_arxiv_id":"2605.11636","ref_index":34,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Seir\\^enes: Adversarial Self-Play with Evolving Distractions for LLM Reasoning","primary_cat":"cs.AI","submitted_at":"2026-05-12T06:58:35+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Seirênes trains LLMs via adversarial self-play to generate and overcome evolving distractions, producing gains of 7-10 points on math reasoning benchmarks and exposing blind spots in larger models.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"training through localized interventions or guidance from stronger external traces [31-33]. In stark contrast to these methods, our framework is fundamentally adversarial and co-evolutionary. A shared-parameter model competes against itself, simultaneously playing the Adversary to synthesize adversarial hints, and the Reasoner to overcome them. This attacker-target structure also connects to classical adversarial training [34, 35] and LLM red-teaming [36-38]. These areas primarily target robustness evaluation, attack defense, or safety tuning; we discuss the connection further in Appendix A. 
3 Preliminaries We build on Group Relative Policy Optimization (GRPO) [4], a critic-free policy gradient algorithm that replaces the learned value baseline with group-relative reward normalization."},{"citing_arxiv_id":"2605.10582","ref_index":40,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Guaranteed Jailbreaking Defense via Disrupt-and-Rectify Smoothing","primary_cat":"cs.CR","submitted_at":"2026-05-11T13:54:26+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"DR-Smoothing introduces a disrupt-then-rectify prompt processing scheme into smoothing defenses, delivering tight theoretical bounds on success probability against both token- and prompt-level jailbreaks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.10183","ref_index":4,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Fix the Loss, Not the Radius: Rethinking the Adversarial Perturbation of Sharpness-Aware Minimization","primary_cat":"cs.LG","submitted_at":"2026-05-11T08:34:36+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"LE-SAM inverts SAM by fixing the loss budget instead of the parameter-space radius, yielding better generalization across benchmarks.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"surrogate is often primarily shaped by the first-order gradient signal. In contrast, flat minima are fundamentally a second-order concept. The next section formalizes this mechanism-level mismatch. 3.3. Mismatch with Flat Minima For a local minimum at $w^*$, the second-order Taylor expansion takes the form: $L(w^* + \\epsilon) = L(w^*) + \\underbrace{\\nabla L(w^*)^\\top \\epsilon}_{\\approx 0} + \\frac{1}{2} \\epsilon^\\top H(w^*) \\epsilon + O(\\|\\epsilon\\|^3)$, (4) Here, $H(w^*)$ is the Hessian. 
Near the local minimum, the gradient is approximately 0 ($\\nabla L(w^*) \\approx 0$), thus the leading term that governs how fast the loss increases around $w^*$ is: $L(w^* + \\epsilon) - L(w^*) \\approx \\frac{1}{2} \\epsilon^\\top H(w^*) \\epsilon$ (5) Thus, a flat minimum is inherently a second-order concept: the flatness of a minimum is often characterized by small Hessian eigenvalues. Based on this, many approaches adopt the"},{"citing_arxiv_id":"2605.09716","ref_index":181,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Medical Model Synthesis Architectures: A Case Study","primary_cat":"cs.AI","submitted_at":"2026-05-10T19:30:16+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"MedMSA framework retrieves knowledge via language models then builds formal probabilistic models to produce uncertainty-weighted differential diagnoses from symptoms.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.09646","ref_index":63,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"\"Training robust watermarking model may hurt authentication!'' Exploring and Mitigating the Identity Leakage in Robust Watermarking","primary_cat":"cs.CR","submitted_at":"2026-05-10T16:44:50+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"W-IR is the first watermarking framework to combine certified robustness via randomized smoothing in pixel and coordinate spaces with identity leakage mitigation via residual information loss minimization.","context_count":1,"top_context_role":"method","top_context_polarity":"use_method","context_text":"These attacks can generate numerous adversarial examples by setting various transformation parameters at a low cost, without the knowledge of watermarking neural networks [33]. 
Another approach involves employing finely tuned optimization algorithms to search for adversarial perturbations. For instance, Jiang et al. [17] propose to use PGD [62] and HopSkipJump [63] to search for adversarial examples. An et al. [33] and Hu et al. [64] trained surrogate models simulating a watermark decoder to determine the presence of a watermark in an image. However, they typically involve algorithms of higher complexity, requiring additional time for model training and adversarial examples generation [65]. Reconstruction Attacks aim to recreate images without"},{"citing_arxiv_id":"2605.09606","ref_index":22,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"On the Generation and Mitigation of Harmful Geometry in Image-to-3D Models","primary_cat":"cs.CR","submitted_at":"2026-05-10T15:35:42+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":8.0,"formal_verification":"none","one_line_summary":"Image-to-3D models successfully generate harmful geometries in most cases with under 0.3% caught by commercial filters; existing safeguards are weak but a stacked defense cuts harmful outputs to under 1% at 11% false-positive cost.","context_count":1,"top_context_role":"method","top_context_polarity":"use_method","context_text":"ial strategies to understand the resilience of the adapted input filter against adaptive adversaries. First, we utilize degradation (Section 6.3) and camouflaging (Section 6.5) datasets constructed to simulate low-quality inputs and semantic obfuscation. Second, we conduct black-box adversarial attacks using standard optimization-based methods, i.e., PGD [22], C&W [2], and I-FGSM [12], to probe the vulnerability (attack budget ε = 8/255). Full evaluation results are presented in Figure 9, with implementation details in Section E.1. Semantic camouflaging is effective at evading detection for visual threats, but has limited impact on structural components and kinetic weapons. 
This is because re-imagining"},{"citing_arxiv_id":"2605.08910","ref_index":18,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Enhancing Adversarial Robustness in Network Intrusion Detection: A Layer-wise Adaptive Regularization Approach","primary_cat":"cs.CR","submitted_at":"2026-05-09T12:10:01+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":3.0,"formal_verification":"none","one_line_summary":"LARAR enhances adversarial robustness in network intrusion detection by using layer-wise adaptive regularization and auxiliary classifiers, achieving 95.01% clean accuracy and improved defense against FGSM, PGD, and transfer attacks on UNSW-NB15.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"Adversarial attacks on machine learning models have been extensively studied since their discovery by Szegedy et al. [16], who demonstrated that imperceptible perturbations could cause misclassification in deep neural networks. Goodfellow et al. [17] introduced the FGSM, a computationally efficient single-step attack that exploits the gradient of the loss function. Building upon this work, Madry et al. [18] proposed PGD, an iterative variant that represents one of the strongest first-order adversarial attacks. In the domain of network intrusion detection, adversarial attacks pose unique challenges compared to computer vision applications. 
Rigaki and Garcia [19] demonstrated that gradient-based attacks could evade deep learning-based intrusion detection sys-"},{"citing_arxiv_id":"2605.08850","ref_index":77,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Local LMO: Constrained Gradient Optimization via a Local Linear Minimization Oracle","primary_cat":"math.OC","submitted_at":"2026-05-09T10:03:24+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":8.0,"formal_verification":"none","one_line_summary":"Local LMO is a new projection-free method that achieves the convergence rates of projected gradient descent for constrained optimization by using local linear minimization oracles over small balls.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"$\\|x_k - x^\\star\\|^2 - \\|x_{k+1} - x^\\star\\|^2 - \\|x_{k+1} - x_k\\|^2 \\bigr) \\ge \\langle \\nabla f(x_k), x_{k+1} - x^\\star \\rangle.$ (74) By convexity, $f(x^\\star) \\ge f(x_k) + \\langle \\nabla f(x_k), x^\\star - x_k \\rangle$, hence $f(x_k) - f(x^\\star) \\le \\langle \\nabla f(x_k), x_k - x^\\star \\rangle.$ (75) Also, by $L$-smoothness, $f(x_{k+1}) \\le f(x_k) + \\langle \\nabla f(x_k), x_{k+1} - x_k \\rangle + \\frac{L}{2} \\|x_{k+1} - x_k\\|^2.$ (76) Subtracting $f(x^\\star)$ from both sides of (76) and combining it with (75), $f(x_{k+1}) - f(x^\\star) \\le \\langle \\nabla f(x_k), x_{k+1} - x^\\star \\rangle + \\frac{L}{2} \\|x_{k+1} - x_k\\|^2.$ (77) Since $\\gamma \\le 1/L$, we have $\\frac{L}{2} \\le \\frac{1}{2\\gamma}$, so $f(x_{k+1}) - f(x^\\star) \\le \\langle \\nabla f(x_k), x_{k+1} - x^\\star \\rangle + \\frac{1}{2\\gamma} \\|x_{k+1} - x_k\\|^2.$ Therefore, using (74), $f(x_{k+1}) - f(x^\\star) \\le \\frac{1}{2\\gamma} \\bigl( \\|x_k - x^\\star\\|^2 - \\|x_{k+1} - x^\\star\\|^2 \\bigr)$, which is equivalent to $\\|x_{k+1} - x^\\star\\|^2 \\le \\|x_k - x^\\star\\|^2 - 2\\gamma \\bigl( f(x_{k+1}) - f(x^\\star) \\bigr).$ (78) (iii). Summing (78) from $i = 0$ to $k - 1$, we get $2\\gamma \\sum_{i=0}^{k-1} \\bigl( f(x_{i+1}) - f(x^\\star) \\bigr) \\le \\|x_0 - x^\\star\\|^2.$ 
Since $f(x_i)$ is nonincreasing by (70), we have"},{"citing_arxiv_id":"2605.07757","ref_index":35,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"Efficient Verification of Neural Control Barrier Functions with Smooth Nonlinear Activations","primary_cat":"cs.LG","submitted_at":"2026-05-08T13:59:53+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"LightCROWN computes tighter Jacobian bounds for neural networks with smooth nonlinear activations by exploiting their analytical properties, raising verification success rates for neural control barrier functions up to 100% on benchmark control systems.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.07690","ref_index":41,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"Fortifying Time Series: DTW-Certified Robust Anomaly Detection","primary_cat":"cs.LG","submitted_at":"2026-05-08T12:59:52+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":8.0,"formal_verification":"none","one_line_summary":"First DTW-certified robust anomaly detection for time series via randomized smoothing adapted through an l_p-to-DTW lower-bound transformation.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"In recent years, significant research has advanced the study of adversarial attacks and certified defenses for machine learning systems. Despite the considerable progress in adversarial robustness across various domains [42, 2, 45, 9, 12, 3], robustness in time-series anomaly detection remains comparatively underexplored. As a core component of many safety-critical systems, including healthcare [25, 46, 21], finance [41, 22, 64], and mobile networks [56, 70, 35], anomaly detectors are essential for identifying abnormal behavior in preventing failures or hazards. 
Robustness in this context is not merely a model performance concern but a core requirement for operational reliability. Recent work has revealed that time-series anomaly detectors are susceptible to adversarial attacks"},{"citing_arxiv_id":"2605.07631","ref_index":12,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"Inference Time Causal Probing in LLMs","primary_cat":"cs.AI","submitted_at":"2026-05-08T11:59:13+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"HDMI is a new probe-free technique that steers LLM hidden states via margin objectives to achieve more reliable causal interventions than prior probe-based methods on standard benchmarks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.07590","ref_index":20,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Beyond Defenses: Manifold-Aligned Regularization for Intrinsic 3D Point Cloud Robustness","primary_cat":"cs.CV","submitted_at":"2026-05-08T11:02:53+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"MAPR improves adversarial robustness in 3D point cloud networks by aligning latent predictions with intrinsic manifold geometry via curvature/diffusion features and a consistency loss.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.07470","ref_index":6,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"Uncovering Hidden Systematics in Neural Network Models for High Energy Physics","primary_cat":"cs.LG","submitted_at":"2026-05-08T09:17:58+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Neural networks for HEP tasks can be fooled at significant rates by subtle perturbations inside 
uncertainty envelopes, revealing hidden systematics not captured by conventional methods.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"In particular, DNNs may learn decision boundaries that depend on complex, nonlinear correlations that are not explicitly constrained by standard validation strategies. In the broader machine-learning literature, it is well established that high-capacity classifiers are vulnerable to adversarial perturbations. Methods such as DeepFool [5] and projected gradient descent (PGD) attacks [6] demon- strate that small, structured perturbations, often imperceptible in low-dimensional projections, can significantly alter model predictions. While originally studied in computer vision, the geometric mechanism underlying adversarial vulnerability, the sensitivity of complex decision boundaries to coherent deformations in high-dimensional space,"},{"citing_arxiv_id":"2605.06238","ref_index":10,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Band Together: Untargeted Adversarial Training with Multimodal Coordination against Evasion-based Promotion Attacks","primary_cat":"cs.LG","submitted_at":"2026-05-07T13:24:40+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"UAT-MC improves defense against evasion promotion attacks in multimodal recommenders by aligning gradients across modalities during untargeted adversarial training.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.04019","ref_index":2,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Redefining AI Red Teaming in the Agentic Era: From Weeks to Hours","primary_cat":"cs.AI","submitted_at":"2026-05-05T17:43:52+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"An agentic 
red teaming system automates creation of adversarial testing workflows from natural language goals, unifying ML and generative AI attacks and achieving 85% success rate on Meta Llama Scout with no custom human code.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.03491","ref_index":5,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Real-Time Evaluation of Autonomous Systems under Adversarial Attacks","primary_cat":"cs.AI","submitted_at":"2026-05-05T08:30:18+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"A framework trains and compares MLP, transformer, and GAIL-based trajectory models on real driving data, finding that architectural differences cause large variations in robustness to PGD attacks despite similar nominal accuracy.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.02109","ref_index":19,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Detecting Adversarial Data via Provable Adversarial Noise Amplification","primary_cat":"cs.LG","submitted_at":"2026-05-04T00:08:05+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"A provable adversarial noise amplification theorem under sufficient conditions enables a custom-trained detector that identifies adversarial examples at inference time using enhanced layer-wise noise signals.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.01701","ref_index":6,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"Stability and Generalization for Decentralized Markov 
SGD","primary_cat":"cs.LG","submitted_at":"2026-05-03T03:58:19+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Decentralized SGD and SGDA under Markovian sampling admit non-asymptotic generalization bounds that incorporate network topology, Markov mixing rates, and primal-dual dynamics.","context_count":1,"top_context_role":"method","top_context_polarity":"use_method","context_text":"Lemma 4(Lemma 8, [Sunet al., 2021 ]).Suppose that Assumption 1 holds. Let{w t(i)}m i=1 andw t denote the local and averaged iterates of D-SGD at iterationt. Then \" mX i=1 wt −w t(i) 2 2 # 1 2 ≤2 √mL tX q=1 ηqλt−q. Lemma 5(Lemma 1, [Sunet al., 2018 ]).Suppose Assumption 3 holds. Letλ i(H)denote thei-th largest eigenvalue ofH, and defineλ(H) = max{|λ2(H)|,|λ n(H)|}+1 2 ∈[1/2,1), C H = Pm i=2 d2 i \u00011/2 ∥U∥ F U −1 F and KH = max    max 1≤i≤m      2di (di −1) \u0010 log \u0010 2di |λ2(H)|·log(λ(H)/|λ 2(H)|) \u0011 −1 \u0011 (di + 1) log (λ(H)/|λ2(H)|)      ,0    . There exist constantsC H >0andK H ≥0such that, for allt≥K H, Π∗ −H t ∞ ≤C H ·(λ(H)) t. Moreover, ifHis symmetric, thenK H = 0and Π∗ −H t ∞ ≤n 3/2 ·(λ(H)) t,∀t≥0. Lemma 6([Vershynin, 2018]).Let{z i}m i=1 be a sequence of (possibly dependent) random variables, and letξi =ξ i(z1, . . . , zi) satisfy|ξ i −E zi[ξi]| ≤b i. Then for anyα∈(0,1), with probability at least1−α, mX i=1 ξi − mX i=1 Ezi [ξi]≤ 2 mX i=1 b2 i log(1/α) ! 1 2 . 
Lemma 7([Schmidtet al., 2011]).Let{u t}t≥0 be a non-negative sequence satisfying u2 t ≤S t + t−1X τ=1 ατ uτ ,"},{"citing_arxiv_id":"2605.01579","ref_index":18,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Minimum Specification Perturbation: Robustness as Distance-to-Falsification in Causal Inference","primary_cat":"stat.ME","submitted_at":"2026-05-02T19:04:08+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"MSP quantifies the minimum changes to analyst choices required to falsify a causal claim by making its confidence interval contain zero, providing information orthogonal to dispersion-based robustness summaries.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.01462","ref_index":27,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"LocalAlign: Enabling Generalizable Prompt Injection Defense via Generation of Near-Target Adversarial Examples for Alignment Training","primary_cat":"cs.CR","submitted_at":"2026-05-02T14:25:26+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"LocalAlign generates near-target adversarial examples via prompting and applies margin-aware alignment training to enforce tighter boundaries against prompt injection attacks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.01449","ref_index":20,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"VisInject: Disruption != Injection -- A Dual-Dimension Evaluation of Universal Adversarial Attacks on Vision-Language Models","primary_cat":"cs.CR","submitted_at":"2026-05-02T13:56:50+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Universal adversarial attacks cause 
output perturbation 90 times more often than precise target injection in VLMs, with only 2 verbatim successes out of 6615 tests.","context_count":1,"top_context_role":"background","top_context_polarity":"unclear","context_text":"DeepSeek-VL: Towards real-world vision-language understanding, 2024. URLhttps://arxiv.org/abs/2403.05525. [19] Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. Towards deep learning models resistant to adversarial attacks. InInternational Conference on Learning Representations (ICLR), 2018. URLhttps://arxiv.org/abs/1706.06083. [20] Mantas Mazeika, Long Phan, Xuwang Yin, Andy Zou, Zifan Wang, Norman Mu, Elham Sakhaee, Nathaniel Li, Steven Basart, Bo Li, David Forsyth, and Dan Hendrycks. HarmBench: A standardized evaluation framework for automated red teaming and robust refusal. In Proceedings of the 41st International Conference on Machine Learning (ICML), 2024. URL https://arxiv."},{"citing_arxiv_id":"2605.01306","ref_index":197,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Machine Learning Enhanced Laser Spectroscopy for Multi-Species Gas Detection in Complex and Harsh Environments","primary_cat":"physics.optics","submitted_at":"2026-05-02T07:28:27+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Machine learning methods including denoising autoencoders, unsupervised interference mitigation, blind source separation, and certifiable classification are developed and experimentally validated to improve multi-species laser spectroscopy under complex conditions.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"In response to this limitation, various approaches have been developed to enhance a model's ability to defend itself against adversarial attacks. 
Many heuristic defenses have been proposed to create models resistant to adversarial perturbations; however, some of these defenses have proven vulnerable to more sophisticated adversaries[193, 196]. Conse- quently, researchers have focused on strengthening empirical defenses[197] and developing certified defenses that offer robustness guarantees. Certified defenses ensure that classifiers deliver consistent predictions within a specified neighborhood of their inputs[198, 199, 200, 201, 202]. In critical domains such as gas sensing, the importance of model robustness, repeatabil- ity, and accuracy cannot be overstated. Consequently, effective and trustworthy approaches"}],"limit":50,"offset":0}