CoRR abs/2307.14936 (2023)
5 Pith papers cite this work. Polarity classification is still indexing.
citation-role and citation-polarity summary
verdicts: UNVERDICTED 5
roles: background 1
polarities: background 1
citing papers explorer
- Certified Robustness under Heterogeneous Perturbations via Hybrid Randomized Smoothing
  A hybrid randomized smoothing method yields a closed-form certificate for joint discrete-continuous perturbations that generalizes prior Gaussian and discrete smoothing approaches.
- Finding Interpretable Prompt-Specific Circuits in Language Models
  ACC++ traces prompt-specific circuits in language models from one forward pass by extracting interpretable low-dimensional causal signals, revealing clustered mechanisms for indirect object identification and language-specific signals in multilingual settings.
- MR-Adopt: Automatic Deduction of Input Transformation Function for Metamorphic Testing
  MR-Adopt deduces input transformations from hard-coded MR test cases using LLMs, data-flow refinement, and output-relation selection to enable reuse with new source inputs.
- Event-Triggered Adaptive Consensus for Multi-Robot Task Allocation
  An event-triggered consensus framework for heterogeneous robot swarms reduces communication overhead while preserving high task completion rates and resilience to failures in simulations.
- AI Safety Landscape for Large Language Models: Taxonomy, State-of-the-art, and Future Directions
  The paper introduces a taxonomy of AI safety for LLMs organized into Trustworthy AI, Responsible AI, and Safe AI perspectives, accompanied by a review of state-of-the-art methods, challenges, and future directions.