Reluplex: An Efficient SMT Solver for Verifying Deep Neural Networks
4 Pith papers cite this work.
abstract
Deep neural networks have emerged as a widely used and effective means for tackling complex, real-world problems. However, a major obstacle in applying them to safety-critical systems is the great difficulty in providing formal guarantees about their behavior. We present a novel, scalable, and efficient technique for verifying properties of deep neural networks (or providing counter-examples). The technique is based on the simplex method, extended to handle the non-convex Rectified Linear Unit (ReLU) activation function, which is a crucial ingredient in many modern neural networks. The verification procedure tackles neural networks as a whole, without making any simplifying assumptions. We evaluated our technique on a prototype deep neural network implementation of the next-generation airborne collision avoidance system for unmanned aircraft (ACAS Xu). Results show that our technique can successfully prove properties of networks that are an order of magnitude larger than the largest networks verified using existing methods.
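The abstract's point that ReLU is non-convex but piecewise linear can be illustrated with a small sketch. Once each ReLU is fixed to its active or inactive phase, the network becomes affine and a property can be checked with purely linear reasoning. The code below is not Reluplex itself: it enumerates every phase combination up front, whereas the paper extends the simplex method to handle ReLU constraints. The toy network (`W`, `B`, `V`, `C`) and the helper names are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical toy network (an assumption, not from the paper):
#   y = sum_i V[i] * relu(W[i] * x + B[i]) + C, with a scalar input x.
W = [1.0, -1.0]
B = [0.0, 1.0]
V = [1.0, 1.0]
C = 0.0


def forward(x):
    """Evaluate the toy network at a scalar input x."""
    return sum(v * max(0.0, w * x + b) for w, b, v in zip(W, B, V)) + C


def phase_interval(w, b, active, lo, hi):
    """Restrict [lo, hi] to where w*x + b >= 0 (active) or <= 0 (inactive)."""
    if w == 0.0:
        feasible = (b >= 0.0) if active else (b <= 0.0)
        return (lo, hi) if feasible else None
    root = -b / w
    if (w > 0.0) == active:
        lo = max(lo, root)  # constraint has the form x >= root
    else:
        hi = min(hi, root)  # constraint has the form x <= root
    return (lo, hi) if lo <= hi else None


def verify_upper_bound(lo, hi, bound):
    """Check forward(x) <= bound for all x in [lo, hi].

    Returns (True, None) if the property holds, else (False, counterexample).
    """
    for phases in product([True, False], repeat=len(W)):
        region = (lo, hi)
        for w, b, active in zip(W, B, phases):
            region = phase_interval(w, b, active, *region)
            if region is None:
                break
        if region is None:
            continue  # this phase combination is infeasible on [lo, hi]
        # With all phases fixed the network is affine in x on this region,
        # so its maximum over the closed interval sits at an endpoint.
        for x in region:
            if forward(x) > bound:
                return False, x
    return True, None
```

Calling `verify_upper_bound(-1.0, 2.0, 2.0)` proves the bound, while a tighter bound such as `1.5` yields a concrete counterexample input, mirroring the "verify or provide a counter-example" behaviour the abstract describes. Exhaustive enumeration grows exponentially with the number of ReLUs, which is why it is only workable for toy networks like this one.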
citing papers explorer
- Risks from Learned Optimization in Advanced Machine Learning Systems
  Mesa-optimization arises when learned models act as optimizers with objectives that can differ from their training loss, creating alignment risks in advanced machine learning.
- Quantitative Linear Logic for Neuro-Symbolic Learning and Verification
  QLL is a novel logic for neuro-symbolic learning that uses ML-native operations (sum, log-sum-exp) on logits to embed constraints, satisfying most linear logic properties and showing stronger correlation between empirical robustness and formal verification than prior approaches.
- $\lambda$-GELU: Learning Gating Hardness for Controlled ReLU-ization in Deep Networks
  λ-GELU learns layer-wise hardness parameters via constrained reparameterization to allow controlled post-training conversion from smooth GELU to ReLU activations.
- Causal Explanations from the Geometric Properties of ReLU Neural Networks
  ReLU networks' division of input space into convex polytopal regions permits direct extraction of causal rules that exactly match the original network's linear behavior in each region.
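
The last summary's claim, that a ReLU network is exactly linear (affine) inside each polytopal region, can be checked numerically. The sketch below is an illustration under assumptions, not the cited paper's method: the one-hidden-layer network (`W1`, `B1`, `W2`, `B2`) and the function names are hypothetical. It reads off the activation pattern at a point and builds the affine map that the network equals on that pattern's region.

```python
# Hypothetical 2-input, 3-hidden-unit, 1-output ReLU network (an assumption).
W1 = [[1.0, -1.0], [0.5, 2.0], [-1.0, 0.0]]
B1 = [0.0, -1.0, 0.5]
W2 = [1.0, -2.0, 3.0]
B2 = 0.25


def pre_activations(x):
    """Hidden-layer pre-activations W1 @ x + B1."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W1, B1)]


def forward(x):
    """Evaluate the network at input x (a list of two floats)."""
    hidden = [max(0.0, z) for z in pre_activations(x)]
    return sum(v * h for v, h in zip(W2, hidden)) + B2


def activation_pattern(x):
    """Which hidden units are active at x; this identifies x's linear region."""
    return tuple(z > 0.0 for z in pre_activations(x))


def local_affine_map(pattern):
    """Return (a, c) such that forward(x) == a . x + c wherever `pattern` holds."""
    a = [0.0] * len(W1[0])
    c = B2
    for row, b, v, on in zip(W1, B1, W2, pattern):
        if on:  # inactive units contribute nothing to the output
            for j, w in enumerate(row):
                a[j] += v * w
            c += v * b
    return a, c
```

For any input `x`, `local_affine_map(activation_pattern(x))` gives coefficients whose affine function agrees with `forward(x)`; the region itself is the set of inputs satisfying the sign constraints in `pre_activations`, which are linear inequalities and therefore define a convex polytope.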