Guidelines for Artificial Intelligence Containment
With almost daily improvements in the capabilities of artificial intelligence, it is more important than ever to develop safety software for use by the AI research community. Building on our previous work on the AI Containment Problem, we propose a number of guidelines that should help AI safety researchers develop reliable sandboxing software for intelligent programs of all levels. Such safety container software will make it possible to study and analyze an intelligent artificial agent while maintaining a certain level of safety against information leakage, social engineering attacks, and cyberattacks from within the container.
Forward citations
Cited by 1 Pith paper
Safety from Honesty in a Disinterested AI Predictor
A disinterested Bayesian Predictor trained on contextualized statements has low probability of producing harmful agency because dangerous behaviors require rare coordinated underestimation of harm with no training sig...