NeMo Guardrails: A Toolkit for Controllable and Safe LLM Applications with Programmable Rails

Traian Rebedea, Razvan Dinu, Makesh Narsimhan Sreedhar, Christopher Parisien, Jonathan Cohen · 2023 · DOI 10.18653/v1/2023.emnlp-demo.40

5 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

MCPHunt: An Evaluation Framework for Cross-Boundary Data Propagation in Multi-Server MCP Agents

cs.AI · 2026-04-30 · unverdicted · novelty 8.0

The MCPHunt benchmark finds 11.5-41.3% policy-violating credential propagation in multi-server MCP agents across five models, reducible by up to 97% with prompt mitigations while retaining most utility.

PRISM: Generation-Time Detection and Mitigation of Secret Leakage in Multi-Agent LLM Pipelines

cs.AI · 2026-05-11 · unverdicted · novelty 6.0

PRISM detects and stops credential leakage during LLM generation in multi-agent pipelines using per-token risk scores from lexical, structural, and behavioral signals, achieving zero observed leaks and F1 of 0.832 on a 2000-task benchmark.

ADR: An Agentic Detection System for Enterprise Agentic AI Security

cs.AI · 2026-05-17 · unverdicted · novelty 5.0

ADR is a three-component detection system for AI agents that combines telemetry sensors, red teaming, and two-tier detection, achieving 97.2% precision in a ten-month Uber deployment and outperforming baselines on the new ADR-Bench.

Symbolic Guardrails for Domain-Specific Agents: Stronger Safety and Security Guarantees Without Sacrificing Utility

cs.SE · 2026-04-16 · unverdicted · novelty 5.0

Symbolic guardrails enforce 74% of specified safety policies in agent benchmarks and boost safety without hurting utility.

TWGuard: A Case Study of LLM Safety Guardrails for Localized Linguistic Contexts

cs.CR · 2026-04-17 · unverdicted · novelty 4.0

TWGuard achieves an F1 improvement of 0.289 and a 94.9% false-positive reduction for LLM safety guardrails in the Taiwan linguistic context compared to foundation models and baselines.

citing papers explorer

Showing 5 of 5 citing papers.

MCPHunt: An Evaluation Framework for Cross-Boundary Data Propagation in Multi-Server MCP Agents
cs.AI · 2026-04-30 · unverdicted · none · ref 4

PRISM: Generation-Time Detection and Mitigation of Secret Leakage in Multi-Agent LLM Pipelines
cs.AI · 2026-05-11 · unverdicted · none · ref 26

ADR: An Agentic Detection System for Enterprise Agentic AI Security
cs.AI · 2026-05-17 · unverdicted · none · ref 44

Symbolic Guardrails for Domain-Specific Agents: Stronger Safety and Security Guarantees Without Sacrificing Utility
cs.SE · 2026-04-16 · unverdicted · none · ref 57

TWGuard: A Case Study of LLM Safety Guardrails for Localized Linguistic Contexts
cs.CR · 2026-04-17 · unverdicted · none · ref 2
