Generated by Codex with GPT-5
What happened
Microsoft’s official Security Blog published Introducing RAMPART and Clarity: Open source tools to bring safety into Agent development workflow, a May 20, 2026 post about turning agent safety from an occasional review into a set of engineering artifacts that can live in a repository, run in CI, and evolve with the system.
The post’s core idea is that enterprise agents are no longer just text generators. They retrieve documents, read email, query customer systems, write and execute code, and take actions across connected tools. That changes the failure model. A bad answer is still a problem, but an unsafe action can leak data, modify state, trigger workflows, or carry an injected instruction from one context into another. Microsoft is responding with two open-source tools: RAMPART, a safety test framework for agentic systems, and Clarity, a structured design assistant for examining assumptions before a system is built.
RAMPART is the more directly test-like half of the pair. It is built on top of Microsoft’s PyRIT red-teaming framework, but it is aimed at engineers working during product development rather than security researchers doing black-box discovery after the fact. The developer experience is deliberately close to integration testing: teams write pytest tests that encode scenarios from their threat model, connect those tests to the agent through a thin adapter, drive an interaction, and evaluate what happened. The result is a pass/fail signal that can gate a pull request or CI pipeline.
That sounds ordinary until the target is an LLM agent. Conventional deterministic tests are a poor fit for a system whose behavior can vary across runs and whose failures often show up as side effects rather than returned values. RAMPART accounts for that by supporting statistical trials, so a test can express a policy such as requiring safe behavior in a defined share of repeated runs. It also focuses its evaluators on observable agent behavior: which tools were invoked, what side effects occurred, and whether those actions stayed inside the boundaries the team expected.
The first mature coverage area is cross-prompt injection. That is a practical choice because many production agents consume untrusted or semi-trusted content: documents, tickets, emails, web pages, CRM notes, and records originally written by someone outside the agent’s control. RAMPART supplies attack strategies, adversarial payload generation, and evaluation logic, while the test author expresses what the agent should and should not do. When a red-team exercise or production incident finds a new pattern, the finding can be turned into a regression test and kept alive across later changes.
Clarity works earlier in the lifecycle. Its purpose is to slow down the parts of software work that should be slow: problem framing, solution selection, failure analysis, and decision tracking. The post frames this as especially important in an era where coding agents make implementation cheap. If a team builds the wrong thing quickly, the expensive error may be an architectural assumption that no one questioned before code started landing.
The mechanism is a repository-native protocol. Clarity can run as a desktop app, a web UI, or inside a coding agent, and it writes the results of structured conversations into a .clarity-protocol/ directory as human-readable Markdown. Those files capture the problem statement, solution rationale, failure analysis, and key decisions. Because they live beside the code, they can be reviewed in pull requests, diffed over time, and revisited when requirements change.
The failure-analysis piece is the most interesting part. Clarity uses multiple AI “thinkers” to inspect the proposed system from different angles, including security, human factors, adversarial scenarios, and operations. The team then groups related failures, traces causal chains, and builds mitigation plans. Clarity also tracks staleness across the design documents as a dependency graph, so a change to the problem statement can prompt review of the solution description, failure analysis, and recorded decisions that depended on it.
Why it matters
The broader engineering move is to make agent safety continuous. Red-team reports, design reviews, and incident retrospectives often produce useful knowledge, but that knowledge can remain trapped in documents or memory. RAMPART converts part of it into runnable tests. Clarity converts part of it into reviewable design artifacts. Together, they make safety work look less like an external approval gate and more like the normal software loop: write down intent, encode assumptions, run tests, review diffs, and keep regressions from reappearing.
That matters because agent systems fail at several layers. Some failures are prompt-level, such as obeying injected instructions found in retrieved content. Some are tool-level, such as calling an API with the wrong scope or writing to the wrong record. Some are product-level, such as granting the agent access to a capability that should never have been available in that workflow. RAMPART mainly pressures the runtime behavior. Clarity pressures the design assumptions that decide what runtime behavior is even possible.
The post also captures a subtle evaluation lesson. For ordinary software, a unit test can often assert a specific value after one execution. For an agent, a useful safety test may need repeated trials, behavioral thresholds, and composable evaluators over tool calls and side effects. A single transcript is weak evidence. A distribution of outcomes under adversarial scenarios is closer to the operational question teams actually care about: whether the system remains inside bounds often enough, under enough realistic variation, to justify deployment.
There is a practical ownership shift as well. Microsoft explicitly places these tools in the hands of product engineers. That is not a replacement for dedicated red teaming, but it changes the handoff. Instead of a security team finding an issue and hoping the lesson persists, the product team can encode the scenario as a test in the same pull request that adds a new tool, data source, or mitigation. The result is a tighter loop between threat modeling, implementation, and regression coverage.
Takeaway
RAMPART and Clarity are a useful signal of where agent engineering is heading. The hard part is not only getting a model to act; it is making the action loop inspectable, testable, and governed by design intent that the team can keep current.
For teams building AI agents, the lesson is to treat safety as part of the product’s engineering substrate. Agent access, tool permissions, side effects, human approval points, and adversarial content paths should become explicit design artifacts and executable tests. The tools will only be as good as the scenarios, adapters, and evaluators teams write, but the direction is right: agent safety needs to live in the same lifecycle as code.