OpenAI, 2026-05-13: Building a Safe, Effective Sandbox to Enable Codex on Windows (Summary)

Generated by Codex with GPT-5

What happened

OpenAI’s official engineering blog published “Building a safe, effective sandbox to enable Codex on Windows,” a post about the operating-system engineering needed to make local coding agents useful on Windows without giving them unchecked access to a developer machine.

The problem is specific to agentic coding tools. Codex runs on a user’s laptop through the CLI, IDE extension, or desktop app, while the model itself runs in the cloud. The local harness can ask the operating system to run shell commands, read files, write files, run tests, invoke build tools, install dependencies, or create Git branches. By default, those commands inherit the real user’s permissions. That is powerful enough to be useful and dangerous enough to need an OS-enforced boundary.

OpenAI’s desired default policy is familiar from the Codex experience on other platforms: allow broad reads, allow writes only inside the workspace and configured writable roots, and block internet access unless the user opts in. On macOS, Seatbelt can enforce this shape. On Linux, tools such as seccomp and bubblewrap can do much of the job. Windows did not provide a single built-in primitive that matched “let an open-ended coding agent behave like a developer, but only inside these boundaries.”
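The shape of that default policy can be sketched in a few lines of Python. This is an illustrative model only: the SandboxPolicy class and its field names are hypothetical and do not come from the post, and the real enforcement is done by the operating system, not by path comparison.

```python
from dataclasses import dataclass
from pathlib import PureWindowsPath


@dataclass(frozen=True)
class SandboxPolicy:
    """Hypothetical model of the default policy; not Codex's actual types."""
    workspace: PureWindowsPath
    writable_roots: tuple = ()
    network_allowed: bool = False  # blocked unless the user opts in

    def can_read(self, path: str) -> bool:
        # Reads are broadly allowed under the default policy.
        return True

    def can_write(self, path: str) -> bool:
        # Writes are allowed only under the workspace or a configured root.
        target = PureWindowsPath(path)
        roots = (self.workspace, *self.writable_roots)
        return any(target.is_relative_to(root) for root in roots)


policy = SandboxPolicy(
    workspace=PureWindowsPath(r"C:\src\app"),
    writable_roots=(PureWindowsPath(r"C:\tmp\build"),),
)
print(policy.can_write(r"C:\src\app\main.py"))
print(policy.can_write(r"C:\Users\me\notes.txt"))
```

Pure path comparison does not resolve ".." segments or links, which is one reason a policy like this has to be enforced by the OS against the real object rather than checked as strings in the harness.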

The post is useful because it walks through the failed fits before describing the final architecture. AppContainer offered a real isolation boundary, but it assumes an app whose capabilities are known and declared ahead of time, which an open-ended agent like Codex cannot provide. Windows Sandbox provided strong VM-style isolation, but Codex needs to operate on the user’s real checkout, tools, environment, and terminal workflow rather than a disposable desktop with host-guest bridging. Mandatory Integrity Control looked promising because low-integrity processes cannot normally write to medium-integrity objects, but relabeling the real workspace as low integrity would change the trust semantics of the user’s actual files for every low-integrity process, not just Codex.

OpenAI’s first prototype was an unelevated sandbox built from synthetic SIDs, ACLs, and write-restricted tokens. The sandbox setup created a synthetic sandbox-write SID, granted it write, execute, and delete access to the current working directory and configured writable roots, and explicitly denied it write access to sensitive read-only locations inside the workspace such as .git, .codex, and .agents. Codex then launched commands under a write-restricted token whose restricted SID list included Everyone, the current logon session SID, and sandbox-write.
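The ACL setup described above can be pictured as a plan of allow and deny entries. The sketch below only builds that plan as data and applies nothing. The plan_aces function, the tuple layout, and the "sandbox-write" string standing in for the synthetic SID are assumptions made for illustration; the rights and protected directory names are the ones the post mentions.

```python
from pathlib import PureWindowsPath

SANDBOX_WRITE = "sandbox-write"  # stand-in label for the synthetic SID
PROTECTED_NAMES = (".git", ".codex", ".agents")  # named in the post


def plan_aces(cwd, writable_roots=()):
    """Return (path, sid, effect, rights) tuples; nothing is applied."""
    aces = []
    # Allow entries for the working directory and each writable root.
    for root in (cwd, *writable_roots):
        aces.append((PureWindowsPath(root), SANDBOX_WRITE, "allow",
                     ("write", "execute", "delete")))
    # Explicit deny entries for sensitive locations inside the workspace.
    for name in PROTECTED_NAMES:
        aces.append((PureWindowsPath(cwd) / name, SANDBOX_WRITE, "deny",
                     ("write",)))
    return aces


aces = plan_aces(r"C:\src\app", [r"C:\tmp\build"])
for ace in aces:
    print(ace)
```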

That design gave Windows two checks for every write. The normal user identity still had to be allowed to perform the operation, and at least one restricted SID on the token also had to be allowed. In practice, this let the sandbox express “the real user can read broadly, but this process tree can only modify explicitly writable locations.” It was granular, did not require admin setup, and preserved enough compatibility for normal developer tools.
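The two-check behavior can be modeled with a simplified access check. This is a toy model rather than the Windows algorithm: real DACLs are evaluated in order with access masks, while this sketch assumes any matching deny wins. The function names, the "alice" user, and the example ACLs are hypothetical.

```python
def sids_grant(acl, sids, right):
    """Simplified DACL walk: a matching deny wins, otherwise need an allow."""
    allowed = False
    for sid, effect, rights in acl:
        if sid in sids and right in rights:
            if effect == "deny":
                return False
            allowed = True
    return allowed


def access_allowed(acl, user_sids, restricted_sids, right):
    # First check: the token's normal identity must be granted the right.
    if not sids_grant(acl, user_sids, right):
        return False
    # Second check applies only to writes on a write-restricted token:
    # at least one restricting SID must also be granted the right.
    if right == "write":
        return sids_grant(acl, restricted_sids, right)
    return True


USER = {"alice", "Everyone"}
RESTRICTED = {"Everyone", "logon-session", "sandbox-write"}

workspace_acl = [("alice", "allow", {"read", "write"}),
                 ("sandbox-write", "allow", {"write"})]
git_dir_acl = [("sandbox-write", "deny", {"write"})] + workspace_acl
documents_acl = [("alice", "allow", {"read", "write"})]

print(access_allowed(workspace_acl, USER, RESTRICTED, "write"))
print(access_allowed(git_dir_acl, USER, RESTRICTED, "write"))
print(access_allowed(documents_acl, USER, RESTRICTED, "write"))
print(access_allowed(documents_acl, USER, RESTRICTED, "read"))
```

In this model the user's own documents stay readable but not writable, because no restricting SID on the token has been granted write access there.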

The weak point was network access. Without elevation, OpenAI could not rely on Windows Firewall to block outbound traffic for the sandboxed process tree. The prototype instead tried to fail closed for common developer tools by setting proxy variables to a dead local endpoint, forcing Git HTTP traffic through the same dead endpoint, disabling Git-over-SSH with GIT_SSH_COMMAND, and prepending a small denybin directory so stub SSH and SCP commands resolved before the real binaries. That stopped a lot of ordinary tool-driven network traffic, but it was advisory. Any process that ignored proxy variables, bypassed PATH, or opened sockets directly could escape it.
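A rough sketch of that advisory environment is shown below. The environment variable names are standard conventions (the proxy variables, GIT_SSH_COMMAND, and Git's GIT_CONFIG_COUNT mechanism), but every value is an assumption: the post does not give the dead endpoint address, the denybin location, or what GIT_SSH_COMMAND was set to, and it does not say how Git's HTTP traffic was redirected.

```python
DEAD_PROXY = "http://127.0.0.1:9"  # assumed dead local endpoint
DENYBIN = r"C:\codex\denybin"      # hypothetical stub directory
SSH_STUB = "false"                 # assumed: any command that exits nonzero


def advisory_offline_env(base_env, denybin=DENYBIN, sep=";"):
    """Return a copy of base_env with advisory network blocks applied."""
    env = dict(base_env)
    # Proxy variables that many HTTP clients honor by convention.
    for name in ("HTTP_PROXY", "HTTPS_PROXY", "ALL_PROXY"):
        env[name] = DEAD_PROXY
    # Point Git's HTTP transport at the same dead endpoint.
    env["GIT_CONFIG_COUNT"] = "1"
    env["GIT_CONFIG_KEY_0"] = "http.proxy"
    env["GIT_CONFIG_VALUE_0"] = DEAD_PROXY
    # Make Git-over-SSH fail instead of connecting.
    env["GIT_SSH_COMMAND"] = SSH_STUB
    # Resolve stub ssh/scp commands before the real binaries.
    env["PATH"] = denybin + sep + env.get("PATH", "")
    return env


base = {"PATH": r"C:\Windows\System32"}
env = advisory_offline_env(base)
print(env["PATH"])
```

Nothing here stops a program that opens a socket directly, which is the gap the paragraph above describes.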

The production design therefore accepts an elevated setup step. The current Windows sandbox still runs child commands under a write-restricted token, but the principal is no longer the real Windows user. Codex creates two local users: CodexSandboxOffline, which is targeted by firewall rules, and CodexSandboxOnline, which is used when network access is allowed. That change is the key move. Windows Firewall can express user-scoped outbound blocking for a real local user, so the network policy can now apply to all descendant commands the agent spawns, including Git, Python, package managers, and build tools.
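The effect of tying network mode to a principal can be shown with a very small model. The two account names come from the post; the functions and the idea of representing firewall rules as a set of blocked users are simplifications assumed for illustration.

```python
OFFLINE_USER = "CodexSandboxOffline"  # targeted by outbound block rules
ONLINE_USER = "CodexSandboxOnline"    # used when network access is allowed

# Simplified stand-in for user-scoped outbound firewall rules.
BLOCKED_USERS = frozenset({OFFLINE_USER})


def sandbox_principal(network_allowed: bool) -> str:
    """Pick the account a command should run as for this network mode."""
    return ONLINE_USER if network_allowed else OFFLINE_USER


def outbound_blocked(principal: str) -> bool:
    # The rule keys on who runs the process, not on which program it is,
    # so it covers every descendant that runs as the same account.
    return principal in BLOCKED_USERS


for program in ("git.exe", "python.exe", "npm.cmd"):
    principal = sandbox_principal(network_allowed=False)
    print(program, principal, outbound_blocked(principal))
```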

That choice creates setup complexity. A dedicated codex-windows-sandbox-setup.exe binary creates the synthetic SID, creates the online and offline sandbox users, stores their credentials locally encrypted with Windows DPAPI, and creates or validates firewall rules for the offline user. Because those sandbox users are not the real user, Codex also has to keep reads broadly usable. Many directories grant read access to “Authenticated Users,” but profile directories and other developer-relevant paths may not. OpenAI handles this with best-effort read ACL grants for common locations such as C:\Users\<real-user>, C:\Windows\, C:\Program Files\, C:\Program Files (x86)\, and C:\ProgramData\, with expensive ACL work run asynchronously so setup does not block longer than necessary.
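One way to picture "best-effort and asynchronous" is a background pool that records failures instead of aborting setup. This is a sketch under assumptions: grant_read is a hypothetical callback that would apply a read ACL, the fake implementation below only simulates a failure, and the post does not say which concurrency mechanism the setup binary uses.

```python
from concurrent.futures import ThreadPoolExecutor

# Locations listed in the post; <real-user> is left as the post's placeholder.
READ_ROOTS = [
    r"C:\Users\<real-user>",
    r"C:\Windows",
    r"C:\Program Files",
    r"C:\Program Files (x86)",
    r"C:\ProgramData",
]


def schedule_read_grants(paths, grant_read, executor):
    """Submit grant_read(path) for each path without waiting for results."""
    return {path: executor.submit(grant_read, path) for path in paths}


def collect_failures(futures):
    """Wait for the grants and report failures without raising."""
    failures = {}
    for path, future in futures.items():
        error = future.exception()  # blocks until that grant finishes
        if error is not None:
            failures[path] = error
    return failures


def fake_grant_read(path):
    # Hypothetical stand-in for the real ACL call; simulates one failure.
    if path == r"C:\ProgramData":
        raise PermissionError(path)


with ThreadPoolExecutor(max_workers=2) as pool:
    futures = schedule_read_grants(READ_ROOTS, fake_grant_read, pool)
    failures = collect_failures(futures)
print(sorted(failures))
```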

The final process-launch path is another important implementation detail. The natural flow would have been for codex.exe, running as the real user, to log on as the sandbox user, create a restricted token, and call CreateProcessAsUserW for the final command. That ran into a Windows privilege boundary: the real-user process could create the restricted token but could not reliably launch the child with it. OpenAI solved this by adding codex-command-runner.exe. The main codex.exe process launches the runner as the sandbox user with CreateProcessWithLogonW; the runner then opens its own sandbox-user token, extracts the sandbox logon SID, creates the restricted token, and starts the real child process with CreateProcessAsUserW.
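The staged launch is easier to follow when written out as an ordered plan. The sketch below makes no Windows API calls; it only records which process performs each step and under which identity. The LaunchStep type and plan_launch function are hypothetical, while the binary names and Win32 function names are the ones cited in the post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LaunchStep:
    actor: str     # process performing the step
    identity: str  # account that process runs as
    action: str    # what it does, naming the Win32 call where the post does


def plan_launch(command, sandbox_user="CodexSandboxOffline"):
    """Describe the launch chain for one command; nothing is executed."""
    runner = "codex-command-runner.exe"
    return [
        LaunchStep("codex.exe", "real user",
                   f"CreateProcessWithLogonW -> {runner}"),
        LaunchStep(runner, sandbox_user, "open its own sandbox-user token"),
        LaunchStep(runner, sandbox_user, "extract the sandbox logon SID"),
        LaunchStep(runner, sandbox_user, "create the restricted token"),
        LaunchStep(runner, sandbox_user,
                   f"CreateProcessAsUserW -> {command}"),
    ]


steps = plan_launch("git status")
for step in steps:
    print(f"{step.actor} ({step.identity}): {step.action}")
```

The point of the extra hop is visible in the identity column: only the first step runs as the real user, and every token operation happens inside a process that already runs as the sandbox user.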

The resulting architecture has four layers: the normal unelevated codex.exe harness, the elevated setup binary, the sandbox-user command runner, and the final child command. It is more complex than the first prototype, but each layer exists to preserve a specific property: keep the main harness normal, put privileged setup behind an explicit boundary, make firewall rules attach to a real principal, and make file-write policy follow the process tree through restricted tokens.

Why it matters

The broader engineering lesson is that coding-agent security is not just application security with an AI label. A local coding agent is intentionally allowed to run arbitrary developer workflows. It is expected to invoke shells, compilers, test runners, language servers, package managers, scripts, and tools the product team cannot enumerate in advance. A sandbox that is too narrow breaks the product. A sandbox that is too permissive turns a model mistake or malicious dependency into a local compromise path.

OpenAI’s Windows design is a case study in aligning policy with the operating system’s actual enforcement model. The unelevated prototype had a good file-write story because write-restricted tokens and ACLs naturally expressed the workspace policy. It did not have a good network story because environment variables are conventions, not enforcement. The elevated design changes the identity model so that firewall policy can be attached to the sandboxed process tree through a real local user. That is the difference between asking tools to behave and making the OS deny the action.

The separate online and offline users are also a pragmatic product choice. Coding agents sometimes need the network to install dependencies, fetch documentation, or run remote tooling, but the default should not silently allow exfiltration. By representing network mode as a different sandbox principal, Codex can keep the offline case strongly blocked while still having a clear path for user-approved network access.

There is also a useful maintainability pattern in the split binaries. The setup binary owns UAC crossing, local-user creation, firewall rules, credential storage, and longer-running ACL work. The command runner owns the awkward Windows token boundary. The main Codex process stays closer to a normal cross-platform harness. That separation reduces the chance that Windows-only privileged machinery spreads into the rest of the product.

The post also shows why “safe by default” often means accepting some operational complexity. The final design needs local accounts, encrypted credentials, firewall validation, async read-permission work, and a specialized runner. Those are not incidental details; they are the cost of making the sandbox both enforced and useful. For a coding agent, a theoretically cleaner sandbox that cannot run real developer workflows will push users toward full-access mode, which is worse security in practice.

Takeaway

OpenAI’s Codex Windows sandbox is a concrete example of agent safety moving below prompts and product policy into OS-level systems work. The safe behavior comes from constrained process tokens, ACLs, dedicated sandbox principals, firewall rules, and a carefully staged launch path. The model is not trusted to behave safely because it has been instructed to do so; the commands it asks to run are placed inside an operating-system boundary.

For teams building local agents, the takeaway is to model the agent as an untrusted process tree that still needs to be productive. Reads, writes, network, credentials, and child processes all need explicit policy. Where the host OS does not expose exactly the right primitive, the system has to compose available mechanisms until the enforcement boundary matches the product boundary closely enough.

This is also a signal about where developer platforms are headed. General-purpose operating systems were not designed around autonomous coding agents acting on real workspaces. As those agents become ordinary development tools, first-class sandboxing primitives for open-ended local automation will matter as much as model quality. The useful agent is not the one that can run anything. It is the one that can do real work while the operating system keeps its blast radius legible.