Three Sandboxes, Escalating Paranoia


A coding agent runs code it just generated. That’s the whole job, and it’s also the scary part. Across three phases kodr grew three execution boundaries, each containing a bit more than the last. I’m grouping them because they answer one question - “where does the untrusted stuff run?” - at escalating levels of paranoia.

Phase 60: OpenShell, upload-and-execute

The first sandbox treats OpenShell as an upload-and-execute backend, not a Docker bind-mount replacement. Kodr creates one persistent sandbox per run, uploads a filtered workspace snapshot, and routes dependency installs, verification, run_command, and command hooks through openshell sandbox exec. The model call, proposal validation, safe writes, and artifacts stay host-side.

The security shape is the interesting part, because it refuses to be loose. Kodr capability-probes sandbox create/exec/upload/delete, requires a running loopback gateway, and rejects remote gateways, since they’d receive your workspace files. The upload snapshot excludes .git, .kodr, node_modules, secret files, and credential locations like .npmrc, .pypirc, .netrc, and .cargo. Symlinks resolving outside the workspace are rejected. Without an explicit policy, kodr writes a default-deny network policy. The probing also found real bugs in alpha software: openshell 0.0.20 didn’t expose the documented sandbox exec, so kodr failed before the model call with an actionable error rather than silently falling back to host execution. The lesson: version checks are insufficient; probe the command surface directly. A later integration run denied an outbound https://example.com request under the generated default-deny policy (HTTP 403), which is exactly the proof you want.
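To make the loopback and snapshot rules concrete, here is a minimal Python sketch. It is not kodr's code (kodr runs on Node), and the names is_loopback_gateway, snapshot_files, and EXCLUDED_NAMES are hypothetical. It assumes exclusions match on exact file or directory name at any depth, and that an escaping symlink aborts the snapshot rather than being skipped. The post doesn't say which "secret files" are excluded, so that rule is left out.

```python
import ipaddress
import os
from pathlib import Path
from urllib.parse import urlsplit

# Names taken from the list in the post. Matching by exact name at any
# depth is an assumption of this sketch, not confirmed kodr behavior.
EXCLUDED_NAMES = {
    ".git", ".kodr", "node_modules",
    ".npmrc", ".pypirc", ".netrc", ".cargo",
}


def is_loopback_gateway(url: str) -> bool:
    """Return True only when the gateway host is localhost or a loopback IP."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Any other DNS name is treated as remote.
        return False


def _reject_escaping_symlink(path: Path, root: Path) -> None:
    """Raise if path is a symlink whose target lies outside root."""
    if path.is_symlink() and not path.resolve().is_relative_to(root):
        raise ValueError(f"symlink escapes workspace: {path}")


def snapshot_files(root: str) -> list[str]:
    """List workspace-relative files that pass the exclusion rules."""
    root_path = Path(root).resolve()
    selected = []
    for dirpath, dirnames, filenames in os.walk(root_path):
        # Prune in place so os.walk never descends into excluded directories.
        dirnames[:] = [d for d in dirnames if d not in EXCLUDED_NAMES]
        for name in dirnames:
            _reject_escaping_symlink(Path(dirpath) / name, root_path)
        for name in filenames:
            if name in EXCLUDED_NAMES:
                continue
            path = Path(dirpath) / name
            _reject_escaping_symlink(path, root_path)
            selected.append(path.relative_to(root_path).as_posix())
    return sorted(selected)
```

Failing loudly on an escaping symlink, instead of quietly dropping it, matches the post's general stance: an error before upload is cheaper than a surprise after it.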

Phase 76: Docker, the familiar one

Phase 76 adds an opt-in Docker boundary, keeping model calls and safe writes on the host for the first pass while routing installs, verification, and command tools through a container:

kodr run -p "task" --yes --docker-sandbox --test "npm test"

It uses node:24-bookworm-slim, mounts the workspace at /workspace, and runs commands without a shell. Verification defaults to --network none; installs default to bridge, because npm needs the registry. Every run records docker.json (image, network mode, mount, command metadata), and --docker-keep leaves containers around so you can poke at the exact environment after a failure. The example run earned its keep by exposing a non-Docker bug: Nemotron emitted duplicate top-level files keys, plain JSON.parse kept the last (empty) one, and kodr happily ran the old tests and called it OK - a false positive. So extraction now rejects duplicate top-level JSON keys. The retry caught a real syntax error in the generated code under --network none, healed it, and passed.
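The duplicate-key trap is easy to reproduce outside JavaScript. Python's json.loads has the same last-one-wins behavior as JSON.parse, so here is a small Python sketch of the idea rather than kodr's actual extraction code. The names parse_strict and DuplicateKeyError are hypothetical, and this version is stricter than what the post describes: it rejects duplicates at any depth, not only at the top level.

```python
import json


class DuplicateKeyError(ValueError):
    """Raised when a JSON object repeats a key."""


def _reject_duplicates(pairs):
    # json.loads passes each object's (key, value) pairs in document order,
    # so a repeated key is visible here before it gets overwritten.
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise DuplicateKeyError(f"duplicate JSON key: {key!r}")
        seen[key] = value
    return seen


def parse_strict(text: str):
    """Parse JSON, failing on repeated keys instead of keeping the last one."""
    return json.loads(text, object_pairs_hook=_reject_duplicates)


if __name__ == "__main__":
    payload = '{"files": [{"path": "a.js"}], "files": []}'
    # The default parser silently keeps the last (empty) value.
    print(json.loads(payload))  # {'files': []}
    try:
        parse_strict(payload)
    except DuplicateKeyError as err:
        print(f"rejected: {err}")
```

The hook runs for every object in the document, which is why the sketch ends up checking nested objects as well.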

Phase 88: OpenShell worker mode, all the way in

The first two keep the harness on the host and only push effects into the sandbox. For riskier features that’s not the shape you want. Phase 88 adds --openshell-worker: the host kodr process becomes a launcher. It creates the sandbox, uploads the workspace to /sandbox and the kodr runtime to /kodr, then executes a nested node /kodr/bin/kodr.mjs run ... inside the sandbox, downloading only the nested worker-run artifacts.

That boundary is the point. A bad tool call, an accidental write, a prompt-injection-driven command - all of it now happens against the sandbox copy of the workspace. The host checkout isn’t overwritten by arbitrary sandbox state; reviewed writeback is a separate future step. It’s deliberately kept distinct from --openshell-sandbox (effects-only stays useful for quick verification), and it’s honest about what it hasn’t solved: provider secrets. The nested worker gets model and base-URL flags but no API keys yet - the next step is a host-owned relay so the sandbox talks to one narrow local endpoint while keys stay home. Worker mode is the containment baseline I’d want before turning on skill code execution.
