Two More Apps, and a Policy Gate


Last time I made the case that generating example apps isn’t an eval - it’s a way to exercise kodr until something breaks. Phases 21 and 23 are exactly that, twice more, and both broke it in instructive ways. Phase 22 is the odd one out: a permission policy gate that follows straight on from the hooks layer.

App two: a Markdown blog (phase 21)

The second example is a Markdown blog generator - frontmatter, escaping, sorting, static HTML out. A clear step up from the todo CLI.

The local model failed to generate it. Twice. The first dry run died with "fetch failed" and left nothing useful behind - no context, no prompt, no run evidence at all. That’s the bug. A model run that falls over should still leave a paper trail, otherwise you can’t tell a flaky network from a bad prompt from a model that simply gave up.

So kodr now writes failure artifacts even when the model run itself fails: context.md, prompt.md, summary.json, an error.json, and empty response/write/test files so the run directory has a consistent shape. The retry failed too - but this time it failed legibly, with a full artifact directory I could go read. I finished the blog generator by hand so the repo still gets the fixture, but the win was the closed gap, not the app.
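Here is a minimal sketch of that shape, not kodr's actual code. The function name, the summary fields, and the names of the three empty placeholder files are assumptions for illustration; only context.md, prompt.md, summary.json and error.json come from the description above.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical helper: called from the catch block around the model request,
// so a failed run still produces a directory with the usual shape.
function writeFailureArtifacts(runDir, { context, prompt, error }) {
  fs.mkdirSync(runDir, { recursive: true });

  // The inputs that were about to be sent, so the failure can be reproduced.
  fs.writeFileSync(path.join(runDir, "context.md"), context);
  fs.writeFileSync(path.join(runDir, "prompt.md"), prompt);

  // What went wrong, in a machine-readable form.
  fs.writeFileSync(
    path.join(runDir, "error.json"),
    JSON.stringify({ message: error.message, cause: String(error.cause ?? "") }, null, 2)
  );
  fs.writeFileSync(
    path.join(runDir, "summary.json"),
    JSON.stringify({ status: "model_failed" }, null, 2) // assumed field and value
  );

  // Empty placeholders (file names assumed) keep the directory layout
  // identical to a successful run, so tooling never has to special-case it.
  for (const name of ["response.md", "writes.json", "tests.json"]) {
    fs.writeFileSync(path.join(runDir, name), "");
  }
}
```

The point of the empty files is that anything reading a run directory can assume every file exists and simply check whether it has content.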

App three: a notes API (phase 23)

The third example is a small HTTP notes API - routing, JSON parsing, validation, status codes, persistence, real integration tests. More moving parts again.

This time the model produced a valid proposal. It just wrote only package.json. And kodr reported the run as verified, because the verification command was node --test, and node --test against a project with zero tests finds nothing, runs nothing, and exits 0. Green on an empty suite. That’s the worst kind of pass: confident and meaningless.

The verification runner now treats a Node test run that reports "tests 0" as a failure, while leaving plain node --check behaviour alone. An example with no tests is no longer allowed to claim it passed. The finished notes API lives under examples/notes-api with actual integration tests that make real HTTP requests.
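The check can be sketched roughly like this. It is an illustration, not kodr's runner: the function names and result shape are made up, and the summary-line formats ("# tests N" for the TAP reporter, "ℹ tests N" for the spec reporter) are my understanding of Node's test runner output, which may differ between Node versions.

```javascript
// Hypothetical: pull the test count out of a node --test summary line.
// Returns null when no summary line is found.
function countReportedTests(output) {
  // Accepts "# tests 3" (TAP) or "\u2139 tests 3" (spec reporter).
  const match = /^(?:#|\u2139)\s+tests\s+(\d+)\s*$/m.exec(output);
  return match ? Number(match[1]) : null;
}

// Hypothetical: decide whether a finished verification command counts as a pass.
function classifyVerification({ command, exitCode, output }) {
  if (exitCode !== 0) {
    return { ok: false, reason: `exit code ${exitCode}` };
  }
  // Only node --test gets the empty-suite rule; node --check and
  // anything else is still judged on its exit code alone.
  const isNodeTest = /^node\s+--test(\s|$)/.test(command);
  if (isNodeTest && countReportedTests(output) === 0) {
    return { ok: false, reason: "node --test reported 0 tests" };
  }
  return { ok: true, reason: "passed" };
}
```

In this sketch a missing summary line is left as a pass, so an unrecognised reporter format does not turn every run red; a stricter runner might treat that case as a failure too.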

Two apps, two harness bugs that only surfaced because I tried to build something real. That’s the whole reason these example trials exist.

The policy gate (phase 22)

Phase 22 comes from a different angle. Hooks gave kodr a deterministic lifecycle boundary - code that runs around every tool call. Permission policy is the first real consumer of that boundary: a policy object that makes allow/deny decisions before tool effects happen.

The important part is that it changes none of the defaults. File writes are still jailed by the safe-write path logic. Verification commands still go through the allowlist. Public fetches still get private-address blocking. Writes still dry-run unless you explicitly apply. The policy gate only lets you narrow that - deny reads, writes, apply, commands, network, or specific hosts - and it sits in front of the existing hardening, not in place of it.
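To make the "narrow only, in front of the hardening" idea concrete, here is a small sketch. Everything in it is hypothetical: the policy field names, the action shape, and the runTool wrapper are invented for illustration and are not kodr's ToolRunner API.

```javascript
// Hypothetical policy shape. Every field can only take permissions away;
// the empty policy denies nothing, so defaults are unchanged.
const defaultPolicy = Object.freeze({
  denyRead: false,
  denyWrite: false,
  denyApply: false,
  denyCommands: false,
  denyNetwork: false,
  denyHosts: [],
});

const allow = () => ({ allowed: true });
const deny = (reason) => ({ allowed: false, reason });

// Pure decision function: no side effects, so it is easy to test.
function decide(policy, action) {
  const p = { ...defaultPolicy, ...policy };
  switch (action.kind) {
    case "read":
      return p.denyRead ? deny("reads are denied") : allow();
    case "write":
      return p.denyWrite ? deny("writes are denied") : allow();
    case "apply":
      return p.denyApply ? deny("apply is denied") : allow();
    case "command":
      return p.denyCommands ? deny("commands are denied") : allow();
    case "fetch": {
      if (p.denyNetwork) return deny("network is denied");
      const host = new URL(action.url).hostname;
      return p.denyHosts.includes(host) ? deny(`host ${host} is denied`) : allow();
    }
    default:
      // Unknown kinds pass through; the existing hardening still judges them.
      return allow();
  }
}

// The gate runs first. An allow is not a grant: the existing checks
// (path jail, command allowlist, private-address blocking) still run after it.
function runTool(policy, action, existingChecks, effect) {
  const decision = decide(policy, action);
  if (!decision.allowed) return { ok: false, reason: decision.reason };
  const hardening = existingChecks(action);
  if (!hardening.allowed) return { ok: false, reason: hardening.reason };
  return { ok: true, value: effect(action) };
}
```

Because the policy can only return a deny earlier than the existing checks would, removing it entirely leaves behaviour exactly where it was.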

It lands in ToolRunner, the same chokepoint everything else converges on. There’s no CLI or project-file configuration for it yet; that’s later. For now the contract exists and is tested, which is the order I like: build the boring, certain layer first, expose the knobs once the shape is proven.

That’s the through-line across all three phases, really. Let the model try to build things and watch where the harness lets you down. Then make the harness deterministically harder to fool.
