Most of kodr’s phases are me deciding to add a thing. This batch is mostly the opposite: a single example app that would not generate cleanly, and every time I tried to force it, the harness coughed up another missing piece. Six phases, five of them downstream of one annoying CSV file. The sixth, MCP, is the one bit of deliberate roadmap in here, so let’s get it out of the way first.
The MCP seam (phase 24)
Phase 24 adds the first MCP-shaped extension point - without taking on process transport yet. Kodr can now register providers, discover their tools, and call them by stable encoded names like mcp:provider:tool. A provider is a deliberately tiny object with listTools() and callTool(), which means I can test the whole contract against a fake provider before any real server lifecycle exists.
ToolRunner grows a list_mcp_tools and learns to dispatch mcp:* names, while local built-in tools stay exactly as they were. Crucially the hooks still wrap the call, so policy, logging, and blocking apply to provider-backed tools through the same lifecycle as everything else.
Worth being blunt about how little this is. There’s no transport and no auth here at all. Nothing speaks the MCP wire protocol, nothing spawns a server, nothing does a JSON-RPC handshake, and there’s certainly no OAuth-secured remote provider. A “provider” is just an in-process object with the right two methods. What I’ve built is the shape of MCP - the mcp:provider:tool naming and the call contract - wired into the harness so the real thing has somewhere to land.
I did start looking at a proper MCP implementation as a whole, and it’s actually quite a beast, hard to get right in itself. So this was a deliberately small starting slice. I’m also considering leaning on the LM Studio Remote MCP setup to scoot around a lot of this complexity - let something that already speaks the protocol handle servers and auth, rather than me reimplementing it. I’ll circle back to that later. The hard part - starting configured servers, speaking the protocol, handling lifecycle failures, deciding which provider tools policy allows - is all still ahead. This phase just establishes the local seam so the rest has somewhere to plug in.
Then the CSV example happened (phase 25)
The fourth example app is a CSV expense analyzer: parse quoted, escaped delimited text, validate records, aggregate by month and category, print a report. Kodr tried to generate it one-shot and the LM Studio request died after several minutes with fetch failed. Same failure class I’d seen before - except this time, instead of finishing the app by hand and moving on, I let it expose something deeper.
Provenance, and a streaming fix (phase 26)
Here’s the process bug: when the first generation fails and I quietly hand-complete the fixture, the example stops being an honest kodr sample. It looks like the harness produced it. It didn’t. That’s a lie baked into the repo.
So phase 26 introduces provenance. An example now has to carry the receipts - the prompts, run artifacts, verification results, and follow-up slices that actually produced or repaired it. A failed one-shot is not the end of the loop; it either improves the harness or gets split into smaller kodr runs you can inspect.
Forcing that honesty immediately paid out a harness fix: streaming. The CSV parser slice kept failing over the normal request path, then went green the moment I added --stream. That turned a hidden manual replacement into a real kodr-produced change. The CLI also stopped lying in a smaller way - it now prints Run failed when a summary is ok: false, instead of cheerfully reporting Run ok.
The same slices surfaced the next problem, though: full-file rewrites are a terrible unit for tiny fixes. One slice fixed a diagnostic assertion but regressed the escaped-quote fixture. Another fixed syntax and regressed the fixture again. The model kept improving one thing while breaking another, because a whole-file rewrite gives it that much room to wander.
Patch-oriented repairs (phase 27)
That pointed straight at a narrower proposal format:
{
"patches": [{ "path": "src/file.mjs", "search": "exact current text", "replace": "replacement text" }]
}
A patch applies only when its search text matches exactly once in the current file. Stale or ambiguous edits fail before they touch the workspace, and patches go through the same path jail and permission checks as full writes. Failed patches still leave artifacts to read.
The first live patch repair found a wonderfully practical wrinkle: model output sometimes arrives with double-escaped newlines. So kodr has a conservative fallback that normalizes those escapes only when the original matches zero times and the normalized version matches exactly once. Narrow, deterministic, no guessing.
The regeneration (phase 28)
With patches in hand, phase 28 is the do-over: rebuild the CSV example as a clean kodr sample instead of a hand-stabilized fixture, keeping every attempt in provenance. It was not instant. The first patch prompt asked for a source change and a test change together and got back one valid patch plus one malformed one - now recorded as invalid-proposal artifacts rather than silently dropped. The next prompt drifted whitespace in the search text, which bought another conservative fix: a whitespace-tolerant fallback that only fires when a same-line-window match is unique.
After those, three streamed patch runs landed cleanly - parser validation, node:test coverage for it, and a README plus sample-data refresh. The example teaches the same thing it always did, but now its changes are backed by runs you can inspect, and its failures explain why the harness got stricter.
Memory and scratchpad (phase 29)
The redo left one thing obviously missing: the model had nowhere durable to keep intent. What already failed, what to retry, which preferences should persist - all of it evaporated between runs. Phase 29 adds three explicit scopes:
- Project memory -
KODR_MEMORY.md, committed with the repo, loaded as untrusted project guidance. - Private user memory -
.kodr/memory/user.md, local-only, because.kodris ignored by context walking and shouldn’t be committed by default. - Run scratchpad -
scratchpad.mdin each run’s artifact directory, filled from an optionalscratchpadstring in the model’s proposal.
The split is the point. Scratchpad notes are recorded as artifacts, never applied to the workspace - transient repair thinking that doesn’t leak into repo state. Project memory is visible and reviewable. User memory stays out of commits. Persistence becomes a decision instead of an accident.
And that’s the arc: one CSV file I couldn’t generate cleanly is why kodr now has provenance discipline, streaming, patch-oriented repairs, and memory scopes. The example was never the deliverable. The pressure it put on the harness was.
Links:
- Phase docs: 24-mcp-client, 25-csv-expense-example, 26-example-provenance, 27-patch-oriented-repairs, 28-csv-expense-regeneration, 29-memory-and-scratchpad
- The MCP client: src/mcp-client.mjs
- Streaming completions: src/model-client.mjs
- Patch application: src/safe-writes.mjs
- Memory scopes: src/memory.mjs