#local-models

Jun 24, 2026

Install It, Plan It, Heal It

Phases 65, 71 and 72 of kodr: a controlled dependency install so generated apps can actually run, a plan-then-execute self-dev acceptance test (and the local tool-call bug it found), and a bounded self-healing repair loop that feeds real verification failures back to the model.

#ai #kodr #local-models #agents #automation #verification

Jun 23, 2026

Inspection as Tools, and a Plan You Can Read

Phases 62-64 of kodr: structural inspection handed to the model as two narrow tools, the same inspection exposed to the human in CLI and TUI, and a deterministic inspection plan that hands a small model the likely edit path instead of asking it to derive one.

#ai #kodr #local-models #agents #tools #context

Jun 22, 2026

Ranking and Budgeting What the Model Sees

Phases 59 and 61 of kodr: a deterministic ranked repo-map so the likeliest code lands first, and token-budget-aware packing so a small local model gets useful context without starving its own completion room.

#ai #kodr #local-models #agents #context

Jun 21, 2026

Scratchpads That Survive, Stages That Verify

Phases 57-58 of kodr: a planning scratchpad that carries between runs so plan-then-execute works on a small model, and staged execution that refuses to call one giant local-model dump a finished app until verification actually passes.

#ai #kodr #local-models #agents #memory #verification

Jun 20, 2026

kodr Edits kodr

Phases 54-56: three self-development trials where a local model edits the harness's own source. A scorecard of what broke, the harness fixes each failure forced, and the two-location bug that keeps catching small models.

#ai #kodr #local-models #agents #automation #testing

Jun 19, 2026

Inspection: kodr Learns to Read Code

Phases 51 and 53 of kodr: a zero-dependency structural code index across four languages, then an optional registry that lets real language servers enrich it - so the model reads before it writes, without the harness taking on a parser stack.

#ai #kodr #local-models #agents #context #tools

Jun 18, 2026

Contract Tests and a Web Channel Sketch

Phases 49-50 of kodr: pinning the channel boundary down with contract tests so CLI, TUI and web can never drift apart, then the smallest honest proof of a web UI - a local-only JSON server that is emphatically not a second harness.

#ai #kodr #local-models #agents #cli #testing

Jun 17, 2026

A Terminal UI, and Why It's Boring on Purpose

Phases 45-48 of kodr: a line-oriented TUI, an apply-review loop, a heartbeat for slow local calls, and Markdown session export - all in service of one idea, that the UI is just an adapter over a single shared request flow.

#ai #kodr #local-models #agents #cli #safety

Jun 16, 2026

Sessions: Write It Down, Pick It Up, Look Back

Phases 42-44 of kodr: a trilogy that turns one-shot runs into conversations - record the full transcript, resume it with --continue, then browse what you built. Plus the one-line bug that was quietly truncating every transcript.

#ai #kodr #local-models #agents #memory #context

Jun 15, 2026

Five Bugs, a Real Diff, and the Token Bill

Phases 39-41 of kodr: a review that found five plausible bugs hiding behind green tests (including an SSRF redirect bypass), a zero-dependency unified diff worth reading, and token usage finally shown where you look.

#ai #kodr #local-models #agents #security #cli

Jun 14, 2026

Evals, Scores, and Prompt Receipts

Phases 37-38 of kodr: a scored eval command so model regressions surface as a number instead of a squint, and prompt versioning so every run can be traced back to the exact prompt that produced it.

#ai #kodr #local-models #agents #verification #prompt-engineering

Jun 9, 2026

The CSV Example That Fought Back

Phases 24-29 of kodr: an MCP client seam, then one stubborn CSV expense example that refused to generate cleanly and dragged streaming, patch-oriented repairs, a regeneration, and memory scopes out of the harness on its way.

#ai #local-models #agents #mcp #memory #kodr

Jun 8, 2026

Two More Apps, and a Policy Gate

Phases 21-23 of kodr: generating a Markdown blog and a notes API as harness trials - each one shakes out a real bug in kodr itself - plus a permission policy gate that builds on the hooks layer.

#ai #local-models #agents #security #testing #kodr

Jun 3, 2026

A Deterministic Layer Around a Non-Deterministic Model

Phase 20 of kodr: pre_tool_use hooks - deterministic callbacks that can observe, mutate, or block a tool call before it runs, so policy lives in code instead of in a prompt the model might ignore.

#ai #local-models #agents #security #kodr

Jun 3, 2026

A Task List the Harness Can Read

Phase 19 of kodr: a small task-plan primitive so a run can say what it thinks is done, blocked, or still pending. Early days - there is a fair bit still to come here.

#ai #local-models #agents #automation #kodr

Jun 2, 2026

Autonomy, but on a Leash

Phase 14 of kodr: repeating a run for as long as it is useful, with a hard cycle count and explicit stop words so it never runs away.

#ai #local-models #agents #automation #kodr

Jun 2, 2026

Exercising the Harness

Phases 15-18 of kodr: a local install, replay and model comparison, a security-review hardening pass, and using a real generated app to shake out bugs.

#ai #local-models #agents #testing #kodr

Jun 2, 2026

One Repair, Then Stop

Phase 13 of kodr: when verification fails, let the model try to fix it exactly once - and no more.

#ai #local-models #agents #automation #kodr

Jun 1, 2026

Limits Before Tools

Phase 11 of kodr: giving the model real tools, but wrapping every one of them in a budget first.

#ai #local-models #agents #tools #kodr

Jun 1, 2026

Drawing the Workflow Before Hiring the Agents

Phase 12 of kodr: staged multi-agent coordination, modelled as plain deterministic data before a single extra model call is added.

#ai #local-models #agents #automation #kodr

May 31, 2026

The First Full Coding Loop

Phase 10 of kodr: the moment all the careful little modules connect into a real prompt-to-patch loop.

#ai #local-models #agents #automation #kodr

May 31, 2026

Running Checks Without Handing Over a Shell

Phase 09 of kodr: a verification runner that allowlists a handful of commands instead of giving the model a shell.

#ai #local-models #agents #safety #verification #kodr

May 30, 2026

Skills, but Just the Markdown

Phase 07 of kodr: SKILL.md files as reusable instructions, with the executable runtime deliberately left out.

#ai #local-models #agents #skills #kodr

May 30, 2026

Letting a Model Write Files Without Losing the Plot

Phase 08 of kodr: a path jail, dry-run diffs, and timestamped backups - the gate that sits between model output and your filesystem.

#ai #local-models #agents #safety #security #kodr

May 29, 2026

Context Is Just an Input

Phase 06 of kodr: build the workspace context for a prompt deterministically, then let yourself look at it before it ever hits the model.

#ai #local-models #agents #context #prompt-engineering #kodr

May 28, 2026

Defensive JSON Extraction

Local models love wrapping JSON in prose, fences, and broken escapes. Here is the little parser that survives them.

#ai #local-models #agents #parsing #kodr

May 27, 2026

Learning by Commit

A technique for learning complex things: get an agent to build it, then learn by watching it work.

#ai #local-models #learning #agents #kodr

May 27, 2026

Probing Local Models and Building the Test Rig

Phases 02-04 of the kodr learning repo: connecting to LM Studio, faking a model server for tests, and capturing prompt run artifacts.

#ai #local-models #learning #agents #lmstudio #testing #kodr
