Ranking and Budgeting What the Model Sees

Two phases of kodr are about the most underrated lever in agent work: not what the model is, but what you put in front of it. Once you have a structural index, the next questions are which chunks to include and how many.

Phase 59: a ranked repo-map

--inspect-context could find relevant chunks, but it treated them as a flat set. Phase 59 ranks them. The ranking is intentionally lexical: no semantic model, no new parser dependency. Each symbol's score combines query-match strength, lexical reference count, and a symbol-kind weight, with deterministic path/line/name tie-breakers. The result is a rankedSymbols list; existing symbols consumers keep working, but packing can now include the most relevant chunks first.
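To make the scoring concrete, here is a minimal sketch of a lexical ranker along those lines. It is not kodr's actual code: the weights, the reference-count cap, the symbol shape ({ name, kind, path, line, referenceCount }), and the helper names are all assumptions chosen for illustration. Only the four signals and the path/line/name tie-break order come from the description above.

```javascript
// Hypothetical weights: the post names the signals, not their values.
const KIND_WEIGHT = { function: 3, class: 3, method: 2, variable: 1 };
const MAX_REFERENCE_SCORE = 5; // assumed cap so popular symbols cannot drown out query matches

// Plain code-unit comparison; unlike localeCompare it does not depend on the host locale.
function compareText(a, b) {
  return a < b ? -1 : a > b ? 1 : 0;
}

// Exact name matches score higher than substring matches.
function queryMatchStrength(name, queryTerms) {
  const lower = name.toLowerCase();
  let score = 0;
  for (const term of queryTerms) {
    if (lower === term) score += 10;
    else if (lower.includes(term)) score += 4;
  }
  return score;
}

function rankSymbols(symbols, query) {
  const queryTerms = query.toLowerCase().split(/[^a-z0-9_]+/).filter(Boolean);
  return symbols
    .map((symbol) => ({
      ...symbol,
      score:
        queryMatchStrength(symbol.name, queryTerms) +
        Math.min(symbol.referenceCount ?? 0, MAX_REFERENCE_SCORE) +
        (KIND_WEIGHT[symbol.kind] ?? 0),
    }))
    .sort(
      (a, b) =>
        b.score - a.score ||              // higher score first
        compareText(a.path, b.path) ||    // then deterministic tie-breakers
        a.line - b.line ||
        compareText(a.name, b.name)
    );
}
```

Because every tie ends in a total order over path, line, and name, the same input yields the same ranking regardless of the order in which the symbols were discovered.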

This phase also wired the external inspector registry into run context, and a cycle review caught a sharp edge: external inspectors usually don’t return cached source lines, so wholesale-replacing a file’s built-in scan with external structure would erase the lexical reference counts the ranker needs. The merge now keeps the built-in content lines while accepting the external structure. The guiding constraint is predictability - the repo-map isn’t trying to be clever, it’s trying to be stable, cheap, and good enough to put likely target code before noise for a small model.
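The merge fix is easier to see in code. The sketch below assumes a per-file scan shaped like { path, symbols, lines } and a whole-word reference counter; both the shape and the function names are hypothetical, not taken from kodr. What it illustrates is the rule described above: take the structure from the external inspector, but never let go of the built-in content lines.

```javascript
// Assumed shapes: builtIn = { path, symbols, lines }, external = { symbols } (usually no lines).
function mergeInspection(builtIn, external) {
  if (!external) return builtIn;
  return {
    ...builtIn,
    // Accept the external inspector's structure when it provides one...
    symbols: external.symbols ?? builtIn.symbols,
    // ...but always keep the built-in content lines for lexical counting.
    lines: builtIn.lines,
  };
}

function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Lexical reference count: whole-word occurrences of the symbol name.
function countReferences(name, lines) {
  const pattern = new RegExp(`\\b${escapeRegExp(name)}\\b`, 'g');
  return lines.reduce((total, line) => total + (line.match(pattern)?.length ?? 0), 0);
}
```

With a wholesale replacement, lines would be missing after the merge and every reference count would fall to zero, which quietly flattens the ranking.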

Phase 61: budget the context like it costs something

Because it does. Kodr knew how to build helpful structural context, but it accepted the selected chunks as if every model could afford them. That’s exactly where small local models turn brittle: the context looks helpful while quietly stealing the completion room the model needs to actually finish. Phase 61 makes the context window an input to packing rather than a background assumption.

The packer now computes a deterministic budget: the active context window (from the model profile or --context-window), a completion reserve (profile or --completion-reserve), a rough four-chars-per-token conversion, and the existing hard character cap so legacy runs don’t suddenly balloon. Inspection chunks get selected in ranked order until the budget fills. If the very first chunk is too big, kodr truncates it rather than emitting nothing. Everything over budget is counted as dropped context and surfaced in the rendered summary, and run summaries carry the budget block - so a failed run can show whether the model was starved, over-packed, or just working with a small profile:

kodr run -p "Inspect this API" \
  --context-window 65536 \
  --completion-reserve 4096
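For the command above, the arithmetic works out to 65536 - 4096 = 61440 tokens of usable context, or roughly 245760 characters at four characters per token, before the hard character cap is applied. A minimal sketch of that budget and the ranked packing loop might look like the following. The function names, the chunk shape ({ text }), and the 48000-character cap in the usage lines are assumptions for illustration, as is the choice to stop at the first chunk that does not fit rather than skipping ahead to smaller ones.

```javascript
const CHARS_PER_TOKEN = 4; // the rough conversion described in the post

function computeBudget({ contextWindow, completionReserve, hardCharCap }) {
  const availableTokens = Math.max(0, contextWindow - completionReserve);
  const budgetChars = Math.min(availableTokens * CHARS_PER_TOKEN, hardCharCap);
  return { contextWindow, completionReserve, availableTokens, budgetChars };
}

// rankedChunks must already be in ranked order; each chunk is assumed to be { text, ... }.
function packChunks(rankedChunks, budget) {
  const packed = [];
  let usedChars = 0;
  let dropped = 0;
  let full = false;

  for (const chunk of rankedChunks) {
    const remaining = budget.budgetChars - usedChars;
    if (!full && chunk.text.length <= remaining) {
      packed.push(chunk);
      usedChars += chunk.text.length;
    } else if (!full && packed.length === 0 && remaining > 0) {
      // The very first chunk is too big: truncate it rather than emit nothing.
      packed.push({ ...chunk, text: chunk.text.slice(0, remaining), truncated: true });
      usedChars += remaining;
      full = true;
    } else {
      // Assumed policy: once one chunk does not fit, everything after it is dropped.
      full = true;
      dropped += 1;
    }
  }
  return { packed, usedChars, dropped };
}

// Usage with the flags from the command above and a hypothetical legacy cap.
const budget = computeBudget({ contextWindow: 65536, completionReserve: 4096, hardCharCap: 48000 });
const result = packChunks([{ text: 'a'.repeat(30000) }, { text: 'b'.repeat(30000) }], budget);
console.log(budget.budgetChars, result.packed.length, result.dropped);
```

Nothing in this loop consults a model or a clock, so the same ranked list and the same budget always pack the same way, which is what makes the dropped count meaningful when comparing runs.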

The design choice I care about most: budgeting is deterministic. There’s no model call deciding relevance and no hidden provider behaviour in the packer. That keeps it testable and keeps local-model runs reproducible - the same workspace and prompt pack the same way every time. (The context-window and completion-reserve numbers come from the model profile registry, which gets its own post shortly - configuration as harness behaviour, not cosmetics.)

Links:

Phase docs: 59-ranked-repo-map, 61-token-budget-aware-context
Kodr blog: 59, 61
Context packing: src/context-packer.mjs