Inspection: kodr Learns to Read Code


The next useful jump for kodr wasn’t another way to call the model. It was better reading before the model is asked to change anything. If you’ve used inspect-json you know I have a soft spot for “look at the thing properly first” - this is the same instinct, pointed at source code.

Phase 51: a structural index

kodr inspect walks the repo (using the same deterministic inventory that context packing uses), classifies source files, and emits a normalized index: per-file entries with language, imports, line count and symbols; a flattened symbol list with line spans; and optional lexical references when you pass --symbol <name>. It’s read-only and never calls a model.
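To make that shape concrete, here is a minimal sketch of what such an index could look like, plus the two derived views (flattened symbols and lexical references). The sample file, the kind/startLine/endLine field names, and both helper functions are my assumptions for illustration, not kodr's actual output or code.

```javascript
// Hypothetical index for a one-file repo. Only path, language, lineCount,
// imports and symbols come from the post; the symbol fields are assumed.
const index = {
  files: [
    {
      path: 'src/wordfreq.go',
      language: 'go',
      lineCount: 42,
      imports: ['fmt', 'strings'],
      symbols: [
        { name: 'Analyze', kind: 'func', startLine: 10, endLine: 28 },
        { name: 'merge', kind: 'func', startLine: 30, endLine: 41 },
      ],
    },
  ],
};

// Flatten per-file symbols into one list, tagging each with its file path.
function flattenSymbols(idx) {
  return idx.files.flatMap((file) =>
    file.symbols.map((symbol) => ({ ...symbol, path: file.path }))
  );
}

// Lexical references: 1-based numbers of lines that mention the name as a
// whole word. No scope or type awareness, which is the point of "lexical".
function findReferences(source, name) {
  const escaped = name.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  const pattern = new RegExp(`\\b${escaped}\\b`);
  return source
    .split('\n')
    .flatMap((line, i) => (pattern.test(line) ? [i + 1] : []));
}
```

The reference lookup is deliberately dumb: it will happily match a name inside a comment or string, which is an acceptable trade for something that needs no parser.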

Zero dependencies and four languages force an honest design: don’t pretend one parser fits everything. Each language gets a small adapter.

  • JS/TS: skip strings and comments, track braces around function, class, export, arrow functions, and test/it/describe.
  • Python: indentation-aware - imports, top-level classes and functions, test_ functions.
  • Rust: use, mod, fn, struct, enum, trait, impl, #[test].
  • Go: package, import, func, type, and Test* in *_test.go.
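As a rough illustration of how small one of these adapters can be, here is a sketch of a line-based Go scanner. It is an assumption-laden toy, not kodr's adapter: scanGo, the kind values and the single line field are hypothetical, and it does not skip strings or comments.

```javascript
// Sketch of a regex-based Go adapter (hypothetical; not kodr's actual code).
// Only declarations starting in column 0 are considered, so nothing nested
// inside a function body is indexed.
function scanGo(path, source) {
  const lines = source.split('\n');
  const imports = [];
  const symbols = [];
  let inImportBlock = false;

  lines.forEach((line, i) => {
    const lineNo = i + 1;

    if (inImportBlock) {
      if (line.trim() === ')') {
        inImportBlock = false;
        return;
      }
      const quoted = line.match(/"([^"]+)"/);
      if (quoted) imports.push(quoted[1]);
      return;
    }
    if (/^import\s*\(/.test(line)) {
      inImportBlock = true;
      return;
    }
    // Single-line import, with an optional alias before the path.
    const single = line.match(/^import\s+(?:\w+\s+)?"([^"]+)"/);
    if (single) {
      imports.push(single[1]);
      return;
    }
    // Functions and methods: skip an optional receiver like "(c Counter)".
    const fn = line.match(/^func\s+(?:\([^)]*\)\s*)?(\w+)/);
    if (fn) {
      const isTest = path.endsWith('_test.go') && fn[1].startsWith('Test');
      symbols.push({ name: fn[1], kind: isTest ? 'test' : 'func', line: lineNo });
      return;
    }
    const type = line.match(/^type\s+(\w+)/);
    if (type) symbols.push({ name: type[1], kind: 'type', line: lineNo });
  });

  return { path, language: 'go', lineCount: lines.length, imports, symbols };
}
```

A raw string literal with "func " at the start of a line would fool this, which is exactly the kind of gap a structural index accepts and a semantic engine would not.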

The crucial design choice is the index shape, not parser completeness. It is explicitly a structural index, not a semantic language engine - and being honest about that is what makes it shippable in one sitting.

A follow-up wired it into packing via --inspect-context: build the index, use the prompt as a selection query, include matching symbol definitions plus nearby imports, reference windows, and related tests - with a fallback to compact file summaries when nothing matches, so the model never gets nothing.

The repo smoke test immediately caught a bug worth keeping: indented variables inside a function were being indexed as symbols, so a whole function got selected as a one-line chunk. Tightening symbol boundaries to top-level lines fixed it.
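The select-or-fall-back idea can be sketched in a few lines. This is a simplified guess at the mechanism, assuming case-insensitive matching of prompt words against symbol names; selectContext and the chunk fields are hypothetical, and reference windows and related tests are left out.

```javascript
// Sketch of prompt-driven selection with a summary fallback (hypothetical).
function selectContext(index, prompt) {
  // Treat identifier-like words in the prompt as the selection query.
  const terms = new Set(prompt.toLowerCase().match(/[a-z_][a-z0-9_]*/g) || []);
  const chunks = [];

  for (const file of index.files) {
    for (const symbol of file.symbols) {
      if (terms.has(symbol.name.toLowerCase())) {
        chunks.push({
          type: 'symbol',
          path: file.path,
          name: symbol.name,
          startLine: symbol.startLine,
          endLine: symbol.endLine,
          imports: file.imports, // nearby imports travel with the definition
        });
      }
    }
  }
  if (chunks.length > 0) return chunks;

  // Nothing matched: fall back to compact per-file summaries so the
  // packed context is never empty.
  return index.files.map((file) => ({
    type: 'summary',
    path: file.path,
    language: file.language,
    lineCount: file.lineCount,
    symbols: file.symbols.map((symbol) => symbol.name),
  }));
}
```

With an index containing a symbol named Analyze, a prompt like "fix the analyze function" selects that definition, while "add logging" matches nothing and yields one summary per file.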

Phase 53: let the real tools in, optionally

Regex scanning is portable and fast but shallow. A real language server knows types, cross-file references, and correct scope. The question is how to use one without coupling kodr to it. The answer is a registry of plain descriptors:

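// One registry entry; everything kodr needs to know about the tool lives here.
const goplsDescriptor = {
  name: 'gopls',
  languages: ['go'],
  command: 'gopls',
  // Arguments passed to the command for a given set of files.
  buildArgs: (files, cwd) => ['--json', ...files],
  // Map the tool's stdout into the normalized shape (body elided here).
  adapt: (stdout) => [/* normalized InspectedFile[] */],
  timeout: 10000, // milliseconds
  onFailure: 'skip',
};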

The adapt function maps whatever the tool emits into the same { path, language, lineCount, imports, symbols } shape inspect already returns. That normalized shape is the only contract - tool-specific formats never leak into the rest of the harness.

Availability is checked by spawning the command with --version and treating only ENOENT as “missing” (any other complaint means it’s present, just grumpy about the flag) - no which dependency.

External results take precedence per file they cover; uncovered files keep the built-in scan; and onFailure: 'skip' is the default, so a timed-out or absent tool is silently ignored and the built-in index stands. That’s what makes every external inspector genuinely optional: zero-dep should mean kodr runs without a parsing stack, not that it ignores one when present.
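Both rules are short enough to sketch. The following is a minimal version under my own assumptions: isAvailable and mergeIndexes are hypothetical names, the check is synchronous for brevity, and the two-second probe timeout is arbitrary.

```javascript
const { spawnSync } = require('node:child_process');

// Probe a command by running it with --version. Only ENOENT (the binary
// could not be found) counts as missing; a non-zero exit, a timeout, or a
// complaint about the flag still means the tool is installed.
function isAvailable(command, timeoutMs = 2000) {
  const result = spawnSync(command, ['--version'], {
    stdio: 'ignore',
    timeout: timeoutMs,
  });
  return !(result.error && result.error.code === 'ENOENT');
}

// Per-file precedence: an external entry replaces the built-in entry for
// the same path; files the external tool did not cover are kept as-is.
function mergeIndexes(builtinFiles, externalFiles) {
  const byPath = new Map(builtinFiles.map((file) => [file.path, file]));
  for (const file of externalFiles) byPath.set(file.path, file);
  return [...byPath.values()];
}
```

Keying the merge on path is what keeps the external tool optional at file granularity: an empty external result simply leaves the built-in index untouched.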

Tests use fake descriptors backed by node -e one-liners emitting controlled JSON, so the suite needs no external tools. And the proof it actually helps: a wordfreq Go program generated by kodr against a local qwen - one shot, 14k tokens, three files, all 9 Go tests passing with no edits. The model named a testable analyze function, added a merge helper, and wrote three test suites unprompted. The built-in index gave it enough workspace shape to nail a Go project on the first try.
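For a sense of what a fake descriptor might look like, here is a sketch that uses the current Node binary as the "external tool". The runDescriptor helper, the payload, and the behavior of throwing when onFailure is anything other than 'skip' are assumptions for illustration; the post only describes 'skip'.

```javascript
const { spawnSync } = require('node:child_process');

// Controlled output the fake tool will print (hypothetical sample data).
const payload = [
  { path: 'main.go', language: 'go', lineCount: 3, imports: ['fmt'], symbols: [] },
];

// A descriptor whose "tool" is a node -e one-liner emitting that JSON.
const fakeDescriptor = {
  name: 'fake-go',
  languages: ['go'],
  command: process.execPath,
  buildArgs: () => [
    '-e',
    // Double stringify turns the JSON text into a valid JS string literal.
    `process.stdout.write(${JSON.stringify(JSON.stringify(payload))})`,
  ],
  adapt: (stdout) => JSON.parse(stdout),
  timeout: 10000,
  onFailure: 'skip',
};

// Run a descriptor and return normalized files, or [] when it is skipped.
function runDescriptor(descriptor, files, cwd) {
  const fail = (error) => {
    if (descriptor.onFailure === 'skip') return [];
    throw error; // assumed behavior for non-skip policies
  };
  const result = spawnSync(descriptor.command, descriptor.buildArgs(files, cwd), {
    cwd,
    encoding: 'utf8',
    timeout: descriptor.timeout,
  });
  if (result.error) return fail(result.error);
  if (result.status !== 0) {
    return fail(new Error(`${descriptor.name} exited with ${result.status}`));
  }
  try {
    return descriptor.adapt(result.stdout);
  } catch (error) {
    return fail(error);
  }
}
```

Because spawnSync is called without a shell, the one-liner needs no quoting gymnastics, and swapping command for a name that does not exist exercises the skip path.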
