Substrate is the persistent knowledge layer between autonomous agents and your codebase.

Agents need a substrate, not a haystack.

Substrate indexes your repo into a structural graph (functions, classes, imports, calls, inheritance) and exposes it over MCP, so Claude Code can answer questions like "what breaks if I change X?" with complete, deterministic, millisecond answers, instead of grepping files and guessing.

View on GitHub See the graph →

v0.1 · Python, TypeScript/JavaScript, Terraform · MCP over stdio

The problem

Long context isn't the same as long memory.

Modern agents read files like a junior engineer with infinite patience and a shrinking attention budget. Three named failure modes compound every time the context grows.

1. Context rot

Every additional token spreads attention thinner. On two-hop reasoning, questions that require chaining two facts, accuracy collapses well before the documented context window runs out.

Chart showing accuracy on two-hop reasoning questions dropping as context length grows, across GPT-4o, Claude 3.5 Sonnet, Gemini 2.5 Flash, and Llama 4 Scout. — The longer the context got, the worse models tended to do on questions like this that required two reasoning hops. — Timothy B. Lee, *Context rot*

On a 32K-token retrieval task, GPT-4o slips from 99% to 70%; Claude 3.5 Sonnet from 88% to 30%. Two-hop accuracy drops faster still. See also Liu et al., Lost in the Middle.

2. Flashlight attention

Transformer attention is a finite resource. When an agent dumps a directory listing, ten files, and a stack trace into context, the relevant tokens are diluted by everything else. The model is staring into a haystack with a flashlight, not reading a map.

A graph query returns five edges. A grep returns five hundred lines, of which forty-eight are relevant. The same answer; an order of magnitude less attention burned.

3. Token cost & latency

A single impact-analysis question on a large polyglot monorepo costs $1.62 and takes 1.7 minutes when the agent does it by reading files. Substrate answers the same question deterministically in ~10 ms at $0.

Across a 9-question session, that's $14.59 and 15.7 minutes saved, every session.

Measured on a 1,455-file polyglot monorepo (745 Python, 491 Terraform, 189 TS, 30 JS).

Autonomous agents need a persistent substrate: soil they can root into 🌱, a living semantic knowledge graph that grows with the code, instead of grepping and reading files like a human would.

A graph is small, deterministic, and complete. The model can spend its attention on judgement, not on retrieval.

Live demo

The graph for a small travel-booking app.

Real index of examples/travel_platform/: a synthetic Python + TypeScript service. Every node is a function, class, file, or type. Every edge is a call, import, inheritance, or reference, extracted from the AST, no LLM. Drag, zoom, hover for signatures.

Benchmarks

Measured on a 1,455-file polyglot monorepo.

Same questions, same model (Claude Opus), same target codebase. One side has Substrate; the other reads files.

Structural queries — graph tools vs file reading

10,107× aggregate speedup 9 structural questions

$14.59 cost saved per session vs $0 with Substrate

93 msvs 15.7 min total wall-time 9 questions, end-to-end

147,914× peak speedup single architecture question

Coding tasks — Claude + Substrate vs Claude alone

Dry-run planning tasks (impact analysis, blast radius, dependency tracing). Same prompts, same model, scored against a frozen ground truth file set.

+6% accuracy (mean F1) 4 of 5 tasks improved

40% cheaper per session $4.51 vs $7.48 · 5 tasks

22% faster wall-time 372 s vs 476 s total

0.899 blast-radius F1 with graph vs 0.882 without · 100% recall

What it does

Eleven MCP tools your agent already knows how to call.

Tool	Use it when…
`impact_analysis`	"What breaks if I change / delete / rename X?"
`find_files_with`	"Which files declare or use X?" — blast radius by name, no node ID needed.
`trace_dependencies`	"Show me the call chain from A to B."
`trace_to_category`	"Trace every path from this function to an S3 / KMS / Lambda sink."
`search_text`	Token + string-literal search across every file — identifiers, env vars, config values.
`find_hotspots`	"What are the most critical functions?"
`find_dead_code`	"What code is never called?"
`find_circular_dependencies`	"Are there any import cycles?"
`find_docs`	Surface README, .env, workflow YAML, and config files by keyword.
`get_context`	"Tell me everything connected to function X."
`search_entities`	Fuzzy name search across files / functions / classes / types.
`get_architecture`	Component-level map: who depends on whom, top hotspots per component.

Quick start

Three commands and an `.mcp.json`.

uv sync                              # install deps
uv run substrate index .             # index the current repo
uv run substrate info .              # show graph stats
uv run substrate serve .             # start the MCP server (stdio)

Then point Claude Code at it. Drop this in .mcp.json at the repo root:

{
  "mcpServers": {
    "substrate": {
      "command": "uv",
      "args": ["run", "substrate", "serve", "."]
    }
  }
}

Claude Code auto-discovers .mcp.json on startup. Once connected, the eight tools above are available.