code-review-graph

When Claude Code reviews a pull request, it reads your entire codebase to understand context. Every function, every import, every file gets serialised into tokens. For anything beyond a small project, this is wasteful. Most of the code has nothing to do with the change being reviewed. I built code-review-graph to fix that.

The tool creates a persistent, local knowledge graph of your codebase. It uses tree-sitter for static analysis, parsing source files into syntax trees without needing compilation. From those trees it extracts every function definition, class, import statement, and call site, then stores the dependency relationships between them as a directed graph.
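The idea can be sketched in a few lines. This is not the tool's actual implementation — it uses tree-sitter and works across languages — but Python's standard `ast` module is enough to show the shape of the extraction: walk the syntax tree, record each function definition, and record which names it calls, producing a caller-to-callee adjacency map.

```python
import ast
from collections import defaultdict

def build_call_graph(source: str) -> dict[str, set[str]]:
    """Extract function definitions and the direct calls each one makes,
    returning a caller -> callees adjacency map. A simplified stand-in
    for the tree-sitter pass described above."""
    tree = ast.parse(source)
    graph: dict[str, set[str]] = defaultdict(set)
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            graph.setdefault(node.name, set())
            # Collect every plain-name call site inside this function body.
            for inner in ast.walk(node):
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    graph[node.name].add(inner.func.id)
    return dict(graph)

src = """
def check(x):
    return bool(x)

def validate(x):
    return check(x)

def handler(req):
    return validate(req)
"""
print(build_call_graph(src))
```

A real pass also needs to resolve imports and method calls, which is where tree-sitter's language-aware grammars earn their keep; the graph structure that comes out the other end is the same.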

When a diff arrives, the system identifies which symbols changed and walks the graph outward, typically two or three hops. That walk produces a focused set of code that is genuinely relevant to the change. Only that set gets fed to Claude as context. Everything else is left out.
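The walk itself is a breadth-first traversal with a hop limit. A minimal sketch, assuming the adjacency map from the previous step (a production version would also follow reverse edges, so callers of a changed symbol are pulled in too):

```python
from collections import deque

def relevant_symbols(graph: dict[str, set[str]],
                     changed: set[str],
                     max_hops: int = 2) -> set[str]:
    """Collect every symbol reachable from the changed set
    within max_hops edges of the dependency graph."""
    seen = set(changed)
    frontier = deque((sym, 0) for sym in changed)
    while frontier:
        symbol, depth = frontier.popleft()
        if depth == max_hops:
            continue  # hop budget spent on this branch
        for neighbour in graph.get(symbol, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append((neighbour, depth + 1))
    return seen

deps = {
    "handler": {"validate"},
    "validate": {"check"},
    "check": {"bool"},
}
# handler -> validate (hop 1) -> check (hop 2); bool sits 3 hops out and is excluded.
print(relevant_symbols(deps, {"handler"}, max_hops=2))
```

Only the code backing the symbols in that returned set is serialised into the model's context.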

The graph sits in memory using adjacency lists and rarely needs a full rebuild. File saves trigger incremental updates. On real pull requests, this reduced token usage by 6.8x for code reviews and up to 49x for routine daily tasks. The savings are largest on big repositories, which is precisely where costs hurt most.
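Keying the adjacency lists by file is what makes incremental updates cheap: a save replaces only that file's edges, and the merged graph is rebuilt lazily. A hypothetical sketch of that bookkeeping (class and method names are illustrative, not the tool's actual API):

```python
class CodeGraph:
    """In-memory dependency graph keyed by file path, so a single
    file save replaces just that file's edges instead of rebuilding
    the whole graph."""

    def __init__(self) -> None:
        # path -> {symbol: set of symbols it depends on}
        self.edges_by_file: dict[str, dict[str, set[str]]] = {}

    def update_file(self, path: str, symbol_deps: dict[str, set[str]]) -> None:
        # Incremental update: the new parse wholesale replaces the old edges.
        self.edges_by_file[path] = symbol_deps

    def adjacency(self) -> dict[str, set[str]]:
        # Merge per-file edge maps into one adjacency-list view.
        merged: dict[str, set[str]] = {}
        for symbols in self.edges_by_file.values():
            for sym, deps in symbols.items():
                merged.setdefault(sym, set()).update(deps)
        return merged

g = CodeGraph()
g.update_file("auth.py", {"login": {"hash_password"}})
# A later save of the same file replaces its edges entirely.
g.update_file("auth.py", {"login": {"hash_password", "audit_log"}})
print(g.adjacency())
```

The replace-on-save design sidesteps the hard problem of diffing edge sets: stale edges from the old version of a file simply vanish when its entry is overwritten.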

Integration works through the Model Context Protocol. code-review-graph registers as an MCP server, and Claude Code queries it whenever it needs codebase context. The graph decides what is relevant. The model does the reasoning. Each handles what it is good at.
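Registration in Claude Code follows the standard MCP server configuration shape. A sketch of what an entry might look like (the command name and arguments here are assumptions, not the project's documented invocation):

```json
{
  "mcpServers": {
    "code-review-graph": {
      "command": "code-review-graph",
      "args": ["serve"]
    }
  }
}
```

Once registered, the model can call the server's tools to ask "what is relevant to this diff?" rather than ingesting files wholesale.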

A question I kept returning to was how many hops to traverse. Too few and you miss a function three calls deep where the bug actually lives. Too many and you are back to feeding in half the codebase. Two hops with a configurable override covers the vast majority of real changes.

The project is open source under an MIT licence and has picked up over five thousand stars on GitHub. That response suggests the problem resonates. If you use any LLM-based coding tool, the core idea applies: less context, more relevant context, better results.