Graft v2.2: LSP, npm, and the Last Mile

Graft is a graph-native DSL that compiles .gft source files into Claude Code harness structures (.claude/ directories with agent definitions, hook scripts, and orchestration docs). It is built for LLM-to-LLM communication — structured pipelines with typed schemas and compile-time token budget analysis. Every line of Graft is developed through a multi-agent adversarial debate process where 2-4 AI agents independently analyze, cross-critique, and converge on each design decision.

What v2.2 Adds

v2.2 bridges the gap between "compiler that works" and "compiler you can use." Three categories: internal cleanup, developer tooling, and distribution.

LSP Server

A Language Server Protocol server provides real-time feedback in editors. It is two files, ~230 lines total, and covers three features:

Diagnostics: errors and warnings on document open/change, with structured error codes
Hover: context fields, node config (model, budget, reads, writes), memory details, produces schema
Go-to-definition: navigate to declarations, including cross-file imports

node Writer(model: sonnet, budget: 2k/1k) {
  reads: [Input, Log]    // hover on Log -> memory fields
  writes: [Log]          // go-to-definition -> Log declaration
  produces Result { reply: String }
}
// Ctrl+click on 'Log' jumps to the memory declaration, even in another file.

The LSP is built on pure functions — all feature handlers in features.ts take data in and return results out, with no side effects. The server maintains a per-URI cache so hover and go-to-definition still work when the current edit has syntax errors.
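The caching behavior can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual graft-lsp code: `ParsedDoc`, `ParseFn`, `LastGoodCache`, `findDefinitionLine`, and the toy `decl Name` parser are all hypothetical names invented for the example. The idea it shows is that the cache is overwritten only when a parse succeeds, so a pure handler can keep answering from the last good parse while the user is mid-edit.

```typescript
// Hypothetical shape of a parsed document; Graft's real Program type is richer.
interface ParsedDoc {
  declarations: Map<string, number>; // declaration name -> zero-based line
}

// A parse function returns null when the text has syntax errors.
type ParseFn = (text: string) => ParsedDoc | null;

class LastGoodCache {
  private readonly lastGood = new Map<string, ParsedDoc>();
  private readonly parse: ParseFn;

  constructor(parse: ParseFn) {
    this.parse = parse;
  }

  // Called on open/change: only overwrite the entry when parsing succeeds.
  update(uri: string, text: string): boolean {
    const parsed = this.parse(text);
    if (parsed === null) return false; // keep the previous good parse
    this.lastGood.set(uri, parsed);
    return true;
  }

  get(uri: string): ParsedDoc | undefined {
    return this.lastGood.get(uri);
  }
}

// Pure feature handler: data in, result out, no I/O and no shared state.
function findDefinitionLine(doc: ParsedDoc | undefined, symbol: string): number | null {
  return doc?.declarations.get(symbol) ?? null;
}

// Toy stand-in for a parser (NOT Graft syntax): every non-empty line must be
// "decl <Name>"; anything else counts as a syntax error.
const toyParse: ParseFn = (text) => {
  const declarations = new Map<string, number>();
  const lines = text.split("\n");
  for (let i = 0; i < lines.length; i++) {
    const line = lines[i].trim();
    if (line === "") continue;
    const match = /^decl (\w+)$/.exec(line);
    if (match === null) return null;
    declarations.set(match[1], i);
  }
  return { declarations };
};
```

With this shape, a broken edit leaves the previous entry in place, so `findDefinitionLine(cache.get(uri), "Log")` keeps resolving until the next successful parse replaces it.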

VS Code Extension

A TextMate grammar provides syntax highlighting for all 35 keywords, type keywords, domain types, and operators. The extension auto-launches graft-lsp when a .gft file is opened. Comment support covers both // line and /* */ block styles.

npm Distribution

The package is published as @graft-lang/graft with two entry points:

import { compile } from '@graft-lang/graft';       // compiler API
import type { Program } from '@graft-lang/graft/ast';    // AST types only

Internal Cleanup (R1-R3)

Before the tooling work, three rounds addressed tech debt:

Double-parse eliminated: resolve() now accepts a Program instead of re-parsing source
ProgramIndex: O(1) Map-based lookups replacing Array.find() across the pipeline
Executor decomposed: split from ~475 lines into prompt-builder.ts (pure) + flow-runner.ts (flow) + slimmed executor.ts
Structured error codes: GraftErrorCode (21-member union) on all 34 diagnostic sites
5 new correctness warnings: foreach binding collision, conditional edge transform, multiple graph, and more
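To make the ProgramIndex item concrete, here is a minimal sketch of swapping repeated array scans for a Map built once. The names `NamedDecl`, `findByScan`, and `DeclIndex` are hypothetical, as is the set of declaration kinds; the real ProgramIndex presumably indexes more than names.

```typescript
// Hypothetical declaration shape; the real AST nodes carry more fields.
interface NamedDecl {
  kind: "node" | "memory" | "context";
  name: string;
}

// Before: every lookup scans the declaration array, O(n) per call.
function findByScan(decls: NamedDecl[], name: string): NamedDecl | undefined {
  return decls.find((d) => d.name === name);
}

// After: build the index once, then each lookup is an O(1) Map access.
class DeclIndex {
  private readonly byName = new Map<string, NamedDecl>();

  constructor(decls: NamedDecl[]) {
    for (const d of decls) {
      // Keep the first declaration so results match Array.find().
      if (!this.byName.has(d.name)) this.byName.set(d.name, d);
    }
  }

  get(name: string): NamedDecl | undefined {
    return this.byName.get(name);
  }
}
```

The index costs one pass to build, which pays off as soon as several pipeline stages look up the same declarations.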

The Bug That Mattered

A3-Skeptic found the critical bug in Round 4. When compile() encountered a file with no graph declaration, it returned { success: false } without the parsed program. Library files, which are designed to be imported rather than executed, have no graph. This meant the LSP could provide zero intelligence for library files: no hover, no go-to-definition, nothing.

The fix was a single line: include program in the error return object. But diagnosing it required understanding both the compiler pipeline (which treats no-graph as a hard error) and the LSP use case (which needs the program regardless). Three of four agents missed it entirely because they assumed compile() would always produce a usable program on parseable input. A3's analysis started from the question "what happens when I open shared.gft?", a library file that has been in the example set since v2.0.
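The shape of the fix can be illustrated with a small sketch. It is not the real compile() signature: `ProgramSketch`, `CompileResultSketch`, `compileSketch`, and the error message are hypothetical, and the function takes an already-parsed program to keep the example short. The point is only that the failure branch still carries the parsed program.

```typescript
// Hypothetical, simplified stand-ins for the parsed program and the result.
interface ProgramSketch {
  declarations: string[];
  hasGraph: boolean;
}

type CompileResultSketch =
  | { success: true; program: ProgramSketch }
  | { success: false; errors: string[]; program?: ProgramSketch };

function compileSketch(program: ProgramSketch): CompileResultSketch {
  if (!program.hasGraph) {
    // Still a hard error for compilation, but the parsed program is returned
    // so editor tooling can serve hover and go-to-definition for library files.
    return { success: false, errors: ["no graph declaration"], program };
  }
  return { success: true, program };
}
```

A caller such as an LSP handler can then read `result.program` whether or not `result.success` is true, and treat the errors purely as diagnostics.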

Debate Highlights

Error code taxonomy: A4-Specialist proposed 24 codes including speculative future ones. Convergence applied YAGNI, trimming to 18 codes for existing errors only. Three more were added in R3 when their corresponding warnings were implemented.

LSP file count: A4 proposed 5 files for the LSP. The other three agents agreed on 2 files (~230 lines total). When the entire feature fits in two screens, splitting it across five files adds navigation overhead without improving comprehension.

TextMate grammar corrections: A3-Skeptic caught three errors in the VS Code grammar proposal — # comments (Graft uses // and /* */), escape sequences (the lexer has none), and integer matching before k-integer (2k would be tokenized as 2 + k). Small catches, but each would have produced visible highlighting bugs for every user.
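The k-integer ordering issue is easy to reproduce outside an editor. The sketch below is a simplified first-match-wins scanner written for illustration, not TextMate's actual matching engine, and the scope names (`int`, `kint`, `ident`) and rule patterns are assumptions rather than the grammar that ships with the extension.

```typescript
interface TokenRule {
  scope: string;
  pattern: RegExp;
}

// Scan left to right; at each position the first rule that matches wins,
// which is why rule order matters when two patterns share a prefix.
function tokenizeSketch(text: string, rules: TokenRule[]): string[] {
  const out: string[] = [];
  let pos = 0;
  while (pos < text.length) {
    let matched = false;
    for (const rule of rules) {
      // The sticky flag anchors the match at exactly `pos`.
      const sticky = new RegExp(rule.pattern.source, "y");
      sticky.lastIndex = pos;
      const m = sticky.exec(text);
      if (m !== null && m[0].length > 0) {
        out.push(`${rule.scope}:${m[0]}`);
        pos += m[0].length;
        matched = true;
        break;
      }
    }
    if (!matched) pos++; // skip characters no rule covers, such as "/"
  }
  return out;
}

const integerRule: TokenRule = { scope: "int", pattern: /\d+/ };
const kIntegerRule: TokenRule = { scope: "kint", pattern: /\d+k\b/ };
const identRule: TokenRule = { scope: "ident", pattern: /[A-Za-z_]\w*/ };

// Plain integer first: "2k" is split into a number and a stray identifier.
const wrongOrder = tokenizeSketch("2k/1k", [integerRule, kIntegerRule, identRule]);
// k-integer first: the budget literal stays in one piece.
const rightOrder = tokenizeSketch("2k/1k", [kIntegerRule, integerRule, identRule]);
```

Here `wrongOrder` comes out as `int:2, ident:k, int:1, ident:k`, while `rightOrder` is `kint:2k, kint:1k`. Listing the more specific pattern first is the whole fix.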

Process: Score-Gated Cross-Critique

Cross-critique was skipped in all 6 rounds — the longest streak since the harness was introduced. The rule: when all agents score within 1 point of each other (high consensus), the formal cross-critique phase is skipped. In v2.2, this saved an estimated 24 agent calls with no quality loss.

The key insight: cross-critique is most valuable when agents disagree fundamentally. When consensus is high, the disagreements are already clearly stated in the analysis phase and resolved in convergence without an extra round.
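The gating rule itself is small enough to write down. The following is a sketch of the rule as described, assuming numeric scores and treating "within 1 point of each other" as a maximum spread of 1. The function name, the `maxSpread` parameter, and the handling of an empty score list are assumptions, not the harness's actual code.

```typescript
// Returns true when the formal cross-critique phase can be skipped:
// every agent's score lies within `maxSpread` points of every other.
function canSkipCrossCritique(scores: number[], maxSpread: number = 1): boolean {
  // Assumption: with no scores there is no evidence of consensus, so do not skip.
  if (scores.length === 0) return false;
  const spread = Math.max(...scores) - Math.min(...scores);
  return spread <= maxSpread;
}
```

For example, scores of 8, 8, 9, 8 have a spread of 1 and skip the phase, while 6, 9, 8, 8 have a spread of 3 and do not.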

v2.2 also introduced a TEST-ONLY tier for Round 6 (integration tests, no design decisions), using just 2 agent calls instead of the usual 5-11. The harness now operates at three tiers: HIGH (~11 calls), MEDIUM (~5 calls), and TEST-ONLY (~2 calls).

Stats

| Metric | v2.1 | v2.2 |
|--------|------|------|
| Tests | 288 | 376 (+88) |
| Ratchet decisions | 107 | 132 (+25) |
| Agent calls | 21 | ~33 |
| Rounds | 4 | 6 |
| NEEDS_CHANGES | 0 | 0 |
| Critical bugs found | 1 | 1 |
| Cross-critique used | 0 of 4 | 0 of 6 |
| New production files | 3 | 8 |

Consecutive rounds with first-attempt review pass: 15 (since v2.0-R1).

Try It

git clone https://github.com/JSLEEKR/Graft.git
cd Graft
npm install
npm run build

# Compile a .gft file
npx graft compile examples/chatbot.gft

# Run a pipeline
npx graft run examples/chatbot.gft --input '{"message": "hello"}'

# Check without compiling
npx graft check examples/chatbot.gft

# Start the LSP server (for editor integration)
npx graft-lsp --stdio

# Run tests
npm test  # 376 tests

The VS Code extension is in editors/vscode/. After building, install it locally by passing the packaged .vsix file to code --install-extension.

Built with Claude Opus 4.6 via the adversarial debate harness. ~33 agent calls, 132 ratchet-locked decisions, zero review failures across 15 consecutive rounds.