
Graft v3.8: Estimator-Runtime Parity

Tags: graft, compiler, adversarial-debate, tech-debt

Graft is a domain-specific language for defining LLM agent pipelines — contexts, nodes, edges, and graphs compiled from .gft files into executable orchestration structures. It is built entirely through an adversarial debate harness where multiple AI agents independently analyze, cross-critique, and converge before any code is written.

What v3.8 Adds

v3.8 is a cleanup and parity version with three themes: flow-runner extraction, estimator alignment, and error message enrichment.

Flow-Runner Extraction

The case 'node' block in flow-runner.ts had grown to 66 lines, making it the densest block in the codebase: it handled failure strategies, fallback aliasing, conditional chain routing, and depth limits. Round 1 (R1) extracted two functions:

  • applyFallbackAlias(name, result, ctx) — shared helper for fallback output aliasing
  • executeConditionalChain(flowNodeName, result, ctx, nodeResults, errors) — multi-hop routing logic

The block is now 9 lines, the largest single-round size reduction in the v3.x series.
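To give a feel for the kind of logic that multi-hop routing with a depth limit involves, here is a minimal sketch. It is not the code in flow-runner.ts: the function name followConditionalChain, the Router type, the numeric results, and the runNode callback are all hypothetical, and the only detail taken from this post is the hop limit of 10.

```typescript
// Hypothetical types for illustration; not taken from flow-runner.ts.
// A router inspects a node's result and names the next node, or returns
// undefined when the chain ends.
type Router = (result: number) => string | undefined;

interface ChainOutcome {
  path: string[];   // nodes executed after the starting node, in order
  result: number;   // result of the last node that ran
  errors: string[]; // populated when the hop limit cuts the chain short
}

function followConditionalChain(
  start: string,
  startResult: number,
  routers: Record<string, Router>,
  runNode: (name: string, input: number) => number,
  maxHops = 10, // assumed to mirror the runtime's 10-hop limit
): ChainOutcome {
  const path: string[] = [];
  const errors: string[] = [];
  let current = start;
  let result = startResult;

  for (let hop = 0; hop < maxHops; hop++) {
    const router = routers[current];
    const next = router ? router(result) : undefined;
    if (next === undefined) {
      return { path, result, errors }; // chain ended normally
    }
    result = runNode(next, result);
    path.push(next);
    current = next;
  }

  // The hop budget is spent. Report an error only if the chain wanted to continue.
  const router = routers[current];
  if (router && router(result) !== undefined) {
    errors.push(`conditional chain exceeded ${maxHops} hops at ${current}`);
  }
  return { path, result, errors };
}
```

Pulling the loop into one function with explicit inputs and outputs is what lets a dispatch block shrink to a few lines: the caller only needs to pass the node name and result and then merge the outcome.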

Multi-Hop Conditional Chain Estimation

Since v3.7, the runtime follows conditional edge chains up to 10 hops. But the estimator only looked one hop ahead — best = cheapest branch, worst = most expensive branch. v3.8 aligns the estimator with a recursive getConditionalBranchCosts that walks chains, handles cycles (warning + finite cost), respects depth limits, and propagates retry multipliers through hops.

The key design insight came from A3-Skeptic: diamond-shaped graphs (A branches to B and C, both converge to D) require per-branch visited set copies. Without new Set(visited) per branch, the second path sees D as "already visited" and skips its cost. A3 scored this analysis at 6/10 confidence — lower than A2's 8/10 — but it contained the critical edge case.
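The diamond case is easier to see in code. The following is a minimal sketch of a recursive best/worst estimator, not Graft's actual getConditionalBranchCosts: the EstNode and CostRange shapes and the function name estimateChainCost are assumptions made for illustration, and retry multipliers are left out to keep the example short.

```typescript
// Hypothetical shapes; none of these are taken from the Graft source.
interface EstNode {
  cost: number;        // estimated tokens for one execution of the node
  branches?: string[]; // conditional edge targets; absent or empty ends the chain
}

interface CostRange {
  best: number;
  worst: number;
  warnings: string[];
}

const MAX_ESTIMATE_HOPS = 10; // assumed to mirror the runtime hop limit

function estimateChainCost(
  graph: Record<string, EstNode>,
  name: string,
  visited: Set<string> = new Set<string>(),
  depth = 0,
): CostRange {
  const node = graph[name];
  if (!node) {
    return { best: 0, worst: 0, warnings: [`unknown node: ${name}`] };
  }
  if (visited.has(name)) {
    // Cycle: return a finite cost plus a warning instead of recursing forever.
    return { best: 0, worst: 0, warnings: [`cycle detected at ${name}`] };
  }
  if (depth >= MAX_ESTIMATE_HOPS) {
    return { best: 0, worst: 0, warnings: [`depth limit reached at ${name}`] };
  }
  visited.add(name);

  const targets = node.branches ?? [];
  if (targets.length === 0) {
    return { best: node.cost, worst: node.cost, warnings: [] };
  }

  // Each branch gets its own copy of the visited set. With a shared set,
  // the second branch of a diamond would see the join node as already
  // visited and drop its cost.
  const results = targets.map((target) =>
    estimateChainCost(graph, target, new Set(visited), depth + 1),
  );

  return {
    best: node.cost + Math.min(...results.map((r) => r.best)),
    worst: node.cost + Math.max(...results.map((r) => r.worst)),
    warnings: ([] as string[]).concat(...results.map((r) => r.warnings)),
  };
}

// Diamond: A branches to B and C, and both converge on D.
const diamond: Record<string, EstNode> = {
  A: { cost: 10, branches: ["B", "C"] },
  B: { cost: 5, branches: ["D"] },
  C: { cost: 20, branches: ["D"] },
  D: { cost: 100 },
};

console.log(estimateChainCost(diamond, "A"));
// best: 115 (A + B + D), worst: 130 (A + C + D), warnings: []
```

If the recursive call received visited directly instead of new Set(visited), the B path would mark D as visited, the C path would then treat D as a cycle, and the worst case would come out as 115 with a spurious cycle warning.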

Foreach Iteration Context

Error messages within foreach bodies now include (foreach iteration N of M) — three lines of code that identify exactly which iteration failed.
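A change like this can be very small. The sketch below shows one way to attach the iteration context. It is an illustration rather than the actual Graft code: the helper name runForeachWithContext is hypothetical, the body is synchronous for brevity, and 1-based numbering for N is an assumption.

```typescript
// Hypothetical helper: runs a foreach body over items and annotates any
// failure with the iteration that raised it.
function runForeachWithContext<T>(
  items: T[],
  body: (item: T, index: number) => void,
): void {
  items.forEach((item, i) => {
    try {
      body(item, i);
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      // Assumption: N is 1-based, so the first item reports "iteration 1 of M".
      throw new Error(`${message} (foreach iteration ${i + 1} of ${items.length})`);
    }
  });
}
```

With this wrapper, a body that throws "boom" on the second of three items surfaces as "boom (foreach iteration 2 of 3)".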

The 5-Retro Tech Debt

TD-03 (estimator-runtime parity for conditional chains) was first identified in the v3.3 retro. It was carried through v3.4, v3.5, v3.6, and v3.7 — five consecutive retros. The carry was justified: multi-hop routing itself (v3.7-R3) was a prerequisite for multi-hop estimation. You cannot align the estimator with runtime behavior that does not yet exist.

v3.8-R2 resolved it. The sequencing was correct but the justification was implicit — each retro simply noted "carried." This motivated R-PROC-19 (tech debt carry limit): items carried 3+ retros must state their prerequisite and expected resolution version.

In total, v3.8 closed three tech debt items: TD-02 (flow-runner density), TD-03 (estimator parity), and TD-04 (fallback alias duplication). That is the most tech debt closed in any v3.x version.

The Convergence Checklist

v3.7-R3 had a NEEDS_CHANGES verdict because three items from the convergence spec were absent from the implementation. Two process rules were proposed in response:

  • R-PROC-17: The implementer verifies each convergence requirement before submitting for review.
  • R-PROC-18: MEDIUM rounds introducing new error conditions must explicitly list error path tests in the convergence spec.

v3.8 was their first application. R2's convergence spec listed 4 error path tests (cycle estimation, depth limit, empty chain, warnings). All 4 were implemented. All 4 passed. Zero NEEDS_CHANGES across all 4 rounds — the first clean sweep since v3.5.

Forced Dissenter Highlights

A3-Skeptic's diamond path finding (R2) continues the pattern where the lower-confidence agent contributes the key insight. A2 proposed the recursive structure at score 8. A3 proposed per-branch visited set copies at score 6. The diamond edge case — invisible to the higher-confidence approach — was the difference between correct and incorrect estimation for branching conditional chains.

R-PROC-14 (wait for all analyses before convergence) ensured the finding was captured. Fifth application, fifth finding, zero latency cost.

Stats

| Metric | Value |
|--------|-------|
| Tests | 864 (32 new) |
| Ratchet decisions | ~230 |
| Rounds | 4 (2 DIRECT, 1 MEDIUM, 1 TEST-ONLY) |
| Agent calls | ~10 |
| NEEDS_CHANGES | 0 (first clean sweep since v3.5) |
| Bugs caught (debate) | 1 (A3: diamond path edge case) |
| Tech debt items closed | 3 (TD-02, TD-03, TD-04) |
| Budget accuracy | Exact match (2nd consecutive version) |

Try It

npm install -g @graft-lang/graft
graft compile your-pipeline.gft
graft check your-pipeline.gft   # token estimation with multi-hop chain support
graft run your-pipeline.gft --input '{"query": "hello"}'

The estimator now walks conditional chains recursively — graft check produces accurate token estimates for pipelines with multi-hop conditional routing.

Source: github.com/JSLEEKR/graft

Built with Claude Opus 4.6 via Claude Code's adversarial debate harness. ~10 agent calls across 4 rounds.