Building rtk from Scratch in Go — What I Learned

What is rtk?

If you use LLMs as coding assistants, you have probably noticed how much of your token budget gets wasted on irrelevant output. Run git status in a large repo and the model receives hundreds of tokens of noise -- untracked files, branch tracking info, hints about how to unstage -- when all it really needs is "3 files modified." Run git diff and it gets the full patch including unchanged context lines, file permission metadata, and index hashes that mean nothing to the model.

rtk solves this. It is a CLI proxy that sits between your shell and your LLM agent, intercepting command output and compressing it before the model sees it. Instead of git status dumping 84 tokens, rtk filters it down to 19. Instead of a test runner spewing 500 lines of progress dots, rtk extracts just the failures.

The project has 14,600+ stars on GitHub. It is written in Rust, ships as a single binary, and supports over 45 different CLI tools. It is genuinely useful.

So naturally, I decided to rewrite it from scratch.

Why Reimplement It?

Three reasons.

First, understanding. Reading someone else's code teaches you what they built. Rebuilding it teaches you why. I wanted to understand the filtering algorithms deeply -- not just what regex patterns they use, but what design decisions led to those patterns and whether different trade-offs were possible.

Second, simplicity. rtk's Rust codebase has 72 modules and 22 crate dependencies. That is a lot of surface area for a tool whose core job is "read text, remove some lines, output the rest." I wanted to see if the same value could be delivered with radically less code.

Third, security. While studying the source, I found that rtk uses sh -c for command execution in some code paths. This means user input flows through a shell interpreter, which is a textbook shell injection vulnerability. I wanted to build a version where that entire class of bug is impossible by design.

What I Found Studying the Original

I spent a full day reading rtk's Rust source before writing any Go code. Here is what stood out.

72 modules with no shared interface. Each filter -- git status, git diff, cargo test, pytest, grep, find, docker, kubectl, and dozens more -- is its own Rust file with an ad-hoc run() function. There is no Filter trait. No common contract for what a filter receives and returns. Each module independently reimplements similar patterns: regex line matching, output grouping, truncation logic, line counting. This makes the codebase hard to navigate and harder to extend.

Shell injection vulnerability. Issue #640 in the rtk repository documents this. Some execution paths construct a shell command string and pass it to sh -c, which means specially crafted filenames or arguments could execute arbitrary code. For a tool that wraps arbitrary CLI commands, this is a serious oversight.

22 crate dependencies. The dependency tree includes crates for TOML parsing, SQLite, terminal detection, regex, directory walking, HTTP requests (for telemetry), and more. Each dependency is a supply chain risk and a compilation cost. The question is whether all of them are necessary.

Smart filtering strategies. Despite the architectural issues, the actual filtering logic is clever. The git status filter parses porcelain output to extract just the summary. The diff filter caps output at 100 lines per file and adds recovery hints. The test runner filters use state machines to extract only failures. These are well-thought-out algorithms.
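To make the porcelain idea concrete, here is a minimal sketch in Go of summarizing `git status --porcelain` (v1) output. It is not rtk's code, and `summarizePorcelain` and its summary wording are hypothetical names chosen for illustration. It relies only on the documented porcelain layout: two status characters (index, then worktree), a space, then the path.

```go
package main

import (
	"fmt"
	"strings"
)

// summarizePorcelain condenses `git status --porcelain` (v1) output into one
// line. Hypothetical helper for illustration, not taken from rtk or rtk-go.
func summarizePorcelain(out string) string {
	var staged, unstaged, untracked int
	for _, line := range strings.Split(out, "\n") {
		if len(line) < 3 {
			continue // blank or malformed line
		}
		x, y := line[0], line[1] // x = index status, y = worktree status
		switch {
		case x == '?' && y == '?':
			untracked++
		case x == '!' && y == '!':
			// ignored file (only present with --ignored); not counted
		default:
			if x != ' ' {
				staged++
			}
			if y != ' ' {
				unstaged++
			}
		}
	}
	return fmt.Sprintf("%d staged, %d unstaged, %d untracked", staged, unstaged, untracked)
}

func main() {
	sample := " M a.go\nM  b.go\nMM c.go\n?? d.txt\n"
	fmt.Println(summarizePorcelain(sample))
}
```

A file with both staged and unstaged changes (`MM`) is counted in both buckets, which is one of several reasonable choices for a summary line.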

Our Approach

With the analysis complete, I set three constraints for the Go reimplementation:

Unified Filter interface -- every filter implements Name(), Match(), and Apply()
Zero external dependencies -- Go stdlib only
exec.Command only -- no shell invocation, ever

The result is rtk-go, which ships 11 filters covering git (status, diff, log), search tools (grep, ripgrep, find, ls), test runners (go test, pytest, npm/jest/vitest), and build tools (go, cargo, make, npm, tsc).

The unified Filter interface turned out to be the single biggest architectural win. In rtk, adding a new filter means creating a new Rust file, writing a standalone function, and modifying the main command routing. In rtk-go, you implement three methods and register the filter. The interface enforces consistency -- every filter must declare what commands it matches and how it transforms output.
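As a rough sketch of what such a contract can look like: the three method names come from the description above, but the signatures, the `Registry` type, and the example `blankLineFilter` are assumptions made for illustration and may not match rtk-go's actual code.

```go
package main

import (
	"fmt"
	"strings"
)

// Filter is the shared contract. Method names follow the article; the
// parameter and return types are assumed for this sketch.
type Filter interface {
	Name() string
	Match(argv []string) bool   // does this filter handle the command?
	Apply(output string) string // compress the command's raw output
}

// blankLineFilter is a hypothetical filter that drops empty lines from
// `git status` output.
type blankLineFilter struct{}

func (blankLineFilter) Name() string { return "git-status-blank-lines" }

func (blankLineFilter) Match(argv []string) bool {
	return len(argv) >= 2 && argv[0] == "git" && argv[1] == "status"
}

func (blankLineFilter) Apply(output string) string {
	var kept []string
	for _, line := range strings.Split(output, "\n") {
		if strings.TrimSpace(line) != "" {
			kept = append(kept, line)
		}
	}
	return strings.Join(kept, "\n")
}

// Registry holds filters in registration order (hypothetical type).
type Registry struct{ filters []Filter }

func (r *Registry) Register(f Filter) { r.filters = append(r.filters, f) }

// Run applies the first matching filter and returns its name. Commands with
// no matching filter pass through unchanged with an empty name.
func (r *Registry) Run(argv []string, output string) (string, string) {
	for _, f := range r.filters {
		if f.Match(argv) {
			return f.Apply(output), f.Name()
		}
	}
	return output, ""
}

func main() {
	reg := &Registry{}
	reg.Register(blankLineFilter{})
	out, name := reg.Run([]string{"git", "status"}, "On branch main\n\nnothing to commit\n")
	fmt.Printf("%s: %q\n", name, out)
}
```

With this shape, adding a filter touches two places: the new type and one `Register` call.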

For command execution, exec.Command with explicit argument arrays is the only code path. There is no function in the entire codebase that constructs a shell command string. Shell injection is not mitigated or guarded against -- it is structurally impossible.
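A small sketch of why the argument-array form matters. `buildCommand` is a hypothetical helper, not a function from rtk-go; the only library call it relies on is the standard `exec.Command(name, args...)`. The command is constructed but never run, so the example is safe to execute.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// buildCommand turns an argument vector into an *exec.Cmd. argv[0] is the
// program; each remaining element reaches it as exactly one argument. No
// shell ever parses these values, so metacharacters such as ; | $ stay inert.
func buildCommand(argv []string) (*exec.Cmd, error) {
	if len(argv) == 0 {
		return nil, errors.New("empty command")
	}
	return exec.Command(argv[0], argv[1:]...), nil
}

func main() {
	// A hostile "filename" that would run a second command under sh -c.
	hostile := "notes.md; rm -rf ~"

	cmd, err := buildCommand([]string{"git", "diff", "--", hostile})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// The whole string is still a single argv entry.
	fmt.Printf("%q\n", cmd.Args)
}
```

By contrast, building `"git diff -- " + hostile` and handing it to `sh -c` would let the shell treat everything after the semicolon as a separate command.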

The zero-dependency constraint forced some interesting decisions. Instead of importing a YAML library, I wrote a minimal key-value parser for the config file format. Instead of pulling in a SQLite library for persistent token tracking, I kept tracking in memory per session. These are real trade-offs -- you lose some functionality -- but you gain a codebase that is fully auditable and has no third-party supply chain risk.
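For a sense of scale, a stdlib-only key-value parser can be very small. The sketch below assumes a `key = value` format with `#` comments; the actual rtk-go config syntax is not described here, so treat the format, the `parseConfig` name, and the sample keys as assumptions.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseConfig reads "key = value" lines, skipping blank lines and lines that
// start with '#'. The format is assumed for this sketch.
func parseConfig(src string) (map[string]string, error) {
	cfg := make(map[string]string)
	sc := bufio.NewScanner(strings.NewReader(src))
	for n := 1; sc.Scan(); n++ {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		key, value, ok := strings.Cut(line, "=")
		key = strings.TrimSpace(key)
		if !ok || key == "" {
			return nil, fmt.Errorf("line %d: expected key = value, got %q", n, line)
		}
		cfg[key] = strings.TrimSpace(value)
	}
	return cfg, sc.Err()
}

func main() {
	cfg, err := parseConfig("# hypothetical settings\nmax_diff_lines = 100\nshow_savings=true\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(cfg["max_diff_lines"], cfg["show_savings"])
}
```

What this gives up compared with a real YAML or TOML parser is nesting, lists, and typed values; every value is a string the caller must convert.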

The project includes 156 tests and passed 3 consecutive clean adversarial evaluation cycles (functionality, edge cases, documentation, security, and competitiveness reviews with zero bugs found across all rounds).

Performance Comparison

Here is where things got interesting. I benchmarked both tools on the same machine (Windows 11), same repository, same commands.

Binary Size

| | rtk (Rust) | rtk-go (Go) |
|--|-----------|-------------|
| Size | 7.9 MB | 3.9 MB |
| Dependencies | 22 crates | 0 |

Execution Speed and Token Reduction

| Command | Raw Tokens | rtk Tokens | rtk Time | rtk-go Tokens | rtk-go Time |
|---------|-----------|-----------|----------|--------------|-------------|
| git status | 84 | 19 (77% saved) | 539ms | 15 (82% saved) | 440ms |
| git log --oneline -20 | 364 | 364 (passthrough) | 491ms | 364 (passthrough) | 439ms |
| git diff HEAD~1 | 2367 | 2297 (3% saved) | 552ms | 2375 (passthrough) | 438ms |
| find -name "*.md" | ~33 | ~16 (52% saved) | 402ms | ~33 (passthrough) | 455ms |

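The Go version is about 11% faster on average across these four commands (443ms vs 496ms). It is faster on the three git commands and slower on find (455ms vs 402ms). Why it is faster is a guess rather than a profiled result. Language startup cost is an unlikely explanation on its own, since Rust binaries carry very little runtime. A more plausible cause is extra work rtk may do on each invocation, given that its dependency list includes TOML parsing, SQLite, and telemetry.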

The Go binary is about half the size (3.9MB vs 7.9MB), most likely because of the zero-dependency approach.

Token reduction is a mixed story. rtk has more aggressive filtering for find and diff output, saving 52% and 3% respectively where rtk-go passes through. But rtk-go actually beats rtk on git status (82% vs 77% savings). For commands where neither tool applies filtering (like git log --oneline), both correctly pass through unchanged.

On the security front: rtk-go uses exec.Command with explicit argument arrays. rtk uses sh -c in some paths. This is not a performance difference -- it is a vulnerability class that one tool has and the other does not.

What I Learned

The 80/20 Rule Applies to Filters

rtk has 72 filter modules covering everything from git status to aws cli to prisma. But by my estimate the top 10 filters -- git commands, grep, find, and the major test runners -- cover 95%+ of real-world usage. The remaining 62 modules serve increasingly niche use cases (how often does your LLM agent run psql or kubectl?).

This is a classic case where breadth of coverage creates maintenance burden without proportional value. rtk-go's 11 filters handle the commands that developers actually use daily, and the unified interface makes it trivial to add more as needed.

Rust's Complexity Was Not Justified Here

This is not a "Rust bad, Go good" argument. Rust is the right choice for performance-critical systems code, software that needs memory safety without a garbage collector, and projects where zero-cost abstractions matter. But a CLI output filter is none of those things. The program reads a command's output, applies some regex transformations, and writes to stdout. It does not need manual memory management, heavy concurrency, or fearless parallelism.

Go's simplicity -- goroutines you do not need, a GC you do not notice, and a stdlib that covers 90% of use cases -- is a better fit for this problem domain. The benchmark is consistent with that: Go is faster on three of the four commands, not despite being garbage-collected, but because the problem does not stress the areas where Rust excels.

Shell Injection is a Design Choice

rtk's shell injection vulnerability is not a bug in the traditional sense. It is a consequence of choosing sh -c as the execution mechanism for convenience. When you wrap arbitrary user commands, the easy path is to concatenate them into a shell string. The safe path is to parse the command into an executable and argument array and use exec.Command (or std::process::Command with explicit args).

The lesson is that security constraints should shape architecture from the start. Making exec.Command the only execution path in rtk-go was not an afterthought -- it was a design constraint that informed every decision about how commands are parsed and dispatched.

Token Counting via chars/4 is Surprisingly Accurate

Both projects estimate token count as len(output) / 4. This is a rough heuristic -- real tokenizers like tiktoken produce different counts depending on the text. But for the purpose of showing "you saved X% of tokens," it is good enough. The relative savings percentage is what matters, and chars/4 tracks real token counts closely enough for English text.
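The heuristic and the savings figure fit in a few lines. This is a sketch under the stated `len(output) / 4` rule; the function names are hypothetical. Note that Go's `len` on a string counts bytes, not characters, so non-ASCII output is slightly overestimated.

```go
package main

import "fmt"

// estimateTokens applies the chars/4 heuristic. len counts bytes, which
// equals characters only for ASCII text.
func estimateTokens(s string) int {
	return len(s) / 4
}

// savedPercent reports how much of the estimated token count the filter
// removed, as a percentage of the raw estimate.
func savedPercent(raw, filtered string) float64 {
	r := estimateTokens(raw)
	if r == 0 {
		return 0 // nothing to save; also avoids dividing by zero
	}
	return 100 * float64(r-estimateTokens(filtered)) / float64(r)
}

func main() {
	raw := "On branch main\nChanges not staged for commit:\n\tmodified:   a.go\n"
	filtered := "1 modified"
	fmt.Printf("%d -> %d tokens (%.0f%% saved)\n",
		estimateTokens(raw), estimateTokens(filtered), savedPercent(raw, filtered))
}
```

Because the same divisor applies to both the raw and the filtered text, any constant error in the heuristic largely cancels out of the percentage.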

Importing a tokenizer library would break the zero-dependency constraint and add startup latency for marginal accuracy improvement. Sometimes good enough really is good enough.

The Hook System is the Real Product

The biggest gap in rtk-go is not any specific filter -- it is rtk's hook system. rtk can integrate transparently with LLM agents by rewriting commands via PreToolUse hooks before they execute, so the model only ever sees the filtered output. The user never has to type rtk as a prefix; the tool inserts itself automatically.

Without hooks, rtk-go requires manual prefixing of every command. This is the difference between a tool that developers adopt and a tool that developers try once and forget. It is the most important feature to add next.

Try It

The code is at github.com/JSLEEKR/rtk-go. Install with:

go install github.com/JSLEEKR/rtk-go/cmd/rtk-go@latest

Or build from source:

git clone https://github.com/JSLEEKR/rtk-go.git
cd rtk-go
go build -o rtk-go ./cmd/rtk-go

It is a single binary, zero dependencies, and it works on Linux, macOS, and Windows. If you use LLMs for coding and want to cut your token costs, give it a try.