prompt-vault
Git-native prompt version control CLI. Version, diff, test, and share LLM prompts as YAML files.
README
prompt-vault
Git-native prompt version control CLI
Why
| Pain Point | What Happens | prompt-vault Solution |
|---|---|---|
| Prompts live in app code | Changes are buried in diffs, no dedicated history | Dedicated .prompts/ directory with YAML files and full version history |
| No rollback | A bad edit means manually reverting from memory | Automatic versioning with one-command rollback |
| No testing | You find out a prompt is broken in production | Test suites with 6 assertion types, snapshot testing, and LLM provider integration |
How It Works
your-project/
.prompts/ <-- prompt-vault lives here
greeting.yaml <-- prompt definition (YAML)
greeting.test.yaml <-- test suite (optional)
.vault/
history.json <-- version history + tags
snapshots/ <-- test snapshots
bundles/ <-- packed bundles for sharing
Workflow
add edit render test
| | | |
v v v v
+---------+ +---------+ +-----------+ +------------+
| YAML | | Version | | Variable | | Assertions |
| Create | | Bump | | Substitut.| | + Snapshots|
+---------+ +---------+ +-----------+ +------------+
| | | |
+------+------+ | |
| | |
v v v
.vault/history.json RenderedPrompt TestResults
Features
| Category | Feature | Details |
|---|---|---|
| Storage | YAML-based prompts | Human-readable, git-friendly format |
| Versioning | Auto-increment versions | Every edit bumps the version automatically |
| History | Full audit trail | Track create, update, delete, rollback actions |
| Tagging | Named versions | Tag any version (e.g., production, stable) |
| Rendering | Variable substitution | {{variable}} syntax with defaults, escaping, and strict mode |
| Diffing | Line-level diffs | LCS-based diff with colored terminal output |
| Testing | 6 assertion types | contains, notContains, matches, notEmpty, maxTokens, maxLength |
| Snapshots | Baseline comparison | Detect output drift across LLM provider changes |
| Providers | Pluggable LLM providers | Fetch-based OpenAI and Anthropic (no SDK deps), mock provider for tests, extensible via TestProvider interface |
| Search | Full-text search | Search prompts by keyword across name, system, user, tags, and variables |
| Export | Multi-format export | Export prompts as JSON, TypeScript, or Markdown |
| Cloning | Prompt duplication | Clone any prompt under a new name with reset version |
| Comparing | Cross-prompt diff | Compare two different prompts side by side |
| Sharing | Pack/import bundles | Export prompts with history and tests as portable JSON bundles |
| Validation | Name and schema checks | Enforced naming rules and required field validation |
| CLI | 19 commands | Full workflow from init to stats, with end-to-end CLI workflow tests |
Quick Start
# Initialize a vault in your project
prompt-vault init
# Create a new prompt
prompt-vault add my-assistant --model gpt-4o
# Render with variables
prompt-vault render my-assistant --var input="Hello, world!"
# See what changed
prompt-vault diff my-assistant
Prompt YAML Schema
name: code-reviewer
model: gpt-4o
version: 1
tags:
- engineering
- review
variables:
- name: language
description: Programming language
default: TypeScript
- name: code
description: Code to review
system: |
You are a senior {{language}} code reviewer.
Focus on correctness, performance, and readability.
user: |
Review this code:
```{{language}}
{{code}}
**Fields:**
| Field | Type | Required | Description |
|---|---|---|---|
| `name` | string | Yes | Unique identifier (`/^[a-z][a-z0-9_-]{0,63}$/`) |
| `model` | string | No | Target LLM model |
| `version` | number | Auto | Auto-incremented on each edit |
| `tags` | string[] | No | Categorization tags |
| `variables` | PromptVariable[] | No | Variable definitions with optional defaults |
| `system` | string | * | System message template |
| `user` | string | * | User message template |
\* At least one of `system` or `user` is required.
## CLI Commands
| Command | Description | Example |
|---|---|---|
| `init` | Initialize `.prompts/` directory | `prompt-vault init` |
| `add <name>` | Create a skeleton YAML prompt | `prompt-vault add my-prompt --model gpt-4o` |
| `list` | List all prompts in the vault | `prompt-vault list` |
| `show <name>` | Display prompt contents as YAML | `prompt-vault show my-prompt` |
| `edit <name>` | Update prompt from file or stdin | `prompt-vault edit my-prompt --file updated.yaml` |
| `diff <name>` | Diff current vs previous version | `prompt-vault diff my-prompt --version 2` |
| `log <name>` | Show version history | `prompt-vault log my-prompt` |
| `tag <name> <tag>` | Tag the current version | `prompt-vault tag my-prompt production` |
| `rollback <name>` | Restore previous version | `prompt-vault rollback my-prompt` |
| `render <name>` | Render prompt with variable substitution | `prompt-vault render my-prompt --var key=value` |
| `test <name>` | Run test suite for a prompt | `prompt-vault test my-prompt --update-snapshots` |
| `pack <name>` | Package prompt as a portable bundle | `prompt-vault pack my-prompt` |
| `import <path>` | Import a prompt bundle | `prompt-vault import ./bundle.json --force` |
| `validate <name>` | Validate prompt schema and variables | `prompt-vault validate my-prompt` |
| `search <query>` | Search prompts by keyword | `prompt-vault search "assistant"` |
| `export <name>` | Export prompt in a specified format | `prompt-vault export my-prompt --format typescript` |
| `clone <source> <target>` | Clone a prompt under a new name | `prompt-vault clone my-prompt my-prompt-v2` |
| `compare <name1> <name2>` | Compare two different prompts | `prompt-vault compare prompt-a prompt-b` |
| `stats` | Show vault statistics | `prompt-vault stats` |
**Global option:** `--vault <path>` to specify a custom vault directory (default: `./.prompts`).
## Recording Modes / History
Every mutation (create, update, delete, rollback) is recorded in `.vault/history.json` with:
- **UUID** -- unique entry identifier
- **Version** -- auto-incremented per prompt
- **Action** -- what happened (`create`, `update`, `delete`, `rollback`)
- **Timestamp** -- ISO 8601
- **Content** -- full prompt snapshot at that version
- **Tags** -- user-assigned labels
```bash
# View history
prompt-vault log my-prompt
# Tag a version
prompt-vault tag my-prompt stable
# Roll back to previous version
prompt-vault rollback my-prompt
Variable Substitution
Variables use {{name}} syntax in system and user fields.
variables:
- name: language
default: Python
- name: task
description: What to do
system: "You are a {{language}} expert."
user: "{{task}}"
# Uses default for language, provides task
prompt-vault render my-prompt --var task="Write a sort function"
# Override default
prompt-vault render my-prompt --var language=Rust --var task="Write a sort function"
Strict mode: Pass { strict: true } to render() to throw a VariableError when any required variable is missing (no default and not provided). In the CLI, use prompt-vault render <name> --strict.
Escaping: Use \{{ and \}} to output literal double braces without substitution.
Validation: prompt-vault validate <name> checks for:
- Missing variables (declared without default, not provided)
- Undefined variables (used in template but not declared)
Diffing
Line-level diff powered by LCS (Longest Common Subsequence):
# Diff current vs previous version
prompt-vault diff my-prompt
# Diff current vs specific version
prompt-vault diff my-prompt --version 2
Output shows colored terminal diff with + (added), - (removed) prefixes across system, user, variables, and model sections.
Testing
Create a test file alongside your prompt:
# my-prompt.test.yaml
provider: openai
model: gpt-4o
cases:
- name: basic test
variables:
input: "Hello"
assertions:
- type: notEmpty
- type: contains
value: "hello"
- type: maxLength
value: 500
snapshot: baseline
6 assertion types:
| Type | Value | Description |
|---|---|---|
| contains | string | Output must contain the value |
| notContains | string | Output must NOT contain the value |
| matches | regex | Output must match the regex pattern |
| notEmpty | -- | Output must not be empty |
| maxTokens | number | Word count must not exceed value |
| maxLength | number | Character length must not exceed value |
Snapshots: Record a baseline output and detect drift on subsequent runs. Use --update-snapshots to refresh baselines.
Providers: Built-in openai and anthropic providers use the native fetch API -- no SDK dependencies required. Tests use a lightweight mock provider for deterministic, offline assertions. Implement the TestProvider interface to add custom providers.
Environment variables:
| Provider | Env Variable | Default Model |
|---|---|---|
| OpenAI | OPENAI_API_KEY | gpt-4o |
| Anthropic | ANTHROPIC_API_KEY | claude-sonnet-4-20250514 |
Pack & Import
Share prompts with full history and tests:
# Export a prompt as a bundle
prompt-vault pack my-prompt
# Creates: .prompts/.vault/bundles/my-prompt-v3.bundle.json
# Import on another machine
prompt-vault import ./my-prompt-v3.bundle.json
# Force overwrite if prompt already exists
prompt-vault import ./my-prompt-v3.bundle.json --force
Bundles include: prompt YAML, full version history, and test suite (if present).
Practical Tips
| Tip | Details |
|---|---|
| Commit .prompts/ to git | Prompts are plain YAML -- they diff and merge naturally |
| Tag before deploying | prompt-vault tag my-prompt production marks a known-good version |
| Use defaults for common vars | Reduces --var flags in everyday rendering |
| Write tests early | Catch regressions before they reach users |
| Pack before sharing | Bundles carry history, so recipients get full context |
| Validate in CI | prompt-vault validate <name> exits non-zero on issues |
API Reference
Core Classes
| Class | Constructor | Key Methods |
|---|---|---|
| PromptStore | new PromptStore(basePath, history?) | init(), add(data), get(name), list(), update(name, partial), remove(name), exists(name), validate(name), clone(source, target), getBasePath(), getStats() |
| HistoryManager | new HistoryManager(filePath) | load(), record(name, action, old, new), getHistory(name), getVersion(name, v), getLatestVersion(name), rollback(name), addTag(name, v, tag) (async), getByTag(name, tag) |
| PromptRenderer | new PromptRenderer() | render(prompt, variables, options?), validate(prompt, variables) |
| PromptDiffer | new PromptDiffer() | diff(a, b), format(diffResult) |
| TestRunner | new TestRunner(providers, store, renderer, vaultPath) | run(promptName, options?) |
| PromptRegistry | new PromptRegistry(registryPath) | pack(name, store, history), import(bundlePath, store, history, options?) |
Plugin System
| Interface / Class | Description |
|---|---|
| TestProvider | Interface: name, defaultModel, validateConfig(config), run(prompt, model) |
| ProviderRegistry | register(provider), get(name), list() |
| OpenAIProvider | Built-in provider for OpenAI (default model: gpt-4o) |
| AnthropicProvider | Built-in provider for Anthropic (default model: claude-sonnet-4-20250514) |
Utility Functions
| Function | Module | Description |
|---|---|---|
| validatePromptName(name) | core/validator | Throws InvalidPromptError if the name is invalid |
| validatePromptData(data) | core/validator | Throws InvalidPromptError if the schema is invalid |
| evaluateAssertion(output, assertion) | core/test-runner | Evaluates a single test assertion against output |
| searchPrompts(store, query) | core/search | Full-text search across all prompts (used by CLI) |
| exportToJSON(prompt) | core/exporter | Export a prompt as formatted JSON string |
| exportToTypeScript(prompt) | core/exporter | Export a prompt as a TypeScript const export |
| exportToMarkdown(prompt) | core/exporter | Export a prompt as a Markdown document |
Constants
| Constant | Value | Description |
|---|---|---|
| PROMPT_NAME_REGEX | /^[a-z][a-z0-9_-]{0,63}$/ | Valid prompt name pattern |
| MAX_PROMPT_NAME_LENGTH | 64 | Maximum name length |
Types
| Type | Description |
|---|---|
| PromptData | Core prompt structure (name, model, version, tags, variables, system, user) |
| PromptVariable | Variable definition (name, description?, default?) |
| HistoryEntry | Single history record (id, promptName, version, action, timestamp, content, tags) |
| HistoryFile | Container for history entries |
| RenderedPrompt | Rendered output (system, user) |
| ValidationIssues | Variable validation results (missing, unused, undefinedVars) |
| DiffLine | Single diff line (type: added/removed/unchanged, line) |
| PromptDiffResult | Full diff result (system, user, variables, model) |
| TestAssertion | Test assertion (type, value?) |
| TestCase | Test case (name, variables, assertions, snapshot?) |
| TestSuite | Test suite (provider, model?, cases) |
| TestCaseResult | Single test case result (name, passed, error?, output?, assertions) |
| TestRunResult | Full test run result (suiteName, passed, cases, duration) |
| TestResults | Aggregated results (total, passed, failed, runs) |
| BundleData | Portable bundle (name, version, exportedAt, prompt, history, tests?) |
| VaultStats | Vault statistics (promptCount, totalVersions, avgVersionsPerPrompt, variableUsage, tags) |
| ProviderRunResult | Provider response (output, tokenUsage, latency, model) |
| SearchResult | Search hit (promptName, matches: {field, line}[]) |
Error Classes
| Error | Thrown When |
|---|---|
| PromptNotFoundError | Prompt does not exist in the vault |
| PromptAlreadyExistsError | Prompt name is already taken |
| VaultNotInitializedError | .prompts/ directory does not exist |
| InvalidPromptError | Name or schema validation fails |
| VariableError | Required variables are missing |
| TestProviderError | Provider configuration or execution fails |
| VersionNotFoundError | Requested version does not exist |
| BundleError | Bundle file is invalid or unreadable |
License
MIT -- 2026 JSLEEKR