// programmable AI agent

The AI agent you actually control.

We tried every coding agent we could find. They all hide the loop. You write a prompt and the vendor decides everything else, from which model runs to what happens when it fails. So we built Knockout. The loop is yours.


// your first run

What happens in your first session

0 to 10 seconds

You open a terminal in a real project and type one line of intent:

$ ko "add OAuth with Google"

10 to 60 seconds

Three models debate the approach, then the best coder writes the code. A fast model runs the tests, and if anything fails the pipeline loops back automatically.

After

The diff is on your filesystem. The running cost is in the status bar (usually under fifty cents). Memory is written before the session ends, so the next prompt starts where this one left off.

The tutorial walks through one full feature end-to-end. It takes five minutes, on free pooled credits or your own keys.

// 01 · architecture

The pipeline is the product

Knockout exposes the agent loop. Every step in the pipeline has its own model, its own tool allowlist, its own retry budget and a transition rule that fires when it ends, and you can swap any of them without touching the others. You design it in a drag-and-drop graph, write it as JSON, or use the card view. The loop is yours, not the vendor's.

pipeline.json, KO_STANDARD_MULTILLM

PLAN
3 models from 3 vendors debate your approach. Consensus required.
claude-sonnet-4.6 + gpt-5 + gemini-2.5-pro · strategy: consensus

ACTION
Best coder implements the plan. Full file access + shell.
claude-sonnet-4.6 · tools: Bash, Read, Write, Edit, Glob, Grep

VERIFY
Fast model runs tests, checks quality. Fails? Loop back to ACTION.
gemini-2.5-flash · tools: Bash, Read, Grep (read-only) · on_fail → ACTION

Add steps, change models, set budgets and tool allowlists per step, attach pre and post hooks. Visually.
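
To make the PLAN, ACTION, VERIFY loop concrete, here is a minimal Python sketch. It is not Knockout's engine and not the real pipeline.json schema: the field names (`models`, `tools`, `retry_budget`, `on_success`, `on_fail`), the retry numbers, the empty tool list for PLAN and the `run_pipeline` function are all assumptions made for illustration. Only the step names, models and tool lists come from the card above.

```python
import json
from typing import Callable, Optional

# Hypothetical pipeline definition. Field names and retry budgets are
# illustrative assumptions, not Knockout's actual schema.
PIPELINE = {
    "name": "KO_STANDARD_MULTILLM",
    "entry": "PLAN",
    "steps": {
        "PLAN": {
            "models": ["claude-sonnet-4.6", "gpt-5", "gemini-2.5-pro"],
            "strategy": "consensus",
            "tools": [],  # assumption: the card lists no tools for PLAN
            "retry_budget": 0,
            "on_success": "ACTION",
            "on_fail": None,
        },
        "ACTION": {
            "models": ["claude-sonnet-4.6"],
            "tools": ["Bash", "Read", "Write", "Edit", "Glob", "Grep"],
            "retry_budget": 2,
            "on_success": "VERIFY",
            "on_fail": None,
        },
        "VERIFY": {
            "models": ["gemini-2.5-flash"],
            "tools": ["Bash", "Read", "Grep"],
            "retry_budget": 2,
            "on_success": None,   # nothing after VERIFY: the run is done
            "on_fail": "ACTION",  # failed tests loop back to ACTION
        },
    },
}


def run_pipeline(pipeline: dict, execute: Callable[[str, dict], bool]) -> list[str]:
    """Follow transition rules from the entry step until a step has no successor.

    `execute(step_name, step_config)` returns True on success. A step may run
    once plus `retry_budget` more times; going past that aborts the run.
    Returns the ordered list of steps that were executed.
    """
    runs: dict[str, int] = {}
    trace: list[str] = []
    current: Optional[str] = pipeline["entry"]
    while current is not None:
        step = pipeline["steps"][current]
        runs[current] = runs.get(current, 0) + 1
        if runs[current] > 1 + step["retry_budget"]:
            raise RuntimeError(f"{current} exceeded its retry budget")
        trace.append(current)
        if execute(current, step):
            current = step["on_success"]
        elif step["on_fail"] is not None:
            current = step["on_fail"]
        else:
            raise RuntimeError(f"{current} failed and has no on_fail transition")
    return trace


if __name__ == "__main__":
    # Stubbed executor: the first test run fails, the second passes.
    verify_results = iter([False, True])

    def fake_execute(name: str, step: dict) -> bool:
        return next(verify_results) if name == "VERIFY" else True

    print(run_pipeline(PIPELINE, fake_execute))
    print(json.dumps(PIPELINE["steps"]["VERIFY"], indent=2))
```

Because each step carries its own model list, tool allowlist and transitions, changing one entry in `steps` leaves the others untouched, which is the property the section describes.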

// 02 · versatility

Same CLI. Different team.

The same ko command runs a different pipeline and behaves like a different expert. 9 pipelines ship today, all coding-focused. Legal, finance, business, research, HR and creative pipelines are in design.

TDD developer

KO_TDD

WriteTest, Implement, Refactor, Verify. Tests first, code second.

$ ko "Add OAuth with Google and GitHub"

Security auditor

KO_SECURE

Plan (consensus), Implement, SecurityAudit, Test, Verify.

$ ko "Add session auth, harden it"

Consensus builder

KO_consensus

3 frontier models reason in parallel, a judge picks the winner.

$ ko "Redesign the cache layer for 10x reads"

Code reviewer

KO_review

Read-only Analyze, Review, Report. Never edits a file.

$ ko "Review the diff before I push"
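
The pattern behind the Consensus builder card (several models answer in parallel, then a judge picks one) can be sketched in a few lines of Python. The model callables and the judge below are stand-ins invented for this example. Nothing here is Knockout's API, and a real judge would be another model call rather than a length heuristic.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

Model = Callable[[str], str]
Judge = Callable[[str, dict[str, str]], str]


def consensus(prompt: str, models: dict[str, Model], judge: Judge) -> tuple[str, str]:
    """Run every model on the prompt in parallel, then let the judge pick a winner.

    Returns (winning_model_name, winning_proposal).
    """
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        futures = {name: pool.submit(fn, prompt) for name, fn in models.items()}
        proposals = {name: future.result() for name, future in futures.items()}
    winner = judge(prompt, proposals)
    return winner, proposals[winner]


def shortest_plan_judge(prompt: str, proposals: dict[str, str]) -> str:
    # Placeholder heuristic: prefer the shortest proposal, ties broken by name.
    return min(proposals, key=lambda name: (len(proposals[name]), name))


# Hypothetical stand-ins for three vendor models.
MODELS: dict[str, Model] = {
    "model_a": lambda p: f"Add a read-through cache in front of the database for: {p}",
    "model_b": lambda p: f"Shard the cache: {p}",
    "model_c": lambda p: f"Rewrite everything from scratch to handle: {p}",
}

if __name__ == "__main__":
    name, plan = consensus("Redesign the cache layer for 10x reads", MODELS, shortest_plan_judge)
    print(name, "->", plan)
```

Keeping the judge as a separate callable means the selection rule can change without touching the fan-out code.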

// 03 · model routing

The right model for each step

A frontier model on every step is a waste of money. Cheap models can read a codebase. Fast models can run tests. Save the expensive brain for the work that needs it. Knockout lets you assign a model to every step in the pipeline.

Cheap model: research

Cheap models read the codebase and gather context. Gemini Flash, DeepSeek V3. Pennies per task and they handle this fine.

Gemini 2.5 Flash, $0.30/M input

Powerful model: execute

Frontier models do the actual work. Writing the code that ships, reasoning about hard tradeoffs. The expensive brain earns its keep here.

Claude Sonnet 4, $3/M input

Fast model: verify

Quick models run your tests and check the output. If something fails, the pipeline loops back automatically. Latency under a second.

GPT-4o mini, $0.15/M input

A typical run costs about 20% of what it would on a frontier-only setup. Quality holds because cheap models only get the work they can handle.
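
A back-of-the-envelope calculation shows where a figure in that range can come from. The per-million input prices are the ones quoted above. The token counts per step are assumptions picked for illustration, and output tokens are ignored, so treat the result as a sketch rather than a benchmark.

```python
# Input prices in USD per million tokens, as quoted in this section.
PRICE_PER_M = {
    "gemini-2.5-flash": 0.30,  # research
    "claude-sonnet-4": 3.00,   # execute
    "gpt-4o-mini": 0.15,       # verify
}

# Assumed input-token counts per step for one task (illustrative only).
TOKENS = {"research": 200_000, "execute": 40_000, "verify": 60_000}

# Which model handles which step when routing is on.
ROUTING = {
    "research": "gemini-2.5-flash",
    "execute": "claude-sonnet-4",
    "verify": "gpt-4o-mini",
}


def input_cost(tokens: int, model: str) -> float:
    """Cost in USD of sending `tokens` input tokens to `model`."""
    return tokens / 1_000_000 * PRICE_PER_M[model]


def routed_cost() -> float:
    """Total input cost when each step uses its assigned model."""
    return sum(input_cost(TOKENS[step], ROUTING[step]) for step in TOKENS)


def frontier_only_cost() -> float:
    """Total input cost when every step runs on the frontier model."""
    return sum(input_cost(tokens, "claude-sonnet-4") for tokens in TOKENS.values())


if __name__ == "__main__":
    routed, frontier = routed_cost(), frontier_only_cost()
    print(f"routed: ${routed:.3f}, frontier-only: ${frontier:.3f}, ratio: {routed / frontier:.0%}")
    # prints: routed: $0.189, frontier-only: $0.900, ratio: 21%
```

Most of the saving in this example comes from the research step, which carries the bulk of the input tokens and is the cheapest to route elsewhere.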

// 04 · in action

Same session, new task

Knockout remembers the codebase and picks up where it left off.

// 05 · memory

Memory that survives

Every other agentic CLI forgets you between sessions. Knockout indexes your repo, remembers your prior work, keeps long sessions sharp and gives you a forensic record you can actually search.

Workspace memory

ko workspace init

ko workspace init takes about 30 seconds on a typical repo. After that the agent finds files by concept, not just by symbol name. Ask it about "the auth flow" and it returns the right handlers even when none of them contain the word "auth".

An fs-watcher debounces changes and re-embeds on the fly, so the index stays current as you edit. Cursor indexes too, but it grep-shapes the query. We embed it. That's why concept queries land.
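
Here is a toy illustration of why an embedding query can land on files that never mention the search term. The three-dimensional vectors are hand-written stand-ins, the file paths are hypothetical, and a real index would get its vectors from an embedding model. This is a sketch of the general technique (cosine similarity ranking), not Knockout's indexer.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Toy "embeddings" with axes (identity/login, billing, UI). Hand-written values.
INDEX = {
    "handlers/login.ts": [0.9, 0.1, 0.2],
    "handlers/session_refresh.ts": [0.8, 0.0, 0.1],
    "billing/invoice.ts": [0.1, 0.9, 0.1],
    "ui/navbar.tsx": [0.1, 0.1, 0.9],
}


def search(query_vec: list[float], index: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the k paths whose vectors are most similar to the query vector."""
    ranked = sorted(index, key=lambda path: cosine(query_vec, index[path]), reverse=True)
    return ranked[:k]


# Stand-in for the embedding of the query "the auth flow".
QUERY = [1.0, 0.0, 0.1]

if __name__ == "__main__":
    print(search(QUERY, INDEX))
    # prints: ['handlers/session_refresh.ts', 'handlers/login.ts']
```

Neither result contains the string "auth". They rank first because their vectors point in nearly the same direction as the query.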

Cross-session memory

Turn it on once with ko memory enable and every session feeds the next. The summary lives server-side so it follows you to whichever laptop you opened today.

Cursor, Claude Code, Cline, Aider. They all start each session at zero. ko gets sharper the more you use it.

Long sessions stay sharp

Any tool output above 8 KB gets offloaded to a per-session vector store and replaced with a short stub the agent can re-retrieve on demand. Context stops compounding. Quality stops sliding three hours in.

There is nothing to configure. It is always on, and the offloaded data is gone in 24 hours.
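
The offloading idea can be sketched as follows. The 8 KB threshold comes from the text above. The `SessionStore` class, the stub format and the function name are assumptions, and the store here is a plain dictionary keyed by hash, whereas the product describes a vector store with on-demand retrieval.

```python
import hashlib

OFFLOAD_THRESHOLD_BYTES = 8 * 1024  # the 8 KB cutoff mentioned above


class SessionStore:
    """In-memory stand-in for a per-session store (hypothetical, no embeddings)."""

    def __init__(self) -> None:
        self._blobs: dict[str, str] = {}

    def put(self, text: str) -> str:
        """Store the text and return a short content-derived reference key."""
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()[:12]
        self._blobs[key] = text
        return key

    def get(self, key: str) -> str:
        """Fetch the full text previously stored under `key`."""
        return self._blobs[key]


def compact_tool_output(output: str, store: SessionStore) -> str:
    """Return the output unchanged if small, else store it and return a short stub."""
    size = len(output.encode("utf-8"))
    if size <= OFFLOAD_THRESHOLD_BYTES:
        return output
    key = store.put(output)
    preview = output[:80].replace("\n", " ")
    return f"[offloaded {size} bytes, ref={key}] {preview}..."


if __name__ == "__main__":
    store = SessionStore()
    log = "test output line\n" * 2000  # well over 8 KB
    stub = compact_tool_output(log, store)
    print(len(log), "->", len(stub), "characters kept in context")
```

The context only ever holds the stub, so a long test log costs roughly the same as a one-line result until the agent asks for the full text again.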

Every action is searchable

Every tool call from every session is indexed. ko audit search "prod_secrets" answers "did the agent really not touch that file last Tuesday" in one query. Retention defaults to 90 days. Bump it to 7 years per scope when compliance asks.

Other CLIs give you a transcript. This is the answer to the first enterprise question about agentic dev.

Scopes

Tag sessions with whatever axes you actually care about. Project, release, customer, severity. Settings cascade per scope, so prod gets long retention and a sensitive customer gets --scope-widening-locked=true, a server-enforced wall the agent can't talk its way around.

Searches respect scope filters too. No more grepping all your projects to find one staging incident.
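
One way to picture cascading scope settings is as defaults plus per-scope overrides. The sketch below assumes tags are applied in order with later tags winning. The tag names (`env=prod`, `customer=acme`), the setting keys and the merge rule are all hypothetical. Only the 90-day default, the 7-year retention option and the locked scope-widening setting come from the text.

```python
# Defaults: 90-day retention as stated above, scope widening not locked.
DEFAULTS = {"retention_days": 90, "scope_widening_locked": False}

# Hypothetical per-scope overrides, keyed by "axis=value".
SCOPE_SETTINGS = {
    "env=prod": {"retention_days": 7 * 365},          # roughly 7 years
    "customer=acme": {"scope_widening_locked": True},  # sensitive customer
}


def resolve_settings(tags: list[str]) -> dict:
    """Start from the defaults and apply each tag's overrides in order."""
    settings = dict(DEFAULTS)
    for tag in tags:
        settings.update(SCOPE_SETTINGS.get(tag, {}))
    return settings


if __name__ == "__main__":
    print(resolve_settings(["env=prod", "customer=acme"]))
    # prints: {'retention_days': 2555, 'scope_widening_locked': True}
    print(resolve_settings(["env=staging"]))
    # prints: {'retention_days': 90, 'scope_widening_locked': False}
```

In the product the lock is described as server-enforced, so a client-side merge like this one only illustrates how the values are resolved, not how they are protected.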

// 06 · comparison

Why people switch

BEFORE

One model, one approach. Hope it works.

AFTER

19 providers, dozens of models. Assign one per step.

BEFORE

Black-box agent. You don't know what it's doing.

AFTER

Visual pipeline. Every step, every model, every transition rule.

BEFORE

Pay $20-200/mo for tools you use 3 times a week.

AFTER

Pay per token. A full feature usually lands under $0.50. $0 when idle.

BEFORE

Your code goes through their servers. Just trust them.

AFTER

Encrypted with a password we never see. The server stores opaque blobs.

BEFORE

Works in one IDE. Locked in.

AFTER

CLI in any terminal, VS Code, JetBrains, Cursor, Windsurf, any MCP client.

BEFORE

Memory resets every session.

AFTER

Memory persists across sessions and syncs across your devices.

// 07 · templates

65 project templates

A wizard asks you 5 to 10 questions about your business and audience, then builds a project that fits. Not rigid starters.

10 · Business · Corporate, Agency, Law firm
10 · E-Commerce · Clothing, Crafts, Electronics
8 · SaaS · Analytics, CRM, Project mgmt
6 · Blogs · Tech, Travel, Podcast hub
6 · Community · Forum, Events, Q&A
6 · Education · Courses, Quiz, Tutoring
5 · Health · Fitness, Nutrition, Meditation
4 · Productivity · Kanban, Habits, Expense
5 · Creative · Photo, Music, Design
5 · Food · Restaurant, Cafe, Delivery

// 08 · capabilities

Built for power users

12 agent tools

Bash, file I/O, web search, web fetch, git diffs, clipboard, Docker, SQL queries. Everything you'd reach for in a real terminal.

Encrypted sync

Move projects between machines. The server stores opaque blobs encrypted with a password we never see.

Persistent memory

Global and project-level memory survives across sessions. Syncs across your laptop, desktop and phone, end-to-end encrypted.

Skills, with Skill Seekers

Three skills ship pre-installed (Superpowers, Context7, Security Guidance). Skill Seekers turns any docs into a working skill in minutes.

Works in your editor

CLI in any terminal, VS Code, JetBrains, Cursor, Windsurf, anything that speaks MCP. No vendor lock-in.

Permission control

Interactive prompts for every action. Auto-allow the ops you've already approved. Full control over what the agent can touch.

Scoped tokens, audited access

Per-user permissions. Scoped API tokens. OAuth plus headless auth. No dangerously-skip-permissions flag.

19 AI providers

OpenAI, Anthropic, Google, DeepSeek, Mistral, xAI, Cohere, Groq, Together, Fireworks. Bring your own keys or use pooled credits.

// 09 · vs. the rest

Knockout vs. everything else

Feature | Knockout | Claude Code | Cursor | Copilot
AI models | 19 providers, any model | Anthropic only | Limited | OpenAI only
You control the loop | Yes, visual pipeline editor | No | No | No
Multi-model per task | Yes: consensus, judge, chain, fallback | No | No | No
Vertical pipelines (in design) | Legal, finance, HR, research | No | No | No
Encrypted code storage | Yes, password we never see | No | No | No
Cross-machine sync | Yes, encrypted | No | No | No
Skills | 3 shipped, Skill Seekers builds more | Plugins | Limited | Extensions
Pricing | Pay per token | $20/mo | $20/mo | $10-19/mo
Editor support | CLI + VS Code + any MCP client | CLI | Cursor only | VS Code/JB

// 10 · pricing

Honest pricing

Pay per token. You see the cost of every call before it runs and the running total in the status bar.

$0.01-0.05 · Quick fix · Bug fix, rename, add a field
$0.05-0.50 · Full feature · Auth, billing, API, CRUD
$0.50-2.00 · Bespoke project · Full app from template wizard

The economy pipeline routes everything through budget models. The consensus pipeline routes three frontier models in parallel. Pick the pipeline that matches what the work actually needs.

// 11 · honest answers

Questions we get asked

Does my code go through your servers?

The agent runs in your terminal, on your filesystem. The vault password lives only on your machine and the platform stores opaque encrypted blobs. We never see your code or your provider keys.

What if my favorite model isn't supported?

We wire 19 providers today, including OpenAI, Anthropic, Google, Mistral, Groq, Fireworks, Together, Cerebras, OpenRouter and Ollama for local models. The OpenAI-compatible adapter usually gets you to the rest. Tell us if you hit one that doesn't.

What does a real session cost?

A quick fix usually runs under five cents. A full feature usually runs under fifty cents. You see the running total in the status bar and a cost projection before any expensive call. You can also set a hard cap per pipeline step.

Subscription?

No subscription. Top up with Stripe and pay raw provider cost, or bring your own keys and pay the providers directly. We never charge you when ko is idle.

Design the agent you wish you had

You design the pipeline. The agent runs it.

Your pipeline, your models, your tools and your rules. Whichever editor you already use.

No subscription, no credit card on signup. You pay only for what your models actually burn.