// programmable AI agent
The AI agent you actually control.
We tried every coding agent we could find. They all hide the loop. You write a prompt and the vendor decides everything else, from which model runs to what happens when it fails. So we built Knockout. The loop is yours.
// your first run
You open a terminal in a real project and type one line of intent: ko "add OAuth with Google"
Three models debate the approach, then the best coder writes the code. A fast model runs the tests, and if anything fails the pipeline loops back automatically.
The diff is on your filesystem. The running cost is in the status bar (usually under fifty cents). Memory is written before the session ends, so the next prompt starts where this one left off.
The tutorial walks one full feature end-to-end. Five minutes, free pooled credits or bring your own keys.
Knockout exposes the agent loop. Every step in the pipeline has its own model, its own tool allowlist, its own retry budget and a transition rule that fires when it ends, and you can swap any of them without touching the others. You design it in a drag-and-drop graph, write it as JSON, or use the card view. The loop is yours, not the vendor's.
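As a rough sketch of what "every step has its own model, allowlist, retry budget and transition rule" could look like as data: the field names below (`model`, `tools`, `max_retries`, `on_success`, `on_failure`) and the model labels are illustrative assumptions, not Knockout's actual pipeline schema.

```python
import json
from dataclasses import dataclass, asdict
from typing import List, Optional

@dataclass
class Step:
    # All field names are hypothetical, chosen only to mirror the prose above.
    name: str
    model: str
    tools: List[str]           # tool allowlist for this step
    max_retries: int           # retry budget
    on_success: Optional[str]  # next step when this one ends cleanly
    on_failure: Optional[str]  # next step when this one fails

def swap_model(steps, step_name, new_model):
    """Return a new list with one step's model replaced; other steps are reused as-is."""
    return [
        Step(**{**asdict(s), "model": new_model}) if s.name == step_name else s
        for s in steps
    ]

pipeline = [
    Step("Plan", "frontier-model", ["read_file"], 1, "Implement", None),
    Step("Implement", "frontier-model", ["read_file", "write_file"], 2, "Test", None),
    Step("Test", "fast-model", ["bash"], 3, None, "Implement"),  # failure loops back
]

updated = swap_model(pipeline, "Test", "another-fast-model")
print(json.dumps([asdict(s) for s in updated], indent=2))
```

Swapping the model on one step leaves the other steps untouched, which is the property the paragraph above describes.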
The same ko command runs a different pipeline and behaves like a different expert. 9 pipelines ship today, all coding-focused. Legal, finance, business, research, HR and creative pipelines are in design.
KO_TDD
WriteTest, Implement, Refactor, Verify. Tests first, code second.
KO_SECURE
Plan (consensus), Implement, SecurityAudit, Test, Verify.
KO_consensus
3 frontier models reason in parallel, a judge picks the winner.
KO_review
Read-only Analyze, Review, Report. Never edits a file.
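The consensus idea (several models answer in parallel, a judge picks one) can be sketched in a few lines. This is a toy under stated assumptions: the "models" are plain functions standing in for provider calls, and the judge is a simple heuristic where a real one would itself be a model call. None of these names come from Knockout.

```python
from concurrent.futures import ThreadPoolExecutor

def run_consensus(prompt, models, judge):
    """Ask every model in parallel, then let the judge pick one proposal."""
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        proposals = list(pool.map(lambda m: m(prompt), models))
    return judge(prompt, proposals)

# Stand-in "models": hypothetical functions, not real provider clients.
def model_a(prompt):
    return f"A: short plan for {prompt}"

def model_b(prompt):
    return f"B: a much more detailed plan for {prompt}"

def model_c(prompt):
    return f"C: plan for {prompt}"

def longest_answer_judge(prompt, proposals):
    # Toy heuristic for illustration only: prefer the longest proposal.
    return max(proposals, key=len)

winner = run_consensus(
    "add OAuth with Google",
    [model_a, model_b, model_c],
    longest_answer_judge,
)
print(winner)
```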
A frontier model on every step is a waste of money. Cheap models can read a codebase. Fast models can run tests. Save the expensive brain for the work that needs it. Knockout lets you assign a model to every step in the pipeline.
Cheap models read the codebase and gather context. Gemini Flash, DeepSeek V3. Pennies per task and they handle this fine.
Gemini 2.5 Flash, $0.30/M input
Frontier models do the actual work. Writing the code that ships, reasoning about hard tradeoffs. The expensive brain earns its keep here.
Claude Sonnet 4, $3/M input
Quick models run your tests and check the output. If something fails, the pipeline loops back automatically. Latency under a second.
GPT-4o mini, $0.15/M input
Typical run costs about 20% of what it would on a frontier-only setup. Quality holds because cheap models only get the work they can handle.
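To see how a mixed pipeline can land near that figure, here is the arithmetic using the input prices quoted above. The token counts per stage are made-up assumptions for illustration, and output-token pricing is ignored, so treat the result as a shape, not a measurement.

```python
# Input prices per million tokens, as quoted in the text above.
PRICE_PER_M_INPUT = {
    "Gemini 2.5 Flash": 0.30,
    "Claude Sonnet 4": 3.00,
    "GPT-4o mini": 0.15,
}

# Hypothetical input-token counts per stage (assumed, not measured).
STAGE_TOKENS = [
    ("Gemini 2.5 Flash", 400_000),  # read the codebase, gather context
    ("Claude Sonnet 4", 60_000),    # write the code
    ("GPT-4o mini", 100_000),       # run tests, check output
]

def cost(model, tokens):
    """Input cost in dollars for `tokens` tokens on `model`."""
    return PRICE_PER_M_INPUT[model] * tokens / 1_000_000

mixed = sum(cost(m, t) for m, t in STAGE_TOKENS)
frontier_only = sum(cost("Claude Sonnet 4", t) for _, t in STAGE_TOKENS)
ratio = mixed / frontier_only

print(f"mixed ${mixed:.3f} vs frontier-only ${frontier_only:.2f} ({ratio:.0%})")
```

With this particular split the mixed run costs about 19% of the frontier-only run. A different split gives a different ratio.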
Knockout remembers the codebase and picks up where it left off.
Every other agentic CLI forgets you between sessions. Knockout indexes your repo, remembers your prior work, keeps long sessions sharp and gives you a forensic record you can actually search.
ko workspace init takes about 30 seconds on a typical repo. After that the agent finds files by concept, not just by symbol name. Ask it about "the auth flow" and it returns the right handlers even when none of them contain the word "auth".
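A minimal sketch of why concept queries can match files that never mention the query word, assuming files and queries are both mapped to vectors and ranked by cosine similarity. The three-dimensional vectors and file paths below are invented stand-ins; real embeddings come from a model and have hundreds of dimensions.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical index: path -> made-up embedding vector.
INDEX = {
    "handlers/login.py": [0.9, 0.1, 0.0],
    "handlers/session_token.py": [0.8, 0.2, 0.1],
    "billing/invoice.py": [0.0, 0.1, 0.9],
}

def search(query_vec, k=2):
    """Return the k paths whose vectors are closest to the query vector."""
    ranked = sorted(INDEX, key=lambda p: cosine(query_vec, INDEX[p]), reverse=True)
    return ranked[:k]

# Pretend this vector is the embedding of the query "the auth flow".
hits = search([1.0, 0.0, 0.0])
print(hits)
```

Neither returned path contains the word "auth"; they rank first only because their vectors sit close to the query's.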
An fs-watcher debounces changes and re-embeds on the fly, so the index stays current as you edit. Cursor indexes too, but it grep-shapes the query. We embed it. That's why concept queries land.
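Debouncing here means waiting until a file has been quiet for a moment before re-embedding it, so a burst of saves triggers one update instead of many. A small sketch, assuming a 500 ms quiet window and explicit timestamps (both are choices for illustration, not Knockout's actual settings):

```python
class Debouncer:
    """Release a path only after it has gone `quiet_ms` without a new change."""

    def __init__(self, quiet_ms):
        self.quiet_ms = quiet_ms
        self.last_change = {}  # path -> timestamp (ms) of the most recent change

    def record(self, path, now_ms):
        # A new change overwrites the old timestamp, which resets the timer.
        self.last_change[path] = now_ms

    def due(self, now_ms):
        """Return paths that are ready to re-embed and forget them."""
        ready = sorted(
            p for p, t in self.last_change.items() if now_ms - t >= self.quiet_ms
        )
        for p in ready:
            del self.last_change[p]
        return ready

d = Debouncer(quiet_ms=500)
d.record("auth/handlers.py", now_ms=0)
d.record("README.md", now_ms=100)
d.record("auth/handlers.py", now_ms=300)  # second save resets this file's timer

first = d.due(now_ms=600)   # README.md has been quiet for 500 ms
second = d.due(now_ms=900)  # handlers.py has now been quiet for 600 ms
print(first, second)
```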
Turn it on once with ko memory enable and every session feeds the next. The summary lives server-side so it follows you to whichever laptop you opened today.
Cursor, Claude Code, Cline, Aider. They all start each session at zero. ko gets sharper the more you use it.
Any tool output above 8 KB gets offloaded to a per-session vector store and replaced with a short stub the agent can re-retrieve on demand. Context stops compounding. Quality stops sliding three hours in.
You don't configure it. It's always on, and the offloaded output is gone in 24 hours.
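The offloading step can be sketched like this. Two simplifications to flag: a plain dictionary stands in for the vector store, and "8 KB" is assumed to mean 8 × 1024 bytes of UTF-8. The stub format and class names are invented for the example.

```python
import hashlib

THRESHOLD_BYTES = 8 * 1024  # assumed reading of "8 KB"

class SessionStore:
    """Dict-backed stand-in for a per-session store (not a real vector store)."""

    def __init__(self):
        self._blobs = {}

    def offload(self, text):
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()[:12]
        self._blobs[key] = text
        return key

    def retrieve(self, key):
        return self._blobs[key]

def compact(tool_output, store):
    """Pass small outputs through; replace large ones with a short stub."""
    size = len(tool_output.encode("utf-8"))
    if size <= THRESHOLD_BYTES:
        return tool_output
    key = store.offload(tool_output)
    return f"[offloaded {size} bytes, id={key}]"

store = SessionStore()
small = compact("ok", store)
big_log = "x" * 10_000
stub = compact(big_log, store)
print(small, stub)
```

The agent's context only ever holds the stub, and the full output can be fetched again by id when it is needed.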
Every tool call from every session is indexed. ko audit search "prod_secrets" answers "did the agent really not touch that file last Tuesday" in one query. Retention defaults to 90 days. Bump it to 7 years per scope when compliance asks.
Other CLIs give you a transcript. This is the answer to the first enterprise question about agentic dev.
Tag sessions with whatever axes you actually care about. Project, release, customer, severity. Settings cascade per scope, so prod gets long retention and a sensitive customer gets --scope-widening-locked=true, a server-enforced wall the agent can't talk its way around.
Searches respect scope filters too. No more grepping all your projects to find one staging incident.
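One way to picture settings that cascade per scope: start from defaults and let each matching scope override them. The setting keys (`retention_days`, `scope_widening_locked`), the tag values and the merge order are assumptions for this sketch, not Knockout's configuration format.

```python
# Defaults mirror the 90-day retention mentioned above; key names are hypothetical.
DEFAULTS = {"retention_days": 90, "scope_widening_locked": False}

# Hypothetical per-scope overrides, keyed by (axis, value) tags.
SCOPE_SETTINGS = {
    ("project", "prod"): {"retention_days": 7 * 365},
    ("customer", "acme"): {"scope_widening_locked": True},
}

def resolve(tags):
    """Merge defaults with every matching scope; later tags override earlier ones."""
    settings = dict(DEFAULTS)
    for tag in tags:
        settings.update(SCOPE_SETTINGS.get(tag, {}))
    return settings

prod_acme = resolve([("project", "prod"), ("customer", "acme")])
staging = resolve([("project", "staging")])
print(prod_acme, staging)
```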
One model, one approach. Hope it works.
19 providers, dozens of models. Assign one per step.
Black-box agent. You don't know what it's doing.
Visual pipeline. Every step, every model, every transition rule.
Pay $20-200/mo for tools you use 3 times a week.
Pay per token. A full feature usually lands under $0.50. $0 when idle.
Your code goes through their servers. Just trust them.
Encrypted with a password we never see. The server stores opaque blobs.
Works in one IDE. Locked in.
CLI in any terminal, VS Code, JetBrains, Cursor, Windsurf, any MCP client.
Memory resets every session.
Memory persists across sessions and syncs across your devices.
A wizard asks you 5 to 10 questions about your business and audience, then builds a project that fits. Not rigid starters.
Bash, file I/O, web search, web fetch, git diffs, clipboard, Docker, SQL queries. Everything you'd reach for in a real terminal.
Move projects between machines. The server stores opaque blobs encrypted with a password we never see.
Global and project-level memory survives across sessions. Syncs across your laptop, desktop and phone, end-to-end encrypted.
Three skills ship pre-installed (Superpowers, Context7, Security Guidance). Skill Seekers turns any docs into a working skill in minutes.
CLI in any terminal, VS Code, JetBrains, Cursor, Windsurf, anything that speaks MCP. No vendor lock-in.
Interactive prompts for every action. Auto-allow the ops you've already approved. Full control over what the agent can touch.
Per-user permissions. Scoped API tokens. OAuth plus headless auth. No dangerously-skip-permissions flag.
OpenAI, Anthropic, Google, DeepSeek, Mistral, xAI, Cohere, Groq, Together, Fireworks. Bring your own keys or use pooled credits.
| Feature | Knockout | Claude Code | Cursor | Copilot |
|---|---|---|---|---|
| AI models | 19 providers, any model | Anthropic only | Limited | OpenAI only |
| You control the loop | Yes, visual pipeline editor | No | No | No |
| Multi-model per task | Yes: consensus, judge, chain, fallback | No | No | No |
| Vertical pipelines (in design) | Legal, finance, HR, research | No | No | No |
| Encrypted code storage | Yes, password we never see | No | No | No |
| Cross-machine sync | Yes, encrypted | No | No | No |
| Skills | 3 shipped, Skill Seekers builds more | Plugins | Limited | Extensions |
| Pricing | Pay per token | $20/mo | $20/mo | $10-19/mo |
| Editor support | CLI + VS Code + any MCP client | CLI | Cursor only | VS Code, JetBrains |
Pay per token. You see the cost of every call before it runs and the running total in the status bar.
The economy pipeline routes everything through budget models. The consensus pipeline routes three frontier models in parallel. Pick the pipeline that matches what the work actually needs.
Does my code go through your servers?
The agent runs in your terminal, on your filesystem. The vault password lives only on your machine and the platform stores opaque encrypted blobs. We never see your code or your provider keys.
What if my favorite model isn't supported?
We wire 19 providers today, including OpenAI, Anthropic, Google, Mistral, Groq, Fireworks, Together, Cerebras, OpenRouter and Ollama for local models. The OpenAI-compatible adapter usually gets you to the rest. Tell us if you hit one that doesn't.
What does a real session cost?
A quick fix usually runs under five cents. A full feature usually runs under a dollar. You see the running total in the status bar and the projection before any expensive call. Set a hard cap per pipeline step.
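A hard cap per step boils down to checking the projected cost before the call runs. A minimal sketch, with amounts in whole cents to avoid float rounding; the class and method names are hypothetical.

```python
class BudgetExceeded(Exception):
    """Raised when a projected call would push a step past its cap."""

class StepBudget:
    def __init__(self, cap_cents):
        self.cap_cents = cap_cents
        self.spent_cents = 0

    def approve(self, projected_cents):
        """Check the projection before the call runs; record it if allowed."""
        if self.spent_cents + projected_cents > self.cap_cents:
            raise BudgetExceeded(
                f"projected {projected_cents}c would exceed cap of {self.cap_cents}c"
            )
        self.spent_cents += projected_cents
        return self.cap_cents - self.spent_cents

budget = StepBudget(cap_cents=50)
remaining = budget.approve(30)  # allowed, 20 cents left

try:
    budget.approve(25)          # 30 + 25 > 50, so this call is refused
    blocked = False
except BudgetExceeded:
    blocked = True

print(remaining, blocked)
```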
Is there a subscription?
No subscription. Top up with Stripe and pay raw provider cost, or bring your own keys and pay the providers directly. We never charge you when ko is idle.
You design the pipeline. The agent runs it.
Your pipeline, your models, your tools and your rules. Whichever editor you already use.
No subscription, no credit card on signup. You pay only for what your models actually burn.