AI Agent Overview

How the agent reads your codebase, plans changes, edits files, and verifies its own work

The AI agent is the most important surface in Swarmz. Every other feature — Cloud, Git sync, deployments — exists so the agent can do its job. This page explains the mental model and the moving parts. Read it before you read anything else in this section.

The mental model

You can think of the agent as a junior engineer who joined your team this morning. It already knows the framework you're using, but it doesn't know your code yet. So before it touches anything, it does what a careful engineer would do.

Read the codebase

The agent starts by orienting itself. It pulls up a project map, looks at the file you're currently viewing, and greps for any symbols you mentioned. This happens automatically — you don't tell it to.

Plan

For anything non-trivial, it sketches an approach: which files to create, which to edit, what packages to install. On big changes, it shows you the plan first. See Plan Mode.

Write code

It writes new files or edits existing ones. Every write triggers Vite HMR — the preview iframe updates in milliseconds.

Verify

After changes, it can read console logs, capture a screenshot of the preview, and check the dev server. If something broke, it loops back to the Plan step and fixes it.

Iterate

If the verify step turned up a problem, the agent doesn't stop: it diagnoses, edits, and re-verifies. It runs up to 40 internal turns per request before it gives up and reports back.
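The edit, verify, and iterate steps can be pictured as a bounded loop. The sketch below is a simplification under stated assumptions: `AgentSteps`, `VerifyResult`, and `runRequest` are hypothetical names, and one edit-plus-verify pass is counted as a single turn, whereas the real agent counts tool-call iterations.

```typescript
// Hypothetical shapes: the agent's real internals are not documented on this page.
type VerifyResult = { ok: boolean; problem?: string };

interface AgentSteps {
  edit: (problem: string | undefined) => void; // write new files or fix existing ones
  verify: () => VerifyResult; // e.g. console logs, screenshot, dev server check
}

const MAX_TURNS = 40; // default from the Limits table

function runRequest(
  steps: AgentSteps,
  maxTurns: number = MAX_TURNS
): { turns: number; ok: boolean } {
  let problem: string | undefined;
  for (let turn = 1; turn <= maxTurns; turn++) {
    steps.edit(problem);
    const result = steps.verify();
    if (result.ok) {
      return { turns: turn, ok: true };
    }
    problem = result.problem; // feed the diagnosis into the next edit
  }
  // Budget exhausted: stop and report back with whatever was done.
  return { turns: maxTurns, ok: false };
}
```

The cap is what keeps a stubborn failure from looping forever: once `maxTurns` is reached, the function returns with `ok: false` instead of retrying.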

The streaming loop

Under the hood, your prompt becomes an HTTP POST to the ai-chat-v3 edge function (/functions/v1/ai-chat-v3) with Accept: text/event-stream. The response is a long-lived SSE stream: the connection stays open while the agent works, and each event is a structured JSON message.

POST /functions/v1/ai-chat-v3
Content-Type: application/json
Accept: text/event-stream

{ "projectId": "...", "message": "add a login form", "viewingPath": "/" }
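As a rough illustration, the request above could be assembled in TypeScript like this. `buildChatRequest` and `baseUrl` are hypothetical; the page does not state the host or which authentication headers are required, so none are shown.

```typescript
interface ChatRequestBody {
  projectId: string;
  message: string;
  viewingPath: string;
}

interface ChatRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// baseUrl is a placeholder for the host serving the edge function (an assumption).
function buildChatRequest(baseUrl: string, body: ChatRequestBody): ChatRequest {
  const trimmed = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return {
    url: `${trimmed}/functions/v1/ai-chat-v3`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Accept: "text/event-stream", // ask for an SSE response
      },
      body: JSON.stringify(body),
    },
  };
}
```

The result can be passed to `fetch(req.url, req.init)`, after which the response body is read as a stream rather than awaited as a single JSON document.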

The events you'll see in the timeline:

Event	Meaning
`status`	Coarse progress: "Initializing...", "Waking workspace..."
`tool_start`	Agent called a tool — file read, grep, write
`tool_result`	Tool returned — duration and a result summary
`file_write`	A file was created or modified — frontend triggers a tree refresh
`text_delta`	Streaming text token from the model
`thinking_v5`	Pre-tool reasoning ("I need to check Settings.tsx first")
`checklist` / `checklist_update`	Plan steps and live progress
`snapshot_created`	Auto-snapshot before destructive edits
`compact_notice`	Context got too big — older turns were summarized
`usage` / `run_metrics`	Final tokens, cost, model, duration
`error`	Recoverable or fatal — frontend renders accordingly
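A minimal sketch of how a client might split the stream into typed events. It assumes each SSE frame carries one JSON object in its `data:` field and that the object names its event in a `type` property; that field name is an assumption, since the page does not show the payload shape. It also assumes `\n` line endings.

```typescript
// Event names from the table above.
const EVENT_TYPES = [
  "status", "tool_start", "tool_result", "file_write", "text_delta",
  "thinking_v5", "checklist", "checklist_update", "snapshot_created",
  "compact_notice", "usage", "run_metrics", "error",
] as const;

type AgentEventType = (typeof EVENT_TYPES)[number];

// Hypothetical payload shape: only `type` is assumed here.
interface AgentEvent {
  type: AgentEventType;
  [key: string]: unknown;
}

function isAgentEvent(value: unknown): value is AgentEvent {
  if (typeof value !== "object" || value === null) return false;
  const type = (value as { type?: unknown }).type;
  return typeof type === "string" && (EVENT_TYPES as readonly string[]).indexOf(type) !== -1;
}

// Parses every complete frame in `buffer` and returns the unfinished tail,
// which the caller prepends to the next network chunk.
function parseSseBuffer(buffer: string): { events: AgentEvent[]; rest: string } {
  const frames = buffer.split("\n\n");
  const rest = frames.pop() ?? "";
  const events: AgentEvent[] = [];
  for (const frame of frames) {
    const data = frame
      .split("\n")
      .filter((line) => line.startsWith("data:"))
      .map((line) => line.slice(5).replace(/^ /, "")) // strip "data:" and one optional space
      .join("\n");
    if (data === "") continue;
    let parsed: unknown;
    try {
      parsed = JSON.parse(data);
    } catch {
      continue; // skip malformed frames rather than aborting the stream
    }
    if (isAgentEvent(parsed)) events.push(parsed);
  }
  return { events, rest };
}
```

Returning the incomplete tail matters because network chunks do not align with event boundaries; a frame cut in half is held back until the rest arrives.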

The frontend hook useChatStreamV2 parses these events and updates the chat panel, file tree, and preview iframe in real time. If no events arrive for 180 seconds, a watchdog cancels the request and shows a retry button.
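The stall watchdog can be sketched as a timestamp that is refreshed on every event and compared against the 180-second window. `createWatchdog` is a hypothetical helper, not the hook's actual implementation; the clock is passed in explicitly so the logic is easy to test.

```typescript
const WATCHDOG_MS = 180 * 1000; // 180 seconds, per the text above

function createWatchdog(startedAt: number, timeoutMs: number = WATCHDOG_MS) {
  let lastEventAt = startedAt;
  return {
    // Call whenever any event arrives on the stream.
    touch(now: number): void {
      lastEventAt = now;
    },
    // True once the quiet period has reached the timeout.
    isStalled(now: number): boolean {
      return now - lastEventAt >= timeoutMs;
    },
  };
}
```

In a browser, a caller would typically pass `Date.now()` for `now`, poll `isStalled` on an interval, and abort the request through an `AbortController` when it returns true.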

Multi-model routing

You don't pick the model. The pipeline picks the right one for each phase, optimizing for latency and cost:

Phase	Model	Why
Plan	Claude Opus	High reasoning for ambiguous, multi-file changes
Implement	Claude Sonnet	Fast and accurate at code generation — does the bulk of the work
Search / scout	Claude Haiku	Cheap and quick for grep + file reads during orientation
Summarize	Claude Haiku	Compresses old turns when context fills up

Per-stage configuration lives in the model_stage_config table and can be swapped by an admin without redeploying. Most prompts only use Sonnet — the other models kick in for plan mode, scout passes, and compaction.
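Conceptually, routing is a lookup from stage to model with admin overrides layered on top. The sketch below is illustrative only: the stage keys, `resolveModel`, and the `overrides` map are assumptions standing in for rows of model_stage_config, whose columns are not documented here.

```typescript
type Stage = "plan" | "implement" | "scout" | "summarize";

// Defaults taken from the table above.
const DEFAULT_STAGE_MODELS: Record<Stage, string> = {
  plan: "Claude Opus",
  implement: "Claude Sonnet",
  scout: "Claude Haiku",
  summarize: "Claude Haiku",
};

// `overrides` stands in for whatever an admin has configured at runtime.
function resolveModel(
  stage: Stage,
  overrides: Partial<Record<Stage, string>> = {}
): string {
  return overrides[stage] ?? DEFAULT_STAGE_MODELS[stage];
}
```

Because the override is consulted on every call, changing the configuration takes effect without a redeploy, which matches the behavior described above.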

Knowledge presets and skills

The agent has two ways to load domain context beyond your codebase:

Knowledge presets are admin-curated topic packs (e.g. Stripe checkout, Supabase auth, shadcn theming). The agent calls knowledge_search and knowledge_get to pull just the relevant snippets. You attach presets per project from the editor's right rail.
Skills are user-configurable behavior toggles. They modify the system prompt — for example, a pixel-perfect-tailwind skill nudges the agent toward exact spacing values instead of approximations.

Both are loaded once per request and cached for 5 minutes. They never bloat the prompt unless the agent actively pulls them in.
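The five-minute caching can be pictured as a small time-to-live cache. This is a generic sketch, not the platform's code: `createTtlCache` and the `load` callback are hypothetical, and the clock is passed in so expiry is deterministic.

```typescript
const CACHE_TTL_MS = 5 * 60 * 1000; // 5 minutes, per the text above

function createTtlCache<T>(ttlMs: number = CACHE_TTL_MS) {
  const entries = new Map<string, { value: T; expiresAt: number }>();
  return {
    // Returns the cached value while it is fresh; otherwise reloads and re-caches.
    get(key: string, now: number, load: () => T): T {
      const hit = entries.get(key);
      if (hit !== undefined && now < hit.expiresAt) {
        return hit.value;
      }
      const value = load();
      entries.set(key, { value, expiresAt: now + ttlMs });
      return value;
    },
  };
}
```

A key such as a preset or skill identifier would be loaded on first use and then served from memory for later requests inside the window.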

Limits

The agent runs inside hard guardrails so it can't loop forever or burn through your credits:

Limit	Default	What it caps
`MAX_TURNS`	40	Total tool-call iterations per prompt
`MAX_TOTAL_INPUT`	200K tokens	Sum of all input tokens before force-finish
`MAX_FILE_READ_LINES`	300	Lines per `read_file` call
`MAX_GREP_RESULTS`	15	Results per `grep_search`
Watchdog	180s	Frontend cancels if no events arrive
Tool result cap	~50K chars	Max content the agent sees per tool call

When a budget is hit, the agent gets a system message — "Context budget reached. Call done immediately." — and wraps up with whatever it's done so far. You'll see a compact_notice event if it had to summarize older turns to keep going.

Limits are stored in the platform_limits table and can be tuned live via the admin panel. The values above are defaults — your workspace may have different ceilings.
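To make the caps concrete, here is a sketch of how two of them might be enforced. The field names, `capFileRead`, and `budgetExceeded` are hypothetical; 200K is taken to mean 200,000 tokens and the tool result cap is rounded to 50,000 characters, both of which are assumptions.

```typescript
interface PlatformLimits {
  maxTurns: number;
  maxTotalInputTokens: number;
  maxFileReadLines: number;
  maxGrepResults: number;
  toolResultCapChars: number;
}

// Defaults from the table above; a workspace may have different ceilings.
const DEFAULT_LIMITS: PlatformLimits = {
  maxTurns: 40,
  maxTotalInputTokens: 200000, // assumes "200K" means 200,000
  maxFileReadLines: 300,
  maxGrepResults: 15,
  toolResultCapChars: 50000, // the table gives this as approximate
};

// Keeps only the first maxFileReadLines lines of a file read.
function capFileRead(content: string, limits: PlatformLimits = DEFAULT_LIMITS): string {
  return content.split("\n").slice(0, limits.maxFileReadLines).join("\n");
}

// True once either the turn budget or the input-token budget is used up.
function budgetExceeded(
  turns: number,
  inputTokens: number,
  limits: PlatformLimits = DEFAULT_LIMITS
): boolean {
  return turns >= limits.maxTurns || inputTokens >= limits.maxTotalInputTokens;
}
```

Passing the limits as a parameter mirrors the idea that they are tunable at runtime rather than compiled in.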

Credit costs per action

Every prompt deducts credits based on actual token usage rather than a flat per-action fee. Typical costs by action type:

Action	Typical credits
Simple edit (one file, small change)	1
Feature implementation (2-5 files)	3
Complex change (refactor, multiple components)	5
Spawning a sub-agent	2
Web search	1
Plan generation	2
New project scaffold	10

Deductions drain four pools in order: daily_bonus → included_credits (your plan) → rollover_credits (prior cycle, expires after the rollover window) → topup_credits. See Credits for the full breakdown.
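The cascade amounts to draining each pool in order until the cost is covered. A minimal sketch, assuming whole-number credits; `deduct` and the `unpaid` remainder are hypothetical, and rollover expiry is not modeled.

```typescript
type Pool = "daily_bonus" | "included_credits" | "rollover_credits" | "topup_credits";

// Drain order from the text above.
const CASCADE: Pool[] = ["daily_bonus", "included_credits", "rollover_credits", "topup_credits"];

type Balances = Record<Pool, number>;

// Returns the new balances plus any part of the cost no pool could cover.
function deduct(balances: Balances, cost: number): { balances: Balances; unpaid: number } {
  const next: Balances = { ...balances }; // leave the caller's object untouched
  let remaining = cost;
  for (const pool of CASCADE) {
    const taken = Math.min(next[pool], remaining);
    next[pool] -= taken;
    remaining -= taken;
    if (remaining === 0) break;
  }
  return { balances: next, unpaid: remaining };
}
```

For example, a 5-credit prompt against balances of 1, 2, 5, and 10 empties the first two pools, takes 2 from rollover_credits, and leaves topup_credits untouched.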

Where to next

Prompting Tips — how to actually get good output
Plan Mode — review changes before they happen
File Operations — every tool the agent can invoke
