AI Agent Overview
How the agent reads your codebase, plans changes, edits files, and verifies its own work
The AI agent is the most important surface in Swarmz. Every other feature — Cloud, Git sync, deployments — exists so the agent can do its job. This page explains the mental model and the moving parts. Read it before you read anything else in this section.
The mental model
You can think of the agent as a junior engineer who joined your team this morning. It already knows the framework you're using, but it doesn't know your code yet. So before it touches anything, it does what a careful engineer would do.
Read the codebase
The agent starts by orienting itself. It pulls up a project map, looks at the file you're currently viewing, and greps for any symbols you mentioned. This happens automatically — you don't tell it to.
Plan
For anything non-trivial, it sketches an approach: which files to create, which to edit, what packages to install. On big changes, it shows you the plan first. See Plan Mode.
Write code
It writes new files or edits existing ones. Every write triggers Vite HMR — the preview iframe updates in milliseconds.
Verify
After changes, it can read console logs, capture a screenshot of the preview, and check the dev server. If something broke, it loops back to the Plan step and fixes it.
Iterate
If the verify step turned up a problem, the agent doesn't stop — it diagnoses, edits, and re-verifies. Up to 40 internal turns per request before it gives up and reports back.
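The five steps above can be sketched as a bounded loop. This is an illustration only, not the agent's actual implementation: `runAgent`, `AgentSteps`, and the callbacks are hypothetical names, and the sketch counts one plan/write/verify pass as a turn, whereas the real cap counts tool-call iterations. Only the 40-turn ceiling comes from this page.

```typescript
// Hypothetical sketch of the read, plan, write, verify, iterate cycle.
// None of these names are Swarmz APIs; only the 40-turn cap is documented.
type VerifyResult = { ok: boolean; problem?: string };

interface AgentSteps {
  read(): void;                      // orient: project map, open file, grep
  plan(problem?: string): string[];  // decide which edits to make
  write(edits: string[]): void;      // create or modify files
  verify(): VerifyResult;            // console logs, screenshot, dev server
}

const MAX_TURNS = 40;

function runAgent(steps: AgentSteps, maxTurns: number = MAX_TURNS) {
  steps.read();
  let problem: string | undefined;
  for (let turn = 1; turn <= maxTurns; turn++) {
    const edits = steps.plan(problem);
    steps.write(edits);
    const result = steps.verify();
    if (result.ok) return { status: "done" as const, turns: turn };
    problem = result.problem; // feed the failure into the next plan
  }
  // Budget exhausted: report back instead of looping forever.
  return { status: "gave_up" as const, turns: maxTurns, problem };
}
```

The design point is that a failed verify never ends the run on its own; only a passing check or the turn budget does.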
The streaming loop
Under the hood, your prompt becomes an HTTP POST to the ai-chat-v3 edge function (/functions/v1/ai-chat-v3) with Accept: text/event-stream. The response is a long-lived SSE stream: the connection stays open while the agent works, and each event is a structured JSON message.
POST /functions/v1/ai-chat-v3
Content-Type: application/json
Accept: text/event-stream
{ "projectId": "...", "message": "add a login form", "viewingPath": "/" }

The events you'll see in the timeline:
| Event | Meaning |
|---|---|
| `status` | Coarse progress: "Initializing...", "Waking workspace..." |
| `tool_start` | Agent called a tool: file read, grep, write |
| `tool_result` | Tool returned, with duration and a result summary |
| `file_write` | A file was created or modified; the frontend triggers a tree refresh |
| `text_delta` | Streaming text token from the model |
| `thinking_v5` | Pre-tool reasoning ("I need to check Settings.tsx first") |
| `checklist` / `checklist_update` | Plan steps and live progress |
| `snapshot_created` | Auto-snapshot before destructive edits |
| `compact_notice` | Context got too big, so older turns were summarized |
| `usage` / `run_metrics` | Final tokens, cost, model, duration |
| `error` | Recoverable or fatal; the frontend renders accordingly |
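To make the event list concrete, here is a minimal sketch of turning raw SSE text into typed events. It assumes each frame names its type in an `event:` line and carries JSON in `data:`; the actual wire format and payload fields are not documented on this page, and `parseSseBuffer` is a hypothetical helper, not the real hook.

```typescript
// Sketch only. Event names come from the table above; the frame layout
// (`event:` line plus JSON `data:` line) is an assumption.
const KNOWN_EVENTS = [
  "status", "tool_start", "tool_result", "file_write", "text_delta",
  "thinking_v5", "checklist", "checklist_update", "snapshot_created",
  "compact_notice", "usage", "run_metrics", "error",
] as const;

type AgentEventName = (typeof KNOWN_EVENTS)[number];

interface AgentEvent {
  event: AgentEventName;
  data: unknown; // payload shape is undocumented here, so left untyped
}

// Parses every complete frame in `buffer` and returns the unparsed tail,
// so the caller can prepend it to the next network chunk.
function parseSseBuffer(buffer: string): { events: AgentEvent[]; rest: string } {
  const frames = buffer.split(/\r?\n\r?\n/);
  const rest = frames.pop() ?? ""; // last piece may be an incomplete frame
  const events: AgentEvent[] = [];
  for (const frame of frames) {
    let name = "message"; // SSE default when no `event:` line is present
    const dataLines: string[] = [];
    for (const line of frame.split(/\r?\n/)) {
      if (line.startsWith("event:")) name = line.slice(6).trim();
      else if (line.startsWith("data:")) dataLines.push(line.slice(5).replace(/^ /, ""));
    }
    if (dataLines.length === 0) continue; // comment or keep-alive frame
    if ((KNOWN_EVENTS as readonly string[]).indexOf(name) === -1) continue;
    try {
      const data: unknown = JSON.parse(dataLines.join("\n"));
      events.push({ event: name as AgentEventName, data });
    } catch {
      // Skip a malformed payload rather than aborting the whole stream.
    }
  }
  return { events, rest };
}
```

Returning the remainder matters because network chunks rarely line up with frame boundaries; a frame split across two chunks is parsed only once it is complete.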
The frontend hook useChatStreamV2 parses these events and updates the chat panel, file tree, and preview iframe in real time. If the connection stalls for 180 seconds, a watchdog cancels and shows a retry button.
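The stall watchdog can be pictured as a timestamp that every event refreshes. The class below is a hypothetical sketch with an injectable clock so it can be tested without waiting; only the 180-second window comes from this page, and the real hook may be built differently.

```typescript
// Hypothetical stall detector. The 180-second default is documented;
// the class name and methods are illustrative only.
const STALL_MS = 180 * 1000;

class StallWatchdog {
  private lastEventAt: number;
  private timeoutMs: number;

  constructor(timeoutMs: number = STALL_MS, now: number = Date.now()) {
    this.timeoutMs = timeoutMs;
    this.lastEventAt = now;
  }

  // Call whenever any event arrives on the stream.
  touch(now: number = Date.now()): void {
    this.lastEventAt = now;
  }

  // True once the quiet period has lasted the full timeout.
  isStalled(now: number = Date.now()): boolean {
    return now - this.lastEventAt >= this.timeoutMs;
  }
}
```

In a browser, a caller might poll `isStalled()` on an interval and, when it returns true, abort the request through an `AbortController` and render the retry button.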
Multi-model routing
You don't pick the model. The pipeline picks the right one for each phase, optimizing for latency and cost:
| Phase | Model | Why |
|---|---|---|
| Plan | Claude Opus | High reasoning for ambiguous, multi-file changes |
| Implement | Claude Sonnet | Fast and accurate at code generation — does the bulk of the work |
| Search / scout | Claude Haiku | Cheap and quick for grep + file reads during orientation |
| Summarize | Claude Haiku | Compresses old turns when context fills up |
Per-stage configuration lives in the model_stage_config table and can be swapped by an admin without redeploying. Most prompts only use Sonnet — the other models kick in for plan mode, scout passes, and compaction.
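As a rough sketch, per-stage routing amounts to a lookup with optional overrides. The stage keys, `modelForStage`, and the `overrides` argument (standing in for rows read from model_stage_config) are assumptions for illustration; the table's schema is not described here, and the strings are the family labels from the table above, not API model identifiers.

```typescript
// Illustrative only: stage keys and function names are hypothetical.
type Stage = "plan" | "implement" | "scout" | "summarize";

// Defaults mirror the routing table above (family labels, not model IDs).
const DEFAULT_MODELS: Record<Stage, string> = {
  plan: "Claude Opus",
  implement: "Claude Sonnet",
  scout: "Claude Haiku",
  summarize: "Claude Haiku",
};

// `overrides` stands in for admin-edited rows from model_stage_config.
function modelForStage(
  stage: Stage,
  overrides: Partial<Record<Stage, string>> = {},
): string {
  return overrides[stage] ?? DEFAULT_MODELS[stage];
}
```

Because the override is read at lookup time, swapping a stage's model needs no redeploy, which matches the behavior described above.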
Knowledge presets and skills
The agent has two ways to load domain context beyond your codebase:
- Knowledge presets are admin-curated topic packs (e.g. Stripe checkout, Supabase auth, shadcn theming). The agent calls `knowledge_search` and `knowledge_get` to pull just the relevant snippets. You attach presets per project from the editor's right rail.
- Skills are user-configurable behavior toggles. They modify the system prompt: for example, a `pixel-perfect-tailwind` skill nudges the agent toward exact spacing values instead of approximations.
Both are loaded once per request and cached for 5 minutes. They never bloat the prompt unless the agent actively pulls them in.
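The five-minute caching can be illustrated with a small time-to-live cache. This is a generic sketch, not the platform's code: `TtlCache` and `getOrLoad` are hypothetical names, and where the real cache lives (per project, per workspace, in memory or elsewhere) is not stated on this page.

```typescript
// Generic TTL cache sketch. Only the 5-minute lifetime is documented.
const TTL_MS = 5 * 60 * 1000;

class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;

  constructor(ttlMs: number = TTL_MS) {
    this.ttlMs = ttlMs;
  }

  // Returns a fresh cached value, or calls `load` and stores its result.
  getOrLoad(key: string, load: () => V, now: number = Date.now()): V {
    const hit = this.entries.get(key);
    if (hit && hit.expiresAt > now) return hit.value;
    const value = load();
    this.entries.set(key, { value, expiresAt: now + this.ttlMs });
    return value;
  }
}
```

With this shape, repeated requests inside the window reuse the loaded preset or skill, and the first request after expiry reloads it.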
Limits
The agent runs inside hard guardrails so it can't loop forever or burn through your credits:
| Limit | Default | What it caps |
|---|---|---|
| `MAX_TURNS` | 40 | Total tool-call iterations per prompt |
| `MAX_TOTAL_INPUT` | 200K tokens | Sum of all input tokens before force-finish |
| `MAX_FILE_READ_LINES` | 300 | Lines per read_file call |
| `MAX_GREP_RESULTS` | 15 | Results per grep_search |
| Watchdog | 180s | Frontend cancels if no events arrive |
| Tool result cap | ~50K chars | Max content the agent sees per tool call |
When a budget is hit, the agent gets a system message — "Context budget reached. Call done immediately." — and wraps up with whatever it's done so far. You'll see a compact_notice event if it had to summarize older turns to keep going.
Limits are stored in the platform_limits table and can be tuned live via the admin panel. The values above are defaults — your workspace may have different ceilings.
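A hedged sketch of how such guardrails might be enforced follows. The function names are hypothetical, "200K" and "~50K" are read as 200000 and 50000, and treating a budget as hit at "greater than or equal" is an assumption; the defaults and the system message text are taken from this section.

```typescript
// Illustrative guardrail checks. Names and exact thresholds are assumptions.
interface Limits {
  maxTurns: number;
  maxTotalInput: number;    // tokens
  maxFileReadLines: number;
  maxGrepResults: number;
  toolResultCap: number;    // characters, approximate in the docs
}

const DEFAULT_LIMITS: Limits = {
  maxTurns: 40,
  maxTotalInput: 200000,
  maxFileReadLines: 300,
  maxGrepResults: 15,
  toolResultCap: 50000,
};

const BUDGET_MESSAGE = "Context budget reached. Call done immediately.";

// Returns the system message to inject once a budget is hit, else null.
function checkBudget(
  turns: number,
  inputTokens: number,
  limits: Limits = DEFAULT_LIMITS,
): string | null {
  const hit = turns >= limits.maxTurns || inputTokens >= limits.maxTotalInput;
  return hit ? BUDGET_MESSAGE : null;
}

// Truncates a tool result so the model never sees more than the cap.
function capToolResult(text: string, limits: Limits = DEFAULT_LIMITS): string {
  return text.length > limits.toolResultCap
    ? text.slice(0, limits.toolResultCap)
    : text;
}

// Keeps only the first N lines of a file read.
function capFileRead(content: string, limits: Limits = DEFAULT_LIMITS): string {
  return content.split("\n").slice(0, limits.maxFileReadLines).join("\n");
}
```

Passing `limits` as a parameter mirrors the idea that ceilings are data, not constants, so a workspace with tuned values would supply its own object.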
Credit costs per action
Every prompt deducts credits based on actual token usage rather than a flat per-action fee. Typical costs by action type:
| Action | Typical credits |
|---|---|
| Simple edit (one file, small change) | 1 |
| Feature implementation (2-5 files) | 3 |
| Complex change (refactor, multiple components) | 5 |
| Spawning a sub-agent | 2 |
| Web search | 1 |
| Plan generation | 2 |
| New project scaffold | 10 |
Deductions drain a 4-pool cascade: daily_bonus → included_credits (your plan) → rollover_credits (prior cycle, expires after rollover window) → topup_credits. See Credits for the full breakdown.
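The cascade can be worked through with a small example. The sketch below drains the pools in the order listed above; `deductCredits` and the balance shape are hypothetical, and how the real system handles a shortfall is not described here, so the sketch simply reports it.

```typescript
// Illustrative cascade. Pool names and order come from the text above;
// the function and its return shape are assumptions.
type Pool = "daily_bonus" | "included_credits" | "rollover_credits" | "topup_credits";
type Balances = Record<Pool, number>;

const DRAIN_ORDER: Pool[] = [
  "daily_bonus",
  "included_credits",
  "rollover_credits",
  "topup_credits",
];

function deductCredits(balances: Balances, cost: number) {
  const next: Balances = { ...balances }; // never mutate the caller's object
  const taken: Balances = {
    daily_bonus: 0,
    included_credits: 0,
    rollover_credits: 0,
    topup_credits: 0,
  };
  let remaining = cost;
  for (const pool of DRAIN_ORDER) {
    const take = Math.min(next[pool], remaining);
    next[pool] -= take;
    taken[pool] = take;
    remaining -= take;
  }
  // A positive shortfall means all four pools ran dry before the cost was met.
  return { balances: next, taken, shortfall: remaining };
}
```

For example, a 10-credit scaffold against balances of 2 daily, 5 included, 4 rollover, and 10 top-up would take 2, 5, and 3 from the first three pools and leave the top-up pool untouched.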
Where to next
- Prompting Tips — how to actually get good output
- Plan Mode — review changes before they happen
- File Operations — every tool the agent can invoke