Swarmz

AI Agent Overview

How the agent reads your codebase, plans changes, edits files, and verifies its own work

The AI agent is the most important surface in Swarmz. Every other feature — Cloud, Git sync, deployments — exists so the agent can do its job. This page explains the mental model and the moving parts. Read it before you read anything else in this section.

The mental model

You can think of the agent as a junior engineer who joined your team this morning. It already knows the framework you're using, but it doesn't know your code yet. So before it touches anything, it does what a careful engineer would do.

Read the codebase

The agent starts by orienting itself. It pulls up a project map, looks at the file you're currently viewing, and greps for any symbols you mentioned. This happens automatically — you don't tell it to.

Plan

For anything non-trivial, it sketches an approach: which files to create, which to edit, which packages to install. On big changes, it shows you the plan first. See Plan Mode.

Write code

It writes new files or edits existing ones. Every write triggers Vite HMR — the preview iframe updates in milliseconds.

Verify

After changes, it can read console logs, capture a screenshot of the preview, and check the dev server. If something broke, it loops back to the Plan step and fixes it.

Iterate

If the verify step turned up a problem, the agent doesn't stop: it diagnoses, edits, and re-verifies. It gets up to 40 internal turns per request before it gives up and reports back.
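The five steps above can be sketched as a bounded loop. This is a minimal illustration, not the agent's actual implementation: the `AgentSteps` interface, `runAgent`, and the outcome shape are hypothetical names, and the sketch counts one plan/write/verify pass as a turn, whereas the real limit counts tool-call iterations.

```typescript
// Hypothetical types: the agent's real internals are not documented on this page.
type VerifyResult = { ok: boolean; problem?: string };

interface AgentSteps {
  plan(request: string, problem?: string): string[]; // returns files to touch
  write(files: string[]): void;
  verify(): VerifyResult;
}

interface RunOutcome {
  done: boolean;
  turns: number;
  problem?: string;
}

const MAX_TURNS = 40; // default turn cap described on this page

function runAgent(
  request: string,
  steps: AgentSteps,
  maxTurns: number = MAX_TURNS
): RunOutcome {
  let problem: string | undefined;
  for (let turn = 1; turn <= maxTurns; turn++) {
    const files = steps.plan(request, problem);
    steps.write(files);
    const result = steps.verify();
    if (result.ok) return { done: true, turns: turn };
    problem = result.problem; // feed the failure into the next pass
  }
  // Cap reached: give up and report the last known problem.
  return { done: false, turns: maxTurns, problem };
}
```

The cap is what guarantees the loop ends: a verify step that never passes yields `done: false` after `maxTurns` passes instead of running forever.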

The streaming loop

Under the hood, your prompt becomes an HTTP POST to the ai-chat-v3 edge function (/functions/v1/ai-chat-v3) with Accept: text/event-stream. The response is a long-lived SSE stream: the connection stays open while the agent works, and each event is a structured JSON message.

POST /functions/v1/ai-chat-v3
Content-Type: application/json
Accept: text/event-stream

{ "projectId": "...", "message": "add a login form", "viewingPath": "/" }
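As a rough sketch, the request above could be assembled in TypeScript like this. `buildChatRequest` and its types are hypothetical helpers, and any authentication headers the platform requires are omitted because this page does not describe them.

```typescript
// Body fields taken from the example request above.
interface ChatRequestBody {
  projectId: string;
  message: string;
  viewingPath: string;
}

interface ChatRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Hypothetical helper: returns the pieces you would hand to fetch(url, init).
function buildChatRequest(body: ChatRequestBody): ChatRequest {
  return {
    url: "/functions/v1/ai-chat-v3",
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Accept: "text/event-stream", // ask for an SSE response
      },
      body: JSON.stringify(body),
    },
  };
}
```

Calling `fetch(request.url, request.init)` with the result would open the stream. The response body then has to be read incrementally rather than awaited as a single JSON payload.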

The events you'll see in the timeline:

| Event | Meaning |
| --- | --- |
| status | Coarse progress: "Initializing...", "Waking workspace..." |
| tool_start | Agent called a tool (file read, grep, write) |
| tool_result | Tool returned, with duration and a result summary |
| file_write | A file was created or modified; the frontend triggers a tree refresh |
| text_delta | Streaming text token from the model |
| thinking_v5 | Pre-tool reasoning ("I need to check Settings.tsx first") |
| checklist / checklist_update | Plan steps and live progress |
| snapshot_created | Auto-snapshot before destructive edits |
| compact_notice | Context got too big, so older turns were summarized |
| usage / run_metrics | Final tokens, cost, model, duration |
| error | Recoverable or fatal; the frontend renders accordingly |
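To make the stream format concrete, here is a minimal sketch of an incremental SSE parser. It assumes each frame carries one JSON object on its `data:` lines with a `type` field naming the event. That wire shape and the `SseParser` and `AgentEvent` names are assumptions, not documented behavior.

```typescript
// Assumed wire shape: one JSON object per SSE frame, with a `type` discriminator.
interface AgentEvent {
  type: string;
  [key: string]: unknown;
}

class SseParser {
  private buffer = "";

  // Feed a decoded text chunk; returns every complete event it finished.
  push(chunk: string): AgentEvent[] {
    // Normalize CRLF so frames can be split on a blank line.
    this.buffer = (this.buffer + chunk).replace(/\r\n/g, "\n");
    const frames = this.buffer.split("\n\n");
    this.buffer = frames.pop() ?? ""; // the last piece may be incomplete
    const events: AgentEvent[] = [];
    for (const frame of frames) {
      const data = frame
        .split("\n")
        .filter((line) => line.startsWith("data:"))
        .map((line) => line.slice(5).replace(/^ /, "")) // drop one optional space
        .join("\n");
      if (data === "") continue; // comment or keep-alive frame
      events.push(JSON.parse(data) as AgentEvent);
    }
    return events;
  }
}
```

Buffering matters because network chunks do not line up with event boundaries: a single event can arrive split across two reads. A production parser would also want to catch `JSON.parse` failures instead of letting them throw.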

The frontend hook useChatStreamV2 parses these events and updates the chat panel, file tree, and preview iframe in real time. If the connection stalls for 180 seconds, a watchdog cancels the request and shows a retry button.
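The stall watchdog can be sketched as a small class that tracks when the last event arrived. This is an illustration of the idea only: `StallWatchdog` is a hypothetical name and does not reflect how useChatStreamV2 is actually written. Timestamps are passed in explicitly so the logic stays deterministic.

```typescript
const WATCHDOG_MS = 180 * 1000; // 180-second stall window described above

class StallWatchdog {
  private lastEventAt: number;
  private readonly timeoutMs: number;

  constructor(startedAt: number, timeoutMs: number = WATCHDOG_MS) {
    this.lastEventAt = startedAt;
    this.timeoutMs = timeoutMs;
  }

  // Call whenever any event arrives on the stream.
  recordEvent(now: number): void {
    this.lastEventAt = now;
  }

  // True once no event has arrived for the full timeout window.
  isStalled(now: number): boolean {
    return now - this.lastEventAt >= this.timeoutMs;
  }
}
```

In a browser, one option is to poll `isStalled(Date.now())` on an interval and, when it returns true, call `abort()` on the `AbortController` whose signal was passed to `fetch`, then render the retry button.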

Multi-model routing

You don't pick the model. The pipeline picks the right one for each phase, optimizing for latency and cost:

| Phase | Model | Why |
| --- | --- | --- |
| Plan | Claude Opus | High reasoning for ambiguous, multi-file changes |
| Implement | Claude Sonnet | Fast and accurate at code generation; does the bulk of the work |
| Search / scout | Claude Haiku | Cheap and quick for grep + file reads during orientation |
| Summarize | Claude Haiku | Compresses old turns when context fills up |

Per-stage configuration lives in the model_stage_config table and can be swapped by an admin without redeploying. Most prompts only use Sonnet — the other models kick in for plan mode, scout passes, and compaction.
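One way to picture per-stage routing is a default table that admin-supplied rows can override. The sketch below is an assumption-heavy illustration: the stage keys, the `StageConfigRow` column names, and the model strings are placeholders, not the real model_stage_config schema or actual model identifiers.

```typescript
type Stage = "plan" | "implement" | "scout" | "summarize";

// Placeholder model names mirroring the routing table above.
const DEFAULT_MODELS: Record<Stage, string> = {
  plan: "claude-opus",
  implement: "claude-sonnet",
  scout: "claude-haiku",
  summarize: "claude-haiku",
};

// Hypothetical shape of a row read from model_stage_config.
interface StageConfigRow {
  stage: Stage;
  model: string;
}

// An override row wins; otherwise fall back to the default for that stage.
function resolveModel(stage: Stage, overrides: StageConfigRow[] = []): string {
  const row = overrides.find((r) => r.stage === stage);
  return row ? row.model : DEFAULT_MODELS[stage];
}
```

Because the lookup happens at request time, changing a row takes effect on the next prompt, which is consistent with swapping models without a redeploy.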

Knowledge presets and skills

The agent has two ways to load domain context beyond your codebase:

  • Knowledge presets are admin-curated topic packs (e.g. Stripe checkout, Supabase auth, shadcn theming). The agent calls knowledge_search and knowledge_get to pull just the relevant snippets. You attach presets per project from the editor's right rail.
  • Skills are user-configurable behavior toggles. They modify the system prompt — for example, a pixel-perfect-tailwind skill nudges the agent toward exact spacing values instead of approximations.

Both are loaded once per request and cached for 5 minutes. They never bloat the prompt unless the agent actively pulls them in.
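The five-minute caching described above behaves like a time-to-live cache. The following is a generic sketch of that pattern, with `TtlCache` as a hypothetical name. Where the platform actually stores cached presets and skills is not documented here.

```typescript
const CACHE_TTL_MS = 5 * 60 * 1000; // five minutes

class TtlCache<T> {
  private entries = new Map<string, { value: T; expiresAt: number }>();
  private readonly ttlMs: number;

  constructor(ttlMs: number = CACHE_TTL_MS) {
    this.ttlMs = ttlMs;
  }

  // Returns the cached value, or undefined if it is missing or expired.
  get(key: string, now: number): T | undefined {
    const hit = this.entries.get(key);
    if (hit === undefined) return undefined;
    if (now >= hit.expiresAt) {
      this.entries.delete(key); // evict stale entries lazily
      return undefined;
    }
    return hit.value;
  }

  set(key: string, value: T, now: number): void {
    this.entries.set(key, { value, expiresAt: now + this.ttlMs });
  }
}
```

With this pattern, a change to a preset or skill could take up to five minutes to reach requests that hit a warm cache entry.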

Limits

The agent runs inside hard guardrails so it can't loop forever or burn through your credits:

| Limit | Default | What it caps |
| --- | --- | --- |
| MAX_TURNS | 40 | Total tool-call iterations per prompt |
| MAX_TOTAL_INPUT | 200K tokens | Sum of all input tokens before force-finish |
| MAX_FILE_READ_LINES | 300 | Lines per read_file call |
| MAX_GREP_RESULTS | 15 | Results per grep_search |
| Watchdog | 180s | Frontend cancels if no events arrive |
| Tool result cap | ~50K chars | Max content the agent sees per tool call |

When a budget is hit, the agent gets a system message — "Context budget reached. Call done immediately." — and wraps up with whatever it's done so far. You'll see a compact_notice event if it had to summarize older turns to keep going.

Limits are stored in the platform_limits table and can be tuned live via the admin panel. The values above are defaults — your workspace may have different ceilings.
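A minimal sketch of how such guardrails might be enforced, using the defaults from the table. The `Limits` shape and the function names are hypothetical, the tool result cap is taken as exactly 50,000 characters for illustration, and the sketch assumes the same system message is used for both the turn and token budgets.

```typescript
interface Limits {
  maxTurns: number;
  maxTotalInput: number; // tokens
  maxFileReadLines: number;
  toolResultCapChars: number;
}

const DEFAULT_LIMITS: Limits = {
  maxTurns: 40,
  maxTotalInput: 200000,
  maxFileReadLines: 300,
  toolResultCapChars: 50000, // the page gives this as approximately 50K
};

// Returns the wrap-up system message once a budget is exhausted, else null.
function budgetMessage(
  turns: number,
  inputTokens: number,
  limits: Limits = DEFAULT_LIMITS
): string | null {
  if (turns >= limits.maxTurns || inputTokens >= limits.maxTotalInput) {
    return "Context budget reached. Call done immediately.";
  }
  return null;
}

// Keeps only the first N lines of a file read.
function capFileRead(lines: string[], limits: Limits = DEFAULT_LIMITS): string[] {
  return lines.slice(0, limits.maxFileReadLines);
}

// Truncates a tool result to the character cap.
function capToolResult(text: string, limits: Limits = DEFAULT_LIMITS): string {
  return text.slice(0, limits.toolResultCapChars);
}
```

Passing `limits` as a parameter rather than reading constants directly mirrors the idea that the values can be tuned per workspace.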

Credit costs per action

Every prompt deducts credits based on actual token usage rather than a flat per-action fee. Typical costs by action type:

| Action | Typical credits |
| --- | --- |
| Simple edit (one file, small change) | 1 |
| Feature implementation (2-5 files) | 3 |
| Complex change (refactor, multiple components) | 5 |
| Spawning a sub-agent | 2 |
| Web search | 1 |
| Plan generation | 2 |
| New project scaffold | 10 |

Deductions drain a 4-pool cascade: daily_bonus → included_credits (your plan) → rollover_credits (prior cycle, expires after the rollover window) → topup_credits. See Credits for the full breakdown.
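The cascade can be illustrated with a short function that drains each pool in order until the cost is covered. This is a sketch under assumptions: `deduct` and the `Balances` shape are hypothetical, and the real system may handle insufficient balances differently from the `unpaid` remainder shown here.

```typescript
type Pool = "daily_bonus" | "included_credits" | "rollover_credits" | "topup_credits";

// Drain order from the cascade described above.
const CASCADE: Pool[] = [
  "daily_bonus",
  "included_credits",
  "rollover_credits",
  "topup_credits",
];

type Balances = Record<Pool, number>;

// Returns new balances plus any portion of the cost no pool could cover.
function deduct(balances: Balances, cost: number): { balances: Balances; unpaid: number } {
  const next: Balances = { ...balances };
  let remaining = cost;
  for (const pool of CASCADE) {
    const taken = Math.min(next[pool], remaining);
    next[pool] -= taken;
    remaining -= taken;
    if (remaining === 0) break;
  }
  return { balances: next, unpaid: remaining };
}
```

For example, a 5-credit charge against 1 daily bonus credit, 2 included credits, no rollover, and 10 top-up credits would empty the first two pools and take the last 2 credits from top-up, leaving 8.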

