## Overview

The Blink Server acts as the control plane for AI agent deployments. Key architectural principles:

- Agents are HTTP servers deployed as Docker containers (sketched below)
- The control loop runs inside the server, not in agents
- Communication is HTTP; chat streaming uses SSE and WebSocket to clients
- State is centralized in PostgreSQL (chats, runs, deployments, files, logs, traces, KV)
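Concretely, an agent is just a process that answers HTTP on its assigned port. A minimal sketch in TypeScript, assuming Node and the endpoint paths listed under Communication below; the response payload shapes are illustrative, not the documented protocol:

```ts
import { createServer } from "node:http";

const port = Number(process.env.PORT ?? 3000);

createServer((req, res) => {
  if (req.method === "GET" && req.url === "/_agent/health") {
    // Liveness probe used by the server before routing traffic.
    res.writeHead(200, { "content-type": "application/json" });
    res.end(JSON.stringify({ ok: true }));
  } else if (req.method === "POST" && req.url === "/_agent/chat") {
    // Chat requests stream back over SSE; the chunk shape here is illustrative.
    res.writeHead(200, { "content-type": "text/event-stream" });
    res.write(`data: ${JSON.stringify({ type: "text", text: "hello" })}\n\n`);
    res.end();
  } else {
    res.writeHead(404);
    res.end();
  }
}).listen(port);
```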
## Server Components
- API server handles chats, agents, webhooks, files, logs, traces, and devhook routing
- WebSocket server is used for chat streaming and auth token handshakes
- Startup runs database migrations before accepting traffic
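The ordering matters: migrations finish before the listener binds, so no request ever sees a half-migrated schema. A sketch of that startup sequence, where both helper functions are hypothetical stand-ins for the server's own code:

```ts
async function runMigrations(): Promise<void> {
  // Stand-in: the real server applies pending PostgreSQL migrations here.
}

function startApiServer(port: number): void {
  // Stand-in: bind the API and WebSocket listeners on `port`.
}

async function main(): Promise<void> {
  await runMigrations();                             // 1. bring the schema up to date
  startApiServer(Number(process.env.PORT ?? 8080));  // 2. only then accept traffic
}

main().catch((err) => {
  console.error("startup failed before serving traffic:", err);
  process.exit(1);
});
```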
## Agent Execution Model
Agents are deployed as Docker containers using a configurable image (default: `ghcr.io/coder/blink-agent:latest`).
### Container Structure
Each agent container includes:

| Component | Purpose |
|---|---|
| Agent bundle | Built files staged into `/app` from the deployment output |
| Runtime wrapper | Starts the agent and internal API server, proxies requests, injects auth |
| Internal API server | Serves `/kv`, `/chat`, and `/otlp/v1/traces` for agent code and forwards to the Blink Server |
| OpenTelemetry Collector | Collects agent logs and forwards them to the server |
### Runtime Wiring (self-hosted)
On deployment the server:

- Downloads deployment output files, writes them to a temp dir, and adds a runtime wrapper (`__wrapper.js`).
- Launches a container and sets environment variables like `ENTRYPOINT`, `PORT`, `INTERNAL_BLINK_API_SERVER_URL`, `INTERNAL_BLINK_API_SERVER_LISTEN_PORT`, `BLINK_REQUEST_URL`, `BLINK_REQUEST_ID`, and `BLINK_DEPLOYMENT_TOKEN`.
- The wrapper starts an internal API server inside the container and patches `fetch` so internal API calls include `x-blink-internal-auth` (see the sketch after this list).
- The wrapper runs the agent entrypoint on `PORT+1` and proxies incoming requests on `PORT` to the agent.
- The OpenTelemetry collector starts and reads the agent log pipe.
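A condensed sketch of the wrapper's wiring, using the environment variables named above. The real `__wrapper.js` also runs the internal API server and the log pipe; the proxy here is deliberately simplified:

```ts
import { createServer, request } from "node:http";

const port = Number(process.env.PORT);
const agentPort = port + 1;
const internalUrl = process.env.INTERNAL_BLINK_API_SERVER_URL ?? "";

// Patch fetch so calls to the internal API carry x-blink-internal-auth.
const realFetch = globalThis.fetch;
globalThis.fetch = (input, init) => {
  const url = input instanceof Request ? input.url : String(input);
  if (internalUrl && url.startsWith(internalUrl)) {
    const headers = new Headers(init?.headers);
    headers.set("x-blink-internal-auth", process.env.BLINK_DEPLOYMENT_TOKEN ?? "");
    return realFetch(input, { ...init, headers });
  }
  return realFetch(input, init);
};

// Start the agent entrypoint on PORT+1 ...
process.env.PORT = String(agentPort);
await import(process.env.ENTRYPOINT ?? "./agent.js");

// ... and proxy inbound requests on PORT to it.
createServer((req, res) => {
  const upstream = request(
    { host: "127.0.0.1", port: agentPort, path: req.url ?? "/", method: req.method, headers: req.headers },
    (agentRes) => {
      res.writeHead(agentRes.statusCode ?? 502, agentRes.headers);
      agentRes.pipe(res);
    },
  );
  req.pipe(upstream);
}).listen(port);
```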
## Control Loop
The control loop is the core orchestration mechanism. It runs inside the server, not in agents.

### Request Flow
- External event arrives (API call, Slack message, GitHub webhook)
- Server routes the event to the appropriate agent deployment
- Server invokes the agent's `/_agent/chat` endpoint with an invocation token
- Agent processes the request and streams a response back (SSE)
- Server persists messages and run/step state to PostgreSQL and fans out to clients
### Chat Run Lifecycle
- Each chat run has one or more steps stored in the DB.
- The server selects the latest step, invokes the active deployment, and streams chunks as they arrive.
- If the response includes tool calls, the server creates a new step and continues the loop.
- Interrupts cancel an in-flight step and restart with the latest state.
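In pseudocode-level TypeScript, the lifecycle above reduces to a loop like the following; every helper is a hypothetical stand-in for the server's own persistence and HTTP layers:

```ts
interface Step { id: string }
interface StepResult { toolCalls: unknown[] }

// Hypothetical stand-ins for the server's DB and invocation code.
declare function loadLatestStep(runId: string): Promise<Step>;
declare function invokeActiveDeployment(step: Step): Promise<StepResult>; // POST /_agent/chat, streams SSE
declare function persistResultAndCreateStep(runId: string, result: StepResult): Promise<void>;

async function runChatLoop(runId: string): Promise<void> {
  for (;;) {
    const step = await loadLatestStep(runId);          // state lives in PostgreSQL
    const result = await invokeActiveDeployment(step); // chunks stream to clients as they arrive
    if (result.toolCalls.length === 0) return;         // no tool calls: the run is done
    await persistResultAndCreateStep(runId, result);   // new step, loop again
  }
}
```

Interrupts fit this shape naturally: cancel the in-flight invocation and re-enter the loop, which reloads the latest state from the database.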
### Streaming and Buffering
- The server broadcasts `message.chunk.added` events to WebSocket and SSE clients.
- The current streaming buffer is kept in memory to allow reconnects (sketched below).
- This in-memory session state is the main blocker for horizontal scaling today.
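A minimal sketch of that per-run buffer; the shape is hypothetical, and the point is that it lives in process memory:

```ts
type Chunk = { event: "message.chunk.added"; data: string };

class StreamSession {
  private buffer: Chunk[] = [];
  private clients = new Set<(chunk: Chunk) => void>();

  publish(chunk: Chunk): void {
    this.buffer.push(chunk);                      // retained for reconnecting clients
    for (const send of this.clients) send(chunk); // fan out to live WebSocket/SSE clients
  }

  subscribe(send: (chunk: Chunk) => void): () => void {
    for (const chunk of this.buffer) send(chunk); // replay what the client missed
    this.clients.add(send);
    return () => this.clients.delete(send);       // unsubscribe handle
  }
}
```

Because a run's buffer exists only in the process that started it, a reconnect landing on a different node would find nothing to replay; moving this state into a shared store is what horizontal scaling would require.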
### Chat Run Sequence
### Why the Control Loop is Server-Side
Running the control loop in the server rather than in agents provides:

- Centralized state in PostgreSQL
- Agent simplicity (no orchestration logic)
- Observability and auditability
- Consistent tool-call looping behavior
## Request Routing
For details on webhook routing and devhooks, see the webhooks and devhooks guide.

## Communication
### Server -> Agent
The server communicates with agents via HTTP:

| Endpoint | Method | Purpose |
|---|---|---|
| `/_agent/health` | GET | Health check |
| `/_agent/chat` | POST | Chat request, SSE response |
| `/_agent/capabilities` | GET | Check supported handlers |
| `/_agent/ui` | GET | UI schema for dynamic inputs |
| `/_agent/flush-otel` | POST | Flush telemetry buffers |
| `/_agent/*` | ANY | Custom request handler |
Sending messages uses `/sendMessages` or `/_agent/send-messages`.
All server -> agent calls include `x-blink-invocation-token`. Chat runs also include run, step, and chat ID headers.
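Sketched as a plain `fetch`, a chat invocation looks roughly like this. The token header name comes from the text above; the run/step/chat header names and the request body shape are not specified here, so they are omitted or illustrative:

```ts
async function invokeAgentChat(agentUrl: string, invocationToken: string): Promise<void> {
  const res = await fetch(`${agentUrl}/_agent/chat`, {
    method: "POST",
    headers: {
      "content-type": "application/json",
      "x-blink-invocation-token": invocationToken,
      // Chat runs also carry run, step, and chat ID headers (names not listed here).
    },
    body: JSON.stringify({ messages: [] }), // illustrative body shape
  });

  // The agent answers with an SSE stream; read it chunk by chunk.
  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    process.stdout.write(decoder.decode(value, { stream: true }));
  }
}
```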
### Agent -> Server
Agents do not call the public API directly from containers. Instead, the wrapper exposes an internal API server:

- `/kv` for agent key-value storage
- `/chat` for chat CRUD and message operations
- `/otlp/v1/traces` for trace export (logs are forwarded by the collector)
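From agent code these look like ordinary HTTP calls; the wrapper's patched `fetch` attaches `x-blink-internal-auth` automatically. A sketch with an illustrative payload, since the real `/kv` request format is not documented here:

```ts
const base = process.env.INTERNAL_BLINK_API_SERVER_URL ?? "";

async function saveCheckpoint(): Promise<void> {
  // The patched fetch injects x-blink-internal-auth for us.
  await fetch(`${base}/kv`, {
    method: "PUT", // illustrative; the real method and shape may differ
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ key: "last-checkpoint", value: new Date().toISOString() }),
  });
}
```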
## Data and Storage
PostgreSQL stores:

- chat messages, runs, and steps
- agents, deployments, and deployment targets
- files and attachments
- logs and traces (self-hosted)
## Limitations
Current architectural constraints to be aware of:

| Limitation | Details |
|---|---|
| Single node only | In-memory chat streaming buffers prevent horizontal scaling |
| Docker required | Agents must run as Docker containers (no Kubernetes, ECS, etc.) |
| Local Docker daemon | Server must have direct access to Docker socket |