Autonomous responses
Send messages and trigger LLM responses from the server without a human action. Use this for scheduled follow-ups, queue processing, email-triggered responses, and autonomous agent workflows.
In a typical chat flow, the user sends a message and the agent responds. But agents often need to act on their own — a scheduled reminder fires, a webhook arrives, a workflow completes, or the agent decides to continue after inspecting its own response.
The key primitives:
| Primitive | Role |
|---|---|
| saveMessages | Inject a message and trigger the LLM; the server-side equivalent of sendMessage |
| persistMessages | Store messages without triggering a response, for injecting context silently |
| onChatResponse | React when any response completes, including ones you did not initiate |
| isServerStreaming | Client-side flag: true when a server-initiated stream is active |
saveMessages persists messages to SQLite and triggers onChatMessage for a new LLM response. It is awaitable — after it returns, the LLM has responded and the message is persisted.
persistMessages stores messages and broadcasts them to connected clients, but does not trigger a model turn. Use it when you want to inject context (for example, a system message or background data) into the conversation without starting a response.
Use saveMessages when you control the trigger — schedule callbacks, webhooks, email handlers, or any method where you decide when to inject a message.
Use onChatResponse when you need to react to responses you did not trigger — user-initiated messages, auto-continuations after tool approvals, or any turn that the framework ran on your behalf.
Always call waitUntilStable() before reading this.messages or calling saveMessages from schedule callbacks, webhooks, email handlers, or other non-chat entry points.
waitUntilStable() waits until the conversation is fully stable:
- No active LLM stream in progress
- No pending client-tool interactions (tool results or approvals the user has not yet provided)
- No queued continuation turns
It returns true when stable, or false if the timeout expires before a pending interaction resolves. If nothing is pending, it returns immediately.

    const stable = await this.waitUntilStable({ timeout: 30_000 });
    if (!stable) {
      // The conversation is blocked on a user interaction or an in-flight
      // stream that did not complete within 30 seconds.
      console.warn("Conversation not stable, skipping server-driven message");
      return;
    }
    // Safe to read this.messages and call saveMessages.

Without this guard, you risk reading stale messages or overlapping with an in-flight stream.
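
The return contract described above (true when stable, false on timeout, immediate when nothing is pending) can be pictured with a small stand-alone model. This is only a sketch: waitUntil, isStable, and pollMs are hypothetical names, and the polling loop is an assumption made for illustration. It says nothing about how the framework implements waitUntilStable() internally.

```typescript
// Toy model of the waitUntilStable() return contract. Not the library's code.
type StabilityCheck = () => boolean;

async function waitUntil(
  isStable: StabilityCheck,
  opts: { timeout: number; pollMs?: number },
): Promise<boolean> {
  // Nothing pending: resolve immediately.
  if (isStable()) return true;

  const deadline = Date.now() + opts.timeout;
  const pollMs = opts.pollMs ?? 10;

  while (Date.now() < deadline) {
    await new Promise((resolve) => setTimeout(resolve, pollMs));
    if (isStable()) return true;
  }

  // The timeout expired while something was still pending.
  return false;
}
```

A caller treats a false result the same way as the guard above: log, skip the server-driven message, and try again on the next trigger.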
A daily digest agent that summarizes activity every morning. Cron schedules are idempotent by default, so calling schedule() in onStart is safe — it does not create duplicates across Durable Object restarts.

    import { AIChatAgent } from "@cloudflare/ai-chat";

    export class DigestAgent extends AIChatAgent {
      async onChatMessage() {
        // ... your LLM call
      }

      async onStart() {
        await this.schedule("0 9 * * *", "dailyDigest");
      }

      async dailyDigest() {
        const stable = await this.waitUntilStable({ timeout: 30_000 });
        if (!stable) {
          console.warn("Conversation not stable, skipping daily digest");
          return;
        }

        await this.saveMessages((messages) => [
          ...messages,
          {
            id: crypto.randomUUID(),
            role: "user",
            parts: [
              {
                type: "text",
                text: "Summarize what happened since your last digest.",
              },
            ],
            createdAt: new Date(),
          },
        ]);
        // At this point the LLM has responded and the message is persisted.
      }
    }

The function form of saveMessages, saveMessages((messages) => [...]), reads the latest persisted messages at execution time. This avoids stale baselines when multiple calls queue up (for example, rapid webhook arrivals).

Refer to Schedule tasks for more on schedule() and cron syntax.
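
To see why a stale baseline matters, here is a minimal in-memory sketch of the general hazard. ToyStore, enqueue, and drain are hypothetical names invented for this illustration. In this toy model an array update replaces the stored list wholesale, which is an assumption and may differ from how the real saveMessages reconciles messages.

```typescript
// Toy model of queued updates. Not the library's storage layer.
type Msg = { id: string; text: string };
type Update = Msg[] | ((latest: Msg[]) => Msg[]);

class ToyStore {
  private persisted: Msg[] = [];
  private pending: Update[] = [];

  snapshot(): Msg[] {
    return [...this.persisted];
  }

  enqueue(update: Update): void {
    this.pending.push(update);
  }

  // Apply queued updates in order; function updates see the latest state.
  drain(): Msg[] {
    for (const update of this.pending) {
      this.persisted =
        typeof update === "function" ? update(this.persisted) : update;
    }
    this.pending = [];
    return this.snapshot();
  }
}

// Array form: both callers captured the same (empty) baseline up front.
const staleStore = new ToyStore();
const baseline = staleStore.snapshot();
staleStore.enqueue([...baseline, { id: "a", text: "webhook A" }]);
staleStore.enqueue([...baseline, { id: "b", text: "webhook B" }]);
const staleResult = staleStore.drain();

// Function form: each update reads the list as it is when the update runs.
const freshStore = new ToyStore();
freshStore.enqueue((latest) => [...latest, { id: "a", text: "webhook A" }]);
freshStore.enqueue((latest) => [...latest, { id: "b", text: "webhook B" }]);
const freshResult = freshStore.drain();
```

In the toy model, staleResult ends up holding only message b because the second update was built from an outdated snapshot, while freshResult holds a followed by b.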
When you control the trigger, a simple loop is the clearest pattern:

    async processQueue() {
      for (const task of this.taskQueue) {
        const stable = await this.waitUntilStable({ timeout: 30_000 });
        if (!stable) {
          console.warn("Conversation not stable, stopping queue processing");
          break;
        }

        await this.saveMessages((messages) => [
          ...messages,
          {
            id: crypto.randomUUID(),
            role: "user",
            parts: [{ type: "text", text: task }],
            createdAt: new Date(),
          },
        ]);
        // LLM has responded. this.messages is updated. Next iteration.
      }
      this.taskQueue = [];
    }

No special hooks are needed: saveMessages returns after the full turn completes.
An email handler follows the same guard-then-inject pattern:

    async onEmail(email: AgentEmail) {
      const stable = await this.waitUntilStable({ timeout: 30_000 });
      if (!stable) {
        console.warn("Conversation not stable, cannot process email");
        return;
      }

      const subject = email.headers.get("subject") ?? "(no subject)";
      const body = await new Response(email.raw).text();

      await this.saveMessages((messages) => [
        ...messages,
        {
          id: crypto.randomUUID(),
          role: "user",
          parts: [
            {
              type: "text",
              text: `Email from ${email.from}: ${subject}\n\n${body}`,
            },
          ],
          createdAt: new Date(),
        },
      ]);
    }

A webhook handler does the same, and can report a busy agent or a failed turn through the HTTP status:

    async onRequest(request: Request): Promise<Response> {
      const url = new URL(request.url);

      if (url.pathname.endsWith("/webhook") && request.method === "POST") {
        const stable = await this.waitUntilStable({ timeout: 30_000 });
        if (!stable) {
          return new Response("Agent is busy", { status: 503 });
        }

        const payload = await request.json();

        try {
          await this.saveMessages((messages) => [
            ...messages,
            {
              id: crypto.randomUUID(),
              role: "user",
              parts: [
                {
                  type: "text",
                  text: `Webhook event: ${JSON.stringify(payload)}`,
                },
              ],
              createdAt: new Date(),
            },
          ]);
          return new Response("ok");
        } catch (error) {
          console.error("Failed to process webhook:", error);
          return new Response("Internal error", { status: 500 });
        }
      }

      return super.onRequest(request);
    }

Use persistMessages to add messages that the LLM will see on its next turn, without starting a turn now:

    async addBackgroundContext(data: string) {
      const stable = await this.waitUntilStable({ timeout: 30_000 });
      if (!stable) return;

      await this.persistMessages([
        ...this.messages,
        {
          id: crypto.randomUUID(),
          role: "user",
          parts: [{ type: "text", text: `[Background context]: ${data}` }],
          createdAt: new Date(),
        },
      ]);
      // Message is stored and broadcast to clients, but no LLM call happens.
    }

onChatResponse fires after every completed turn: user-initiated messages, saveMessages calls, and auto-continuations. Use it when you need to observe or react to responses regardless of how they were triggered.

    import { AIChatAgent, type ChatResponseResult } from "@cloudflare/ai-chat";

    export class ChatAgent extends AIChatAgent {
      async onChatMessage() {
        // ... your LLM call
      }

      protected async onChatResponse(result: ChatResponseResult) {
        if (result.status === "completed") {
          this.broadcast(JSON.stringify({ streaming: false }));
        }
      }
    }

The same hook can report every turn to an external service:

    protected async onChatResponse(result: ChatResponseResult) {
      try {
        await fetch("https://analytics.example.com/event", {
          method: "POST",
          body: JSON.stringify({
            requestId: result.requestId,
            status: result.status,
            continuation: result.continuation,
          }),
        });
      } catch (error) {
        console.error("Analytics reporting failed:", error);
      }
    }

An agent can inspect its own response and decide whether to continue. This works for user-initiated messages too: you cannot predict what the user will ask, but you can react to what the agent said.

    protected async onChatResponse(result: ChatResponseResult) {
      if (result.status !== "completed") return;

      const lastText = result.message.parts
        .filter((p) => p.type === "text")
        .map((p) => p.text)
        .join("");

      if (lastText.includes("[NEEDS_MORE_RESEARCH]")) {
        await this.saveMessages((messages) => [
          ...messages,
          {
            id: crypto.randomUUID(),
            role: "user",
            parts: [{ type: "text", text: "Continue your research." }],
            createdAt: new Date(),
          },
        ]);
      }
    }

When saveMessages is called from inside onChatResponse, the inner turn runs to completion and saveMessages returns. After the current onChatResponse call returns, the framework fires onChatResponse again for the inner response. This continues until no more work is queued. The framework never nests onChatResponse calls; results are drained sequentially.
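
The sequential drain can be pictured with a small synchronous model. Everything here is hypothetical (ToyAgent, runTurn, and the draining flag are invented for illustration), and the real framework is asynchronous. The sketch only shows the ordering: a turn started from inside the hook queues its result, and the hook for that result runs after the current hook call has returned.

```typescript
// Toy model of non-nested response hooks. Not the framework's implementation.
type ToyResult = { requestId: string; text: string };

class ToyAgent {
  log: string[] = [];
  prompts: string[] = [];
  private results: ToyResult[] = [];
  private draining = false;
  private turn = 0;

  // Stands in for saveMessages: runs a turn to completion and queues its result.
  runTurn(prompt: string): void {
    this.prompts.push(prompt);
    this.turn += 1;
    const text = this.turn < 3 ? "[NEEDS_MORE_RESEARCH]" : "done";
    this.results.push({ requestId: `turn-${this.turn}`, text });
    this.drain();
  }

  private drain(): void {
    // A hook is already running: the outer loop will pick this result up.
    if (this.draining) return;
    this.draining = true;
    try {
      while (this.results.length > 0) {
        const result = this.results.shift()!;
        this.log.push(`enter ${result.requestId}`);
        this.onResponse(result);
        this.log.push(`exit ${result.requestId}`);
      }
    } finally {
      this.draining = false;
    }
  }

  // Stands in for onChatResponse: continue while the marker is present.
  private onResponse(result: ToyResult): void {
    if (result.text.includes("[NEEDS_MORE_RESEARCH]")) {
      this.runTurn("Continue your research.");
    }
  }
}

const toyAgent = new ToyAgent();
toyAgent.runTurn("Research the topic.");
```

After the run, toyAgent.log lists each enter entry immediately followed by its matching exit entry, for turn-1, then turn-2, then turn-3. No hook call starts while another is still open.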
When queue items can be added by external events (user messages, webhooks) at any time, onChatResponse lets you drain the queue after every response regardless of who triggered it:

    protected async onChatResponse(result: ChatResponseResult) {
      if (result.status === "completed" && this.taskQueue.length > 0) {
        const next = this.taskQueue.shift()!;
        await this.saveMessages((messages) => [
          ...messages,
          {
            id: crypto.randomUUID(),
            role: "user",
            parts: [{ type: "text", text: next }],
            createdAt: new Date(),
          },
        ]);
      }
    }

The ChatResponseResult object passed to onChatResponse has these fields:

| Field | Type | Description |
|---|---|---|
| message | UIMessage | The finalized assistant message |
| requestId | string | Unique ID for this turn |
| continuation | boolean | true if this was an auto-continuation |
| status | "completed" \| "error" \| "aborted" | How the turn ended |
| error | string \| undefined | Error details when status is "error" |
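
As a rough illustration of branching on these fields, the sketch below defines local stand-in types that mirror the table and a helper that turns a result into a log line. ToyChatResponseResult, ToyMessage, and describeTurn are hypothetical names. The real ChatResponseResult and UIMessage types come from the library and may carry more detail than shown here.

```typescript
// Local stand-ins mirroring the fields table. Assumptions, not the exported types.
type ToyPart = { type: string; text?: string };
type ToyMessage = { id: string; role: "assistant"; parts: ToyPart[] };

type ToyChatResponseResult = {
  message: ToyMessage;
  requestId: string;
  continuation: boolean;
  status: "completed" | "error" | "aborted";
  error?: string;
};

// Summarize how a turn ended, suitable for logging or analytics.
function describeTurn(result: ToyChatResponseResult): string {
  const origin = result.continuation ? "auto-continuation" : "new turn";
  switch (result.status) {
    case "completed": {
      const text = result.message.parts
        .filter((p) => p.type === "text")
        .map((p) => p.text ?? "")
        .join("");
      return `${result.requestId} (${origin}) completed with ${text.length} characters`;
    }
    case "error":
      return `${result.requestId} (${origin}) failed: ${result.error ?? "unknown error"}`;
    case "aborted":
      return `${result.requestId} (${origin}) was aborted`;
  }
}
```

Checking status first keeps follow-up logic, such as draining a queue, from running after a failed or aborted turn.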
When the server triggers a stream via saveMessages, the AI SDK's status stays "ready" because the client did not initiate the request. The useAgentChat hook provides two additional flags to handle this:
| Flag | What it tracks |
|---|---|
| status | AI SDK lifecycle: "submitted", "streaming", "ready", "error" (client-initiated requests only) |
| isServerStreaming | true when a server-initiated stream is active |
| isStreaming | true when either client or server streaming is active; use this for a universal indicator |
Use isStreaming for most UI concerns (disabling the send button, showing a loading indicator). Use isServerStreaming only when you need to distinguish between user-initiated and server-initiated streams (for example, to show a different indicator like "Agent is working in the background...").

    import { useAgent } from "agents/react";
    import { useAgentChat } from "@cloudflare/ai-chat/react";

    function Chat() {
      const agent = useAgent({ agent: "ChatAgent" });
      const { messages, sendMessage, isStreaming, isServerStreaming } =
        useAgentChat({ agent });

      return (
        <div>
          {messages.map((m) => (
            <div key={m.id}>{/* render message */}</div>
          ))}

          {isServerStreaming && <div>Agent is working in the background...</div>}
          {!isServerStreaming && isStreaming && <div>Agent is responding...</div>}

          <form
            onSubmit={(e) => {
              e.preventDefault();
              const input = e.currentTarget.elements.namedItem(
                "input",
              ) as HTMLInputElement;
              sendMessage({ text: input.value });
              input.value = "";
            }}
          >
            <input name="input" placeholder="Type a message..." />
            <button type="submit" disabled={isStreaming}>
              Send
            </button>
          </form>
        </div>
      );
    }

When a server-driven response arrives while the user is idle, connected clients see the new messages appear in real time. The isStreaming flag transitions from false to true to false as the stream runs, so UI elements like the send button automatically disable and re-enable.
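
If the indicator logic grows, it can help to pull it out of the JSX into a pure function that is easy to unit test. The sketch below is one possible shape. StreamFlags, indicatorLabel, and canSend are hypothetical helpers, and the only assumption about the hook is that it exposes the two boolean flags described above.

```typescript
// Hypothetical helpers that mirror the conditional rendering in the component.
type StreamFlags = { isStreaming: boolean; isServerStreaming: boolean };

// Pick the status text to show, or null when the agent is idle.
function indicatorLabel(flags: StreamFlags): string | null {
  if (flags.isServerStreaming) return "Agent is working in the background...";
  if (flags.isStreaming) return "Agent is responding...";
  return null;
}

// The send button is enabled only when no stream of either kind is active.
function canSend(flags: StreamFlags): boolean {
  return !flags.isStreaming;
}
```

Checking isServerStreaming first matters because isStreaming is also true during a server-initiated stream.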
The messageConcurrency setting on AIChatAgent controls how overlapping user submissions behave ("queue", "latest", "merge", "drop", "debounce"). This setting only applies to sendMessage() — user-initiated messages from the client.
saveMessages() always uses serialized (queued) behavior regardless of the messageConcurrency setting. This means server-driven messages never get dropped, merged, or debounced — they always queue up and execute in order.
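
Serialized behavior means each call starts only after the previous one has finished, in arrival order. A common way to get that ordering in plain TypeScript is a promise chain, sketched below. SerialQueue and demo are hypothetical names and this is not how the framework is known to implement its queue. It only illustrates the ordering guarantee.

```typescript
// Toy serial queue: tasks run one at a time, in the order they were enqueued.
class SerialQueue {
  private tail: Promise<void> = Promise.resolve();

  enqueue<T>(task: () => Promise<T>): Promise<T> {
    const run = this.tail.then(task);
    // Keep the chain alive even if a task rejects.
    this.tail = run.then(
      () => undefined,
      () => undefined,
    );
    return run;
  }
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function demo(): Promise<string[]> {
  const order: string[] = [];
  const queue = new SerialQueue();

  await Promise.all([
    // Arrives first but takes longer to finish.
    queue.enqueue(async () => {
      await sleep(20);
      order.push("slow webhook");
    }),
    // Arrives second and would finish first if the two ran concurrently.
    queue.enqueue(async () => {
      order.push("fast webhook");
    }),
  ]);

  return order;
}
```

In this model the slow task still completes before the fast one starts, so nothing is dropped, merged, or reordered.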
| Primitive | How to combine |
|---|---|
| schedule() | Schedule a callback that calls saveMessages; see the cron example above |
| queue() | Queue a method that calls saveMessages for deferred processing |
| runWorkflow() | Start a Workflow; use AgentWorkflow.agent RPC to call a method that triggers saveMessages |
| onEmail() | Convert email content to a chat message and call saveMessages |
| onRequest() | Handle webhooks and call saveMessages |
| this.broadcast() | Broadcast custom state from onChatResponse |
- saveMessages is awaitable. After it returns, the LLM has responded and the message is persisted. Use this when you control the trigger.
- Use the function form of saveMessages. saveMessages((messages) => [...messages, newMsg]) reads the latest persisted messages at execution time, avoiding stale baselines when multiple calls queue up.
- persistMessages does not trigger a response. Use it to inject context or system messages silently.
- onChatResponse is for reacting to turns you did not initiate. Use it for user-initiated messages, auto-continuations, or any turn where you did not call saveMessages yourself.
- onChatResponse does not nest. When saveMessages is called from inside onChatResponse, the inner turn completes and onChatResponse fires again sequentially, not recursively.
- Messages are persisted before onChatResponse fires. If the Durable Object evicts during the hook, the conversation is safe in SQLite; only the hook callback is lost.
- waitUntilStable() before injecting. Always call this from schedule callbacks, webhooks, or other non-chat entry points to avoid overlapping with an in-flight stream or pending tool interaction.
- The client sees the completed response before onChatResponse runs. The server-side hook does not delay the client.
- messageConcurrency does not affect saveMessages. Server-driven messages always queue and execute in order.