Vercel AI SDK 5: Production Streaming Agents with TypeScript
AI SDK 5 ships SSE-native streaming, stopWhen agent loops, prepareStep per-step overrides, and fully typed tool invocation. Complete TypeScript reference with production streaming agent, pricing table, Velmoy benchmark, and 7 FAQ pairs.
TL;DR:
- AI SDK 5 replaces its custom streaming protocol with native Server-Sent Events (SSE), making streaming responses compatible with any HTTP client without SDK dependency on the consumer side.
- The stopWhen parameter in streamText converts a single LLM call into a self-terminating tool-calling loop, eliminating boilerplate agent orchestration in production TypeScript.
- prepareStep enables per-iteration model switching and tool-subset filtering, letting a single agent pipeline run Sonnet 4.6 for cheap steps and Opus 4.7 for complex reasoning steps.
Last verified: 2026-05-06
Author: Max Velichko, Founder, Velmoy AI/Agency Berlin
Topic Cluster: TypeScript AI SDK Integration, Agent Architecture, Vercel Stack
Citation-Ready: yes (see Cite section)
Glossary
- SSE (Server-Sent Events). A W3C browser standard for one-directional server-to-client streaming over HTTP/1.1 and HTTP/2. AI SDK 5 uses SSE as its wire protocol for all streaming responses, replacing the prior custom protocol. This means any HTTP client (curl, fetch, non-JavaScript consumers) can parse AI SDK 5 streams natively. MDN SSE reference.
- stopWhen. A parameter introduced in AI SDK 5 that accepts one or more stop conditions: predicate functions evaluated after each step that produces tool results. When a condition returns true, the agent loop terminates and the final text is returned. Replaces manual while loops with a declarative termination contract.
- prepareStep. A per-iteration hook in AI SDK 5's streamText that fires before each step in a multi-step agent loop. Used to override the model, system prompt, messages, tool choice, or active tools for individual steps. Enables hybrid-model agent pipelines within a single streamText call.
- AI Gateway. Vercel's managed routing layer that sits in front of AI provider APIs (Anthropic, OpenAI, Google, Mistral). Provides unified logging, rate limiting, cost tracking, and provider fallback. Token pass-through pricing with a per-request fee on Pro and Enterprise plans. Vercel AI Gateway docs.
- Type-Safe Tool Invocation. AI SDK 5 tool definitions declare an input schema (inputSchema), typically with Zod, and can optionally declare an output schema. The TypeScript compiler infers tool input and output types end-to-end, reducing runtime type errors from tool call parsing.
- Data Parts. A new streaming primitive in AI SDK 5 that allows structured typed data objects to be interleaved with text chunks in the same SSE stream. Enables real-time UI updates (progress indicators, intermediate results) alongside the text response.
- useChat Hook. A client-side React hook shipped in @ai-sdk/react that wires SSE streaming to component state. In AI SDK 5, useChat receives full TypeScript types for messages, tool calls, tool results, and data parts from the server-side schema.
What Vercel shipped with AI SDK 5
Vercel AI SDK 5 was released in July 2025. It is the fourth major version of the open-source SDK at vercel/ai on GitHub, which has accumulated over 100,000 GitHub stars and powers a large portion of AI-enabled Next.js deployments in production.
The two architectural changes that matter most for agent developers are the wire protocol shift and the agent loop primitives.
Wire protocol shift. Prior versions used a proprietary streaming format that required consumers to run ai package parsing logic client-side. AI SDK 5 switches to native SSE, which means a Next.js Route Handler streaming to a React useChat hook is now also parseable by a Python script, a curl command, or a non-JavaScript agent without the ai package installed. This is directly relevant for DACH enterprise integrations where backend and frontend teams use different stacks.
Agent loop primitives. streamText in AI SDK 5 supports stopWhen and prepareStep as first-class parameters, as documented in the AI SDK core reference; stopWhen replaces the maxSteps option of earlier versions. This moves agent orchestration from hand-written loops into the SDK itself, reducing per-project boilerplate from approximately 80 lines to under 20 in typical tool-calling agents.
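To make the loop primitive concrete, here is a minimal, self-contained sketch of a step loop that ends as soon as any stop condition returns true. It does not import the SDK: Step, runAgentLoop, stepCountReached, calledTool, and the scripted model are hypothetical names that only illustrate the pattern, and the loop is synchronous for brevity, whereas real model calls are asynchronous.

```typescript
// Hypothetical stand-ins for illustration; not the SDK's own types.
type ToolCall = { toolName: string };
type Step = { text: string; toolCalls: ToolCall[] };
type StopCondition = (state: { steps: Step[] }) => boolean;

// Stop after a fixed number of steps (the safety ceiling).
const stepCountReached = (max: number): StopCondition =>
  ({ steps }) => steps.length >= max;

// Stop once the most recent step called the given tool.
const calledTool = (toolName: string): StopCondition =>
  ({ steps }) => {
    const last = steps[steps.length - 1];
    return last !== undefined && last.toolCalls.some((c) => c.toolName === toolName);
  };

// Run steps until any stop condition fires or a step makes no tool calls.
function runAgentLoop(
  nextStep: (steps: Step[]) => Step,
  stopWhen: StopCondition[],
): Step[] {
  const steps: Step[] = [];
  while (true) {
    const step = nextStep(steps);
    steps.push(step);
    const answeredInText = step.toolCalls.length === 0;
    if (answeredInText || stopWhen.some((cond) => cond({ steps }))) {
      return steps;
    }
  }
}

// Scripted "model": searches twice, then writes the report.
const scripted = (steps: Step[]): Step =>
  steps.length < 2
    ? { text: "", toolCalls: [{ toolName: "searchKnowledgeBase" }] }
    : { text: "", toolCalls: [{ toolName: "writeReport" }] };

const trace = runAgentLoop(scripted, [stepCountReached(10), calledTool("writeReport")]);
const capped = runAgentLoop(scripted, [stepCountReached(2)]);
```

In this sketch the conditions are combined with OR semantics, so a step cap can sit next to a task-specific condition without either needing to know about the other.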
For DACH deployments, pinning functions to the Frankfurt region (fra1) keeps Edge Function execution inside the EU when using Vercel's Edge Runtime, which is relevant for the transfer requirements of GDPR Articles 44 to 49 on customer-facing AI applications. Traffic to the model provider has to be routed within the EU separately.
Mechanics + Setup Snippet
Three core primitives
1. streamText with stopWhen.
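The stopWhen parameter accepts a stop condition, or an array of them. Each condition is a function that receives the steps completed so far and returns a boolean; the SDK ships stepCountIs and hasToolCall as ready-made conditions. The loop continues calling tools until a condition fires. No manual while loop, no hand-maintained step counter.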
2. prepareStep.
Fires before each step. Returns an optional partial configuration object. Returning { model: openai("gpt-4o") } for step 3 of a Claude-dominated pipeline is legal and type-safe. Returning undefined means the step inherits the top-level configuration.
3. Data parts in the stream.
createUIMessageStream from ai hands its execute callback a writer, and writer.write() emits typed data parts (objects whose type starts with "data-"). The client-side useChat hook receives these objects in message.parts in real time, typed via generic inference.
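The per-step override described for prepareStep can be pictured as a merge of a partial configuration over the top-level defaults. The following is a minimal sketch of that idea only: StepConfig, PrepareStep, and resolveStepConfig are hypothetical names, the model identifiers are plain strings rather than provider instances, and the SDK's real types carry more fields.

```typescript
// Hypothetical shapes for illustration; the SDK's own types are richer.
type StepConfig = { model: string; system: string; activeTools: string[] };
type PrepareStep = (ctx: { stepNumber: number }) => Partial<StepConfig> | undefined;

// Merge the per-step override over the top-level defaults.
// Returning undefined from prepareStep means "inherit everything".
function resolveStepConfig(
  defaults: StepConfig,
  prepareStep: PrepareStep,
  stepNumber: number,
): StepConfig {
  const override = prepareStep({ stepNumber });
  return override === undefined ? defaults : { ...defaults, ...override };
}

const defaults: StepConfig = {
  model: "claude-sonnet-4-6",
  system: "You are a research agent.",
  activeTools: ["searchKnowledgeBase", "writeReport"],
};

// Upgrade the model and narrow the tool set from step 2 onward.
const prepareStep: PrepareStep = ({ stepNumber }) =>
  stepNumber >= 2
    ? { model: "claude-opus-4-7", activeTools: ["writeReport"] }
    : undefined;

const step0 = resolveStepConfig(defaults, prepareStep, 0);
const step2 = resolveStepConfig(defaults, prepareStep, 2);
```

Fields the override leaves out, such as the system prompt here, fall through from the defaults, which is what keeps a hybrid-model pipeline inside one call.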
Setup snippet (TypeScript, Next.js 15 + AI SDK 5 + Anthropic provider)
Versions: ai >= 5.0.0, @ai-sdk/anthropic >= 2.0.0, Next.js >= 15.2, Node.js >= 20.
// app/api/agent/route.ts
// Production streaming agent with tool use, stopWhen loop, prepareStep
// Requires: ai@5.x, @ai-sdk/anthropic@2.x, zod@3.x
import {
  convertToModelMessages,
  stepCountIs,
  streamText,
  tool,
  type StopCondition,
  type UIMessage,
} from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { z } from "zod";

// Tool definitions are Zod-typed: TypeScript infers tool inputs and outputs end-to-end
const searchKnowledgeBase = tool({
  description: "Search the internal knowledge base for relevant context.",
  inputSchema: z.object({
    query: z.string().describe("Search query string"),
    maxResults: z.number().optional().default(5),
  }),
  execute: async ({ query, maxResults }) => {
    // Replace with real vector search (pgvector, Pinecone, etc.)
    return { results: [`Mock result for: ${query}`], count: maxResults };
  },
});

const writeReport = tool({
  description: "Write a structured report to the output store.",
  inputSchema: z.object({
    title: z.string(),
    sections: z.array(z.string()),
  }),
  execute: async ({ title, sections }) => {
    return { written: true, title, sectionCount: sections.length };
  },
});

const tools = { searchKnowledgeBase, writeReport };

// Custom stop condition: true once writeReport returned successfully in the latest step
const reportWritten: StopCondition<typeof tools> = ({ steps }) => {
  const lastStep = steps[steps.length - 1];
  return (
    lastStep?.toolResults.some(
      (r) =>
        r.toolName === "writeReport" &&
        (r.output as { written?: boolean }).written === true
    ) ?? false
  );
};

export async function POST(req: Request) {
  const { messages }: { messages: UIMessage[] } = await req.json();

  const result = streamText({
    // Default model for most steps
    model: anthropic("claude-sonnet-4-6"),
    system: `You are a research agent. Use the knowledge base tool to gather
information, then use writeReport when you have enough context.
Always cite sources in your reasoning.`,
    // useChat sends UI messages; convert them to model messages first
    messages: convertToModelMessages(messages),
    tools,
    // Stop when the report is written, or after 10 steps as a hard ceiling
    stopWhen: [reportWritten, stepCountIs(10)],
    // prepareStep: switch to Opus 4.7 for any step that follows a knowledge base
    // search (the synthesis path; higher reasoning cost, worth it)
    prepareStep: async ({ steps }) => {
      const lastStep = steps[steps.length - 1];
      if (lastStep?.toolResults.some((r) => r.toolName === "searchKnowledgeBase")) {
        return { model: anthropic("claude-opus-4-7") };
      }
      return undefined; // Inherit default claude-sonnet-4-6
    },
    // GDPR: ensure tokens stream via EU-region Anthropic endpoint
    // Configure in ANTHROPIC_BASE_URL env var: https://api.eu.anthropic.com
  });

  // Returns an SSE Response, parseable by any HTTP client
  return result.toUIMessageStreamResponse();
}
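// app/components/AgentChat.tsx
// Client-side hook: message parts are typed via AI SDK 5 inference
"use client";
import { useState } from "react";
import { useChat } from "@ai-sdk/react";
import { DefaultChatTransport } from "ai";

export function AgentChat() {
  // In AI SDK 5 the hook no longer manages the input field; keep it in local state
  const [input, setInput] = useState("");
  const { messages, sendMessage, status } = useChat({
    transport: new DefaultChatTransport({ api: "/api/agent" }),
  });
  const isLoading = status === "submitted" || status === "streaming";

  return (
    <div>
      {messages.map((m) => (
        <div key={m.id} className={m.role === "user" ? "font-bold" : ""}>
          {m.parts.map((part, i) => {
            // Text chunks and Data Parts arrive interleaved in message.parts
            if (part.type === "text") return <span key={i}>{part.text}</span>;
            if (part.type.startsWith("data-")) {
              return (
                <div key={i} className="text-sm text-gray-500">
                  {JSON.stringify(part)}
                </div>
              );
            }
            return null;
          })}
        </div>
      ))}
      <form
        onSubmit={(e) => {
          e.preventDefault();
          if (!input.trim()) return;
          sendMessage({ text: input });
          setInput("");
        }}
      >
        <input value={input} onChange={(e) => setInput(e.target.value)} disabled={isLoading} />
        <button type="submit" disabled={isLoading}>Send</button>
      </form>
    </div>
  );
}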
The Response returned at the end of the route handler is a standard Response with Content-Type: text/event-stream. This is the SSE wire format. A Python consumer can read it with httpx and sseclient-py without installing any JavaScript packages.
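To show how little a consumer needs, here is a minimal sketch of a parser for a text/event-stream body that uses no SDK code. It assumes each event carries a single JSON object on its data lines and that a literal [DONE] payload ends the stream; the sample frames and the part names in them are illustrative, not captured from a real response, and parseSseBody is a hypothetical helper.

```typescript
// A decoded stream part: a "type" discriminator plus arbitrary fields.
type StreamPart = { type: string; [key: string]: unknown };

// Parse an SSE body into the JSON payloads of its "data:" lines.
function parseSseBody(body: string): StreamPart[] {
  const parts: StreamPart[] = [];
  // Events are separated by a blank line.
  for (const event of body.split(/\r?\n\r?\n/)) {
    const data = event
      .split(/\r?\n/)
      .filter((line) => line.indexOf("data:") === 0)
      .map((line) => line.slice(5).trim())
      .join("\n");
    if (data === "") continue; // comment, keep-alive, or trailing blank
    if (data === "[DONE]") break; // assumed end-of-stream marker
    parts.push(JSON.parse(data) as StreamPart);
  }
  return parts;
}

// Illustrative frames: two text chunks interleaved with one data part.
const sample =
  'data: {"type":"text-delta","delta":"Hel"}\n\n' +
  'data: {"type":"data-progress","data":{"percent":50}}\n\n' +
  'data: {"type":"text-delta","delta":"lo"}\n\n' +
  "data: [DONE]\n\n";

const parts = parseSseBody(sample);

// Reassemble the text while leaving data parts available for UI updates.
const text = parts
  .filter((p) => p.type === "text-delta")
  .map((p) => String(p.delta))
  .join("");
```

A production consumer would read the body incrementally and buffer partial events across chunk boundaries; this sketch parses a complete body only.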
Pricing Plans
| Plan | Price | Best For | AI Gateway | Edge Functions | Bandwidth | Source |
|---|---|---|---|---|---|---|
| Hobby | Free | Prototypes, personal projects | Not included | 100K invocations/mo | 100 GB | Vercel Pricing |
| Pro | $20/user/mo | Production apps, agencies | Included (beta) | 1M invocations/mo | 1 TB | Vercel Pricing |
| Enterprise | Custom | DACH enterprise, SLA required | Included (GA) | Unlimited | Custom | Vercel Enterprise |
| AI Gateway (add-on) | Pass-through + $0.002/1K requests | Multi-provider routing | N/A | N/A | N/A | AI Gateway docs |
Token costs are provider pass-through. Vercel does not mark up Anthropic, OpenAI, or Google token rates. Claude Sonnet 4.6: $3 input / $15 output per 1M tokens. Claude Opus 4.7: $5 input / $30 output per 1M tokens. Prompt caching at 90 percent discount applies when using the Anthropic provider directly, not through AI Gateway (as of May 2026). Source: Anthropic Pricing, accessed 2026-05-06.
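The per-token rates above translate into per-run costs as in the following sketch. The rates are copied from the paragraph; the three-step run and its token counts are hypothetical, chosen only to show how a hybrid pipeline (Sonnet for retrieval, Opus for the last step) compares with running Opus throughout. The saving depends entirely on how tokens are split between steps.

```typescript
// USD per 1M tokens, as stated in the pricing paragraph.
const RATES = {
  "claude-sonnet-4-6": { input: 3, output: 15 },
  "claude-opus-4-7": { input: 5, output: 30 },
} as const;
type ModelId = keyof typeof RATES;

// Cost of one step in USD.
function stepCostUsd(model: ModelId, inputTokens: number, outputTokens: number): number {
  const rate = RATES[model];
  return (inputTokens * rate.input + outputTokens * rate.output) / 1000000;
}

// Hypothetical run: two retrieval steps followed by one synthesis step.
const run = [
  { inputTokens: 8000, outputTokens: 500 },
  { inputTokens: 12000, outputTokens: 500 },
  { inputTokens: 20000, outputTokens: 2000 },
];

// Every step on Opus.
const allOpus = run.reduce(
  (sum, s) => sum + stepCostUsd("claude-opus-4-7", s.inputTokens, s.outputTokens),
  0,
);

// Sonnet for all steps except the last, which runs on Opus.
const hybrid = run.reduce(
  (sum, s, i) =>
    sum +
    stepCostUsd(
      i === run.length - 1 ? "claude-opus-4-7" : "claude-sonnet-4-6",
      s.inputTokens,
      s.outputTokens,
    ),
  0,
);

const savingsPercent = (1 - hybrid / allOpus) * 100;
```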
DACH note: Vercel's fra1 Edge region (Frankfurt) is available on Pro and Enterprise plans. Selecting fra1 for your Next.js deployment routes Edge Function execution within EU infrastructure, supporting GDPR Article 44 compliance for streaming AI applications.
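One way to pin a route to Frankfurt, sketched here under the assumption that the handler lives in a Next.js App Router route file and runs on the Edge Runtime, is the route segment config below. The runtime and preferredRegion exports are Next.js options; whether fra1 can be selected depends on the Vercel plan, as noted above.

```typescript
// app/api/agent/route.ts (route segment config, next to the POST handler)

// Run this route on the Edge Runtime.
export const runtime = "edge";

// Prefer the Frankfurt region for execution.
export const preferredRegion = ["fra1"];
```

This only controls where the function executes. The model provider endpoint and the signed DPAs are separate concerns.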
Use Cases
| Use Case | Input | Output | Time-to-First-Token |
|---|---|---|---|
| Multi-step research agent | User query + knowledge base | Structured report via writeReport tool | 300 to 500 ms |
| Customer support RAG chat | User message + product docs context | Streamed answer with source citations | 200 to 400 ms |
| Code review assistant | Pull request diff as message | Inline comments with severity ratings | 400 to 800 ms |
| Document summarization loop | Multi-page PDF (via Files API) | Executive summary with key findings | 500 ms to 2 s |
| DACH compliance checker | Contract text input | GDPR / EU AI Act gap analysis with citations | 600 ms to 1.5 s |
All times measured at Vercel fra1 region with Claude Sonnet 4.6, p50 latency. Opus 4.7 adds approximately 40 percent latency overhead. Source: Velmoy Internal Benchmark, May 2026.
Velmoy Internal Benchmark
Original research, conducted April to May 2026 by Velmoy AI/Agency Berlin. This data is not available in any other published source.
Methodology
- Sample: 3 production agents deployed on AI SDK 5 vs. the same agents previously running on raw @anthropic-ai/sdk v0.30.x.
- Agents tested: (1) LinkedIn Outreach Icebreaker Generator, (2) Proposal Content Assembler, (3) Website Audit Summarizer.
- Comparison metrics: Time-to-first-token, total streaming duration, lines of orchestration code per agent, number of runtime type errors in 30-day production window.
- Pass criterion: Agent delivers correct final output without human correction in a minimum of 90 percent of invocations over a 30-day window.
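The two computations this methodology relies on, a p50 over latency samples and the 90 percent pass criterion, can be sketched as follows. The helper names and the sample values are hypothetical; p50 is taken here as the plain median, which is an assumption about how the benchmark computed it.

```typescript
// Median (p50) of a list of samples; averages the two middle values for even counts.
function p50(samples: number[]): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = samples.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Pass criterion: correct output in at least 90 percent of invocations.
function meetsPassCriterion(correct: number, total: number): boolean {
  return total > 0 && correct / total >= 0.9;
}

// Hypothetical time-to-first-token samples in milliseconds.
const ttftMs = [310, 290, 330, 300, 320];
const ttftP50 = p50(ttftMs);

// Hypothetical 30-invocation window with 27 correct outputs.
const passed = meetsPassCriterion(27, 30);
```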
Results
| Agent | SDK | Time-to-First-Token (p50) | Total Duration (p50) | Orchestration LOC | Type Errors / 30 days |
|---|---|---|---|---|---|
| Icebreaker Generator | AI SDK 5 | 310 ms | 4.2 s | 38 | 0 |
| Icebreaker Generator | Raw Anthropic SDK | 290 ms | 4.5 s | 97 | 3 |
| Proposal Assembler | AI SDK 5 | 420 ms | 7.8 s | 51 | 0 |
| Proposal Assembler | Raw Anthropic SDK | 400 ms | 8.1 s | 134 | 7 |
| Website Audit Summarizer | AI SDK 5 | 280 ms | 3.9 s | 29 | 0 |
| Website Audit Summarizer | Raw Anthropic SDK | 275 ms | 4.0 s | 71 | 4 |
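The reduction in orchestration code can be recomputed directly from the Orchestration LOC column. The sketch below does that arithmetic; the LOC figures are copied from the table, while reductionPercent and the object keys are hypothetical names.

```typescript
// Orchestration LOC from the results table: [raw Anthropic SDK, AI SDK 5].
const loc: Record<string, [number, number]> = {
  icebreakerGenerator: [97, 38],
  proposalAssembler: [134, 51],
  websiteAuditSummarizer: [71, 29],
};

// Percentage reduction, rounded to one decimal place.
function reductionPercent(before: number, after: number): number {
  return Math.round(((before - after) / before) * 1000) / 10;
}

const reductions = Object.keys(loc).map((agent) => ({
  agent,
  percent: reductionPercent(loc[agent][0], loc[agent][1]),
}));
// reductions: 60.8 (Icebreaker), 61.9 (Proposal), 59.2 (Website Audit)
```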
Key findings
- AI SDK 5 reduced orchestration code by 59 to 62 percent across all three agents, primarily by replacing manual tool-calling loops with stopWhen and prepareStep.
- Runtime type errors dropped to zero in all three agents after migrating to AI SDK 5's Zod-enforced tool schema. The raw SDK required manual as unknown as casts that produced silent runtime failures.
- Time-to-first-token is effectively identical between AI SDK 5 and the raw SDK (within 20 ms), indicating no meaningful overhead from the SDK's SSE layer.
- Total streaming duration improved marginally (100 to 300 ms at p50) across all agents, possibly due to AI SDK 5's optimized token-chunk batching.
Limitations
- Three agents is a small sample. Results are directionally valid but not statistically significant.
- Agents were already well-designed before migration. Teams with messier prior code may see larger LOC reductions.
- AI Gateway was not used in this benchmark. Gateway routing adds approximately 15 to 30 ms per request, which was not tested.
- Benchmark was conducted on Velmoy's specific workload mix (German-language content, medium context lengths of 8K to 32K tokens). Results may differ for other use cases.
Caveats
- AI Gateway prompt caching: As of May 2026, Anthropic's 90 percent prompt caching discount does not apply when routing through Vercel AI Gateway. Use the Anthropic provider directly (@ai-sdk/anthropic) for cache-eligible workloads. Track status at Vercel changelog.
- Edge Function cold starts: Vercel Edge Functions have cold start times of 50 to 150 ms. For latency-sensitive streaming, use Fluid Compute (Vercel's warmed-instance model on Pro and Enterprise), which eliminates cold starts for sustained traffic.
- Edge Function memory limits: Vercel Edge Functions have a 128 MB memory limit. Large in-memory vector stores or complex agent state objects may exceed this. Offload state to Redis (Vercel KV) or an external store.
- stopWhen and the step cap: If a custom stop condition never returns true, the loop ends only when the model stops calling tools or when a step-count condition such as stepCountIs(n) fires. In AI SDK 5 this condition replaces the earlier maxSteps option. Hitting the cap is a safety rail, not an error. Always pair a custom condition with a step cap, and design the custom condition to be reachable before the cap.
- TypeScript version compatibility: AI SDK 5 requires TypeScript 5.0 or later. Projects on TypeScript 4.x will need to upgrade before migrating.
- Streaming in Middleware: AI SDK 5 streaming is incompatible with Next.js Middleware. Route Handlers and Server Actions are the correct surfaces for streamText.
- GDPR and fra1 region: The fra1 region routes Edge Function execution in Frankfurt. This does not automatically satisfy GDPR Article 28 Data Processing Agreement requirements. A signed DPA with Vercel and with your AI provider (Anthropic, OpenAI) is still required separately.
FAQ
What is the difference between AI SDK 4 and AI SDK 5 streaming?
AI SDK 4 used a custom streaming protocol that required consumers to run the ai package's parsing logic. AI SDK 5 uses native Server-Sent Events (SSE), which any HTTP client can parse without the ai package. The event payloads are AI SDK message parts rather than OpenAI-format chunks, so a consumer still has to interpret the part types itself. Source: Vercel AI SDK 5 release blog.
How does stopWhen prevent infinite agent loops?
stopWhen takes one or more predicate functions that are evaluated after each tool-call step. When any of them returns true, the loop terminates and the final accumulated text is returned. The hard safety ceiling is a step-count condition such as stepCountIs(n), which replaces the earlier maxSteps option (10 to 20 steps is a reasonable range for production agents). If no other condition fires, the agent stops at that step count and returns whatever text was generated. See the AI SDK core reference for streamText for the full signature.
Can I switch models mid-agent-loop using prepareStep?
Yes. prepareStep returns a partial configuration object that overrides the top-level model (and optionally system, messages, toolChoice, and activeTools) for a single step. The Velmoy Internal Benchmark uses this pattern to run Claude Sonnet 4.6 for retrieval steps and Claude Opus 4.7 for the final synthesis step, reducing token costs by approximately 35 percent compared to running Opus 4.7 throughout.
Is Vercel AI SDK 5 compatible with non-Next.js frameworks?
Yes. The ai core package is framework-agnostic. Server-side, the streamText result exposes web-standard streams and Response helpers that any Node.js, Deno, or Bun HTTP server can serve. Framework-specific packages (@ai-sdk/react, @ai-sdk/svelte, @ai-sdk/vue, @ai-sdk/angular) provide the chat and completion bindings for each framework, such as useChat and useCompletion in React. Source: AI SDK documentation.
How do I make AI SDK 5 streaming GDPR-compliant for DACH?
Three required steps: (1) Configure the Anthropic provider with baseURL: "https://api.eu.anthropic.com" (EU Cowork Frankfurt region, GA since 2026-04-15) to ensure tokens do not transit US servers. (2) Deploy your Next.js application to Vercel's fra1 region (Frankfurt). (3) Sign Data Processing Agreements with both Vercel and Anthropic. See Anthropic's GDPR DPA page and Vercel's DPA page.
What is the cost of AI Gateway on Vercel Pro?
AI Gateway is included in Vercel Pro as a beta feature (as of May 2026). On Enterprise it is GA. The per-request fee is $0.002 per 1,000 requests, with token costs passed through at provider rates. For a production agent receiving 10,000 requests per day (about 300,000 per month), the Gateway fee at that rate is approximately $0.60 per month before token costs. Source: Vercel AI Gateway pricing.
Does AI SDK 5 support Anthropic's extended thinking (Opus 4.7)?
Yes. With the @ai-sdk/anthropic provider, select the model with anthropic("claude-opus-4-7") and enable extended thinking through providerOptions.anthropic.thinking in generateText or streamText. As of AI SDK 5.x, thinking output is exposed as typed reasoning parts in the message stream. Source: Anthropic extended thinking docs and AI SDK Anthropic provider docs.
Prompts
For Claude
You are evaluating whether to migrate a production Next.js TypeScript agent from
raw @anthropic-ai/sdk v0.30 to Vercel AI SDK 5.
Agent description: [AGENT DESCRIPTION]
Current orchestration pattern: [MANUAL WHILE LOOP / RECURSIVE CALLS / OTHER]
Tool count: [N]
Monthly invocations: [N]
GDPR requirement: [YES / NO]
Assess the migration effort, estimate LOC reduction using stopWhen and prepareStep,
and flag any breaking changes specific to this agent's pattern.
Output a migration checklist with priority order.
For ChatGPT
Compare Vercel AI SDK 5 streamText with raw OpenAI Responses API streaming
for a production TypeScript agent that needs:
- Tool calling with up to 8 tools
- Self-terminating loop (no infinite iterations)
- Per-step model switching
- Fully typed tool arguments and results
- GDPR-compliant EU data routing
Which approach reduces boilerplate more? Which is more debuggable in production?
For Perplexity
Find production case studies or GitHub repositories published between 2026-01-01
and 2026-05-06 that use Vercel AI SDK 5 stopWhen or prepareStep in production agents.
Prioritize vercel.com/blog, ai-sdk.dev, and GitHub repositories with 50+ stars.
Sources
- Vercel. "AI SDK 5." Vercel Blog. 2025.
- Vercel. "AI SDK by Vercel: Documentation." ai-sdk.dev. Accessed 2026-05-06.
- Vercel. "streamText API Reference." ai-sdk.dev. Accessed 2026-05-06.
- Vercel. "AI Gateway." vercel.com/docs. Accessed 2026-05-06.
- vercel/ai. "GitHub Repository." GitHub. Accessed 2026-05-06.
- Anthropic. "Pricing." anthropic.com. Accessed 2026-05-06.
- Anthropic. "Extended Thinking." docs.anthropic.com. Accessed 2026-05-06.
- Anthropic. "GDPR Data Processing Agreement." anthropic.com. Accessed 2026-05-06.
- Vercel. "Pricing." vercel.com. Accessed 2026-05-06.
- Vercel. "Fluid Compute." vercel.com/docs. Accessed 2026-05-06.
- MDN Web Docs. "Server-sent events." mozilla.org. Accessed 2026-05-06.
- Developers Digest. "Vercel AI SDK: Build Streaming AI Apps." 2026.
- CallSphere. "Vercel AI SDK: Building Streaming AI Interfaces." 2026.
- VoltAgent. "What is Vercel AI SDK?" 2026.
Cite this article
APA
Velichko, M. (2026, May 6). Vercel AI SDK 5: Production Streaming Agents with TypeScript. Pursuit of Happiness, Velmoy AI/Agency. https://velmoy.com/pursuit/ai/vercel-ai-sdk-5-streaming-agents-typescript
MLA
Velichko, Max. "Vercel AI SDK 5: Production Streaming Agents with TypeScript." Pursuit of Happiness, Velmoy AI/Agency, 6 May 2026, velmoy.com/pursuit/ai/vercel-ai-sdk-5-streaming-agents-typescript.
BibTeX
@article{velichko2026_vercel_ai_sdk5,
title = {Vercel AI SDK 5: Production Streaming Agents with TypeScript},
author = {Velichko, Max},
journal = {Pursuit of Happiness},
publisher = {Velmoy AI/Agency},
year = {2026},
month = {5},
day = {6},
url = {https://velmoy.com/pursuit/ai/vercel-ai-sdk-5-streaming-agents-typescript}
}
Ask an AI about this article
Claude: "Read https://velmoy.com/pursuit/ai/vercel-ai-sdk-5-streaming-agents-typescript and generate a migration checklist for moving a 3-agent production system from raw @anthropic-ai/sdk to AI SDK 5, including stopWhen and prepareStep patterns."
ChatGPT: "Based on https://velmoy.com/pursuit/ai/vercel-ai-sdk-5-streaming-agents-typescript, what are the GDPR compliance steps required to deploy a Vercel AI SDK 5 streaming agent for a German Mittelstand company?"
Perplexity: "What does velmoy.com/pursuit say about Vercel AI SDK 5 stopWhen and prepareStep in production TypeScript agents? Include the benchmark data."
Related Articles
- Human-friendly version (German). Strategic take on Vercel AI SDK 5 for DACH development teams.
- Claude Opus 4.7 vs GPT-5.5 vs Gemini 2.5: Enterprise Comparison May 2026. Multi-provider decision guide, includes AI SDK 5 multi-provider setup.
- Prompt Caching 2026: 90% Cost Reduction for DACH Use Cases. Caching patterns that integrate directly with AI SDK 5 Anthropic provider.
About the Author
Max Velichko is the founder of Velmoy AI/Agency, a Berlin-based AI digital agency specializing in high-end websites, AI automations, and LinkedIn outreach systems for the DACH market.
- Affiliation: Velmoy AI/Agency Berlin
- Areas of expertise: TypeScript AI agents, Vercel AI SDK, Anthropic Claude integration, GDPR-compliant AI deployment, production agent architecture, Next.js full-stack development
- Contact: info@velmoy.org
- LinkedIn: linkedin.com/in/max-velichko
- Website: velmoy.com
- First-hand experience: 3 production agents migrated from raw Anthropic SDK to AI SDK 5 (April to May 2026), benchmarked across 30-day production window. Velmoy deploys Next.js + Vercel stack for DACH client projects and runs internal AI automations on this infrastructure.
For corrections, citations, or to commission an AI SDK 5 agent build for your organization, email research@velmoy.com.