Željko Šević | Node.js Developer

Building AI agents with Vercel AI SDK

June 9, 2026

The Vercel AI SDK treats agents as tool-calling loops: the model generates text or invokes tools, the SDK runs those tools, and the loop continues until the model answers or a stop condition fires.

This post builds a support triage agent that looks up customers and invoices, searches an internal knowledge base, and either opens a ticket or escalates to a human. It builds on the LLM integration with Vercel AI SDK post and focuses on multiple tools, stopWhen, and stepCountIs.

For external tools exposed over MCP instead of SDK-native tool() handlers, see the MCP server with Node.js post. For the same triage scenario with the official OpenAI Agents SDK (@openai/agents, run(), maxTurns), see the dedicated post. For the LangChain stack (createAgent, tool(), LangGraph loop), see Building AI agents with LangChain.

Prerequisites

OpenAI account
Generated API key
Enabled billing
Node.js version 26
ai, @ai-sdk/openai, and zod installed (npm i ai @ai-sdk/openai zod)
Client setup from the Vercel AI SDK integration post

Mental model - steps and the tool loop

A step is one model generation. In that step the model either:

returns text (the loop ends), or
returns tool calls (the SDK executes them and starts another step with the results)

Typical flow for the support triage agent: user question → model calls lookup tools (getCustomer, getInvoice, searchKnowledgeBase) → model creates a ticket or escalates → final answer. stopWhen can end the loop at any point in that chain, including before a write tool has run if the step cap is reached first.

stepCountIs(5) means "stop after 5 steps" (five model generations), not five individual tool calls. A single step can include multiple parallel tool calls.

When you pass tools without stopWhen, the SDK defaults to stepCountIs(20) as a safety cap.
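To make the step vs tool call distinction concrete, here is a minimal sketch of the loop in plain JavaScript. It is a simplified stand-in, not the SDK's implementation: runToolLoop, scriptedModel, and demoTools are hypothetical names, and the "model" is a scripted function rather than a real LLM call.

```javascript
// Simplified stand-in for the SDK loop, for illustration only.
// `model` is a hypothetical function: it receives the message history and
// returns either { text } or { toolCalls: [{ toolName, input }] }.
function runToolLoop({ model, tools, maxSteps, prompt }) {
  const messages = [{ role: 'user', content: prompt }];
  const steps = [];

  while (steps.length < maxSteps) {
    const generation = model(messages); // one step = one model generation
    const toolCalls = generation.toolCalls ?? [];

    // Execute every tool call requested in this step.
    const toolResults = toolCalls.map((call) => ({
      toolName: call.toolName,
      output: tools[call.toolName](call.input),
    }));

    steps.push({ text: generation.text ?? '', toolCalls, toolResults });

    if (toolCalls.length === 0) break; // a text-only step ends the loop
    messages.push({ role: 'tool', content: toolResults });
  }

  return { steps, text: steps[steps.length - 1]?.text ?? '' };
}

// Scripted fake model: two parallel lookups first, then a text answer.
const scriptedModel = (messages) =>
  messages.length === 1
    ? {
        toolCalls: [
          { toolName: 'getCustomer', input: { customerId: 'cus_1042' } },
          { toolName: 'getInvoice', input: { invoiceId: 'inv_8891' } },
        ],
      }
    : { text: 'Duplicate charge confirmed; opening a refund ticket.' };

const demoTools = {
  getCustomer: ({ customerId }) => ({ found: true, customerId }),
  getInvoice: ({ invoiceId }) => ({ found: true, invoiceId }),
};

const loopResult = runToolLoop({
  model: scriptedModel,
  tools: demoTools,
  maxSteps: 5,
  prompt: 'Customer cus_1042 was charged twice for invoice inv_8891.',
});

console.log(loopResult.steps.length); // 2 (two steps)
console.log(loopResult.steps[0].toolCalls.length); // 2 (two tool calls in step 1)
```

The run uses two steps but two tool calls in the first step alone, which is why a step cap bounds model generations rather than tool executions.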

Support triage scenario

Example prompt:

Customer cus_1042 says they were charged twice for invoice inv_8891. What should we do?

A realistic chain:

getCustomer - plan tier, open ticket count
getInvoice - amount, status, payment IDs
searchKnowledgeBase - duplicate-charge and refund policy
createSupportTicket or escalateToHuman - write action or sentinel stop

The demo uses in-memory fixtures (customers, invoices, knowledge-base articles) so scripts run without a database.
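The fixtures themselves are not shown in this post, so the following is only a sketch of what they could look like. The names customers, invoices, and createTicket match the tool code below; articles, tickets, every field name, and all values are assumptions made up for illustration.

```javascript
// Hypothetical in-memory fixtures; field names and values are illustrative.
const customers = [
  { id: 'cus_1042', name: 'Acme Ltd', plan: 'pro', openTickets: 1 },
  { id: 'cus_2201', name: 'Birch Studio', plan: 'starter', openTickets: 0 },
];

const invoices = [
  {
    id: 'inv_8891',
    customerId: 'cus_1042',
    amount: 49,
    status: 'paid',
    paymentIds: ['pay_1', 'pay_2'], // two payments = possible duplicate charge
  },
  {
    id: 'inv_9104',
    customerId: 'cus_2201',
    amount: 190,
    status: 'paid',
    paymentIds: ['pay_3', 'pay_4'],
  },
];

const articles = [
  {
    id: 'kb_1',
    title: 'Duplicate charge policy',
    body: 'Refund the second payment when two payments exist for one invoice.',
  },
  {
    id: 'kb_2',
    title: 'Refund approval limits',
    body: 'Refunds above the agent limit require manual review.',
  },
];

const tickets = [];

// Stand-in for the createTicket helper called by the createSupportTicket tool.
function createTicket({ customerId, subject, priority, summary }) {
  const ticket = {
    id: `tkt_${tickets.length + 1}`,
    customerId,
    subject,
    priority,
    summary,
    status: 'open',
  };
  tickets.push(ticket);
  return ticket;
}

console.log(invoices.find((invoice) => invoice.id === 'inv_8891').paymentIds.length); // 2
```

Keeping the data in module-level arrays means every tool reads and writes the same state during a run, which is enough for a demo script.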

Defining multiple tools

import { tool } from 'ai';
import { z } from 'zod';
const getCustomer = tool({
  description: 'Look up a customer account by ID',
  inputSchema: z.object({
    customerId: z.string().describe('Customer ID, e.g. cus_1042'),
  }),
  execute: async ({ customerId }) => {
    const customer = customers.find((item) => item.id === customerId);
    if (!customer) {
      return { found: false, customerId, error: 'Customer not found' };
    }
    return { found: true, customer };
  },
});
const getInvoice = tool({
  description: 'Look up an invoice by ID, including payment IDs and status',
  inputSchema: z.object({
    invoiceId: z.string().describe('Invoice ID, e.g. inv_8891'),
  }),
  execute: async ({ invoiceId }) => {
    const invoice = invoices.find((item) => item.id === invoiceId);
    if (!invoice) {
      return { found: false, invoiceId, error: 'Invoice not found' };
    }
    return { found: true, invoice };
  },
});
const searchKnowledgeBase = tool({
  description: 'Search internal support articles by keyword',
  inputSchema: z.object({
    query: z.string().describe('Search terms, e.g. duplicate charge refund'),
  }),
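  execute: async ({ query }) => {
    // Keyword match against mocked articles.
    // Assumes an `articles` fixture with `title` and `body` fields.
    const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
    const matches = articles.filter((article) => {
      const haystack = `${article.title} ${article.body}`.toLowerCase();
      return terms.some((term) => haystack.includes(term));
    });
    return { query, articles: matches };
  },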
});

Add write tools for outcomes:

const createSupportTicket = tool({
  description: 'Create a support ticket after gathering customer and policy context',
  inputSchema: z.object({
    customerId: z.string(),
    subject: z.string().min(3),
    priority: z.enum(['low', 'medium', 'high']),
    summary: z.string().min(10),
  }),
  execute: async (input) => {
    const ticket = createTicket(input);
    return { created: true, ticket };
  },
});
const escalateToHuman = tool({
  description: 'Escalate when policy requires manual review',
  inputSchema: z.object({
    customerId: z.string(),
    reason: z.string().min(10),
    urgency: z.enum(['normal', 'high']),
  }),
  execute: async (input) => ({
    escalated: true,
    queue: input.urgency === 'high' ? 'billing-urgent' : 'billing-standard',
    ...input,
  }),
});

Return structured objects from execute. The SDK serializes them as tool results for the next step. Return explicit errors (for example { found: false, error: '...' }) rather than throwing, so the model sees the failure and can recover.
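For tools that call real services, failures often arrive as exceptions rather than "not found" results. One way to keep the structured-error habit is a small wrapper around execute. This is a sketch only: withStructuredErrors and flakyLookup are hypothetical names, and the { ok, error, input } shape is an assumption, not an SDK convention.

```javascript
// Hypothetical wrapper: converts thrown errors into a structured result
// so the model receives the failure as ordinary tool output.
const withStructuredErrors = (execute) => async (input) => {
  try {
    return await execute(input);
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    return { ok: false, error: message, input };
  }
};

// Stand-in for a lookup that can fail, for example a billing service call.
const flakyLookup = async ({ invoiceId }) => {
  if (invoiceId !== 'inv_8891') throw new Error('Billing service unavailable');
  return { ok: true, invoiceId };
};

const safeLookup = withStructuredErrors(flakyLookup);

safeLookup({ invoiceId: 'inv_0000' }).then((result) => console.log(result));
// logs { ok: false, error: 'Billing service unavailable', input: { invoiceId: 'inv_0000' } }
```

The wrapped function could then be passed as a tool's execute. Errors that should abort the whole run are better left to throw.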

Multi-step triage with `generateText`

Pass all tools and a system prompt with triage rules:

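import { generateText, stepCountIs } from 'ai';
import { createOpenAI } from '@ai-sdk/openai';
// Provider from the client setup in the integration post
const openai = createOpenAI({ apiKey: process.env.OPENAI_API_KEY });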
const { text, steps } = await generateText({
  model: openai('gpt-5.5'),
  system: `You are a billing support triage agent.
- Look up customer and invoice before recommending refunds.
- Search the knowledge base for policy guidance.
- Create a ticket when you can resolve within policy.
- Call escalateToHuman when manual review is required.`,
  tools: {
    getCustomer,
    getInvoice,
    searchKnowledgeBase,
    createSupportTicket,
    escalateToHuman,
  },
  stopWhen: stepCountIs(8),
  prompt:
    'Customer cus_1042 says they were charged twice for invoice inv_8891. What should we do?',
});
console.log('steps:', steps.length);
console.log(text);

Use a model that supports tool calling (same requirement as web search in the Vercel AI SDK post).

`stopWhen` - when the loop stops

stopWhen defines stopping conditions for the tool loop. Conditions are evaluated only when the last step contains tool results.

A single condition stops when that condition returns true
An array stops when any condition returns true (OR logic)
Without stopWhen, the SDK applies stepCountIs(20)

The loop also ends naturally when the model returns text without further tool calls.
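A stop condition is essentially a predicate over the steps so far. The sketch below assumes the SDK calls each condition with an object containing steps and that each step exposes usage.totalTokens. tokenBudgetExceeded, stepLimit, and anyConditionMet are hypothetical local helpers, and the last one only mirrors the OR semantics described above.

```javascript
// Hypothetical custom condition: stop once the run has used too many tokens.
// Assumes conditions receive `{ steps }` and steps expose `usage.totalTokens`.
const tokenBudgetExceeded = (maxTokens) => ({ steps }) => {
  const used = steps.reduce(
    (sum, step) => sum + (step.usage?.totalTokens ?? 0),
    0,
  );
  return used >= maxTokens;
};

// Simplified local stand-in for stepCountIs, used only in this demo.
const stepLimit = (n) => ({ steps }) => steps.length >= n;

// Mirrors the OR logic of an array of conditions: any true result stops the loop.
const anyConditionMet = (conditions, context) =>
  conditions.some((condition) => condition(context));

const fakeSteps = [
  { usage: { totalTokens: 900 } },
  { usage: { totalTokens: 1400 } },
];

console.log(
  anyConditionMet([stepLimit(8), tokenBudgetExceeded(2000)], { steps: fakeSteps }),
); // true (2300 tokens used, budget 2000)
console.log(
  anyConditionMet([stepLimit(8), tokenBudgetExceeded(5000)], { steps: fakeSteps }),
); // false (neither condition met)
```

If the installed SDK version accepts custom condition functions, a predicate like tokenBudgetExceeded could sit next to stepCountIs in the stopWhen array.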

`stepCountIs` - cap the number of steps

stepCountIs(n) stops once steps.length reaches n. Use it on every production agent to prevent runaway loops and unbounded API cost.

| Use case | Suggested cap |
| --- | --- |
| Single tool, then answer | 2 (tool step + text step) |
| Chat with occasional tool use | 3-5 |
| Task agents (triage, research) | 8-15 |
| Long autonomous workflows | 15-20 (with monitoring) |

Tight vs relaxed cap on the same prompt:

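import { generateText, stepCountIs } from 'ai';
// supportTools is assumed to collect the five tools defined earlier:
// { getCustomer, getInvoice, searchKnowledgeBase, createSupportTicket, escalateToHuman }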
// Stops after 3 steps even if the model still wants more context
const capped = await generateText({
  model: openai('gpt-5.5'),
  tools: supportTools,
  stopWhen: stepCountIs(3),
  prompt: '...',
});
// Allows a fuller investigation chain
const relaxed = await generateText({
  model: openai('gpt-5.5'),
  tools: supportTools,
  stopWhen: stepCountIs(8),
  prompt: '...',
});
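With a tight cap it helps to know whether the run finished on its own or was cut off. A rough heuristic, assuming each step exposes a finishReason that is 'tool-calls' when the model asked for tools (hitStepCap is a hypothetical helper and the step objects are fake data):

```javascript
// Heuristic: the run was probably cut off if it used every allowed step and
// the last step still ended with tool calls instead of a text answer.
function hitStepCap(steps, cap) {
  const lastStep = steps[steps.length - 1];
  return steps.length >= cap && lastStep?.finishReason === 'tool-calls';
}

// Fake step data for illustration.
const cutOffSteps = [
  { finishReason: 'tool-calls' },
  { finishReason: 'tool-calls' },
  { finishReason: 'tool-calls' },
];
const finishedSteps = [{ finishReason: 'tool-calls' }, { finishReason: 'stop' }];

console.log(hitStepCap(cutOffSteps, 3)); // true
console.log(hitStepCap(finishedSteps, 3)); // false
```

Logging this flag per request is a cheap way to see whether a cap such as 3 is too tight for real traffic.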

Combining `hasToolCall` with `stepCountIs`

hasToolCall('toolName') stops when the model invokes a specific tool in the latest step. Pair it with stepCountIs for a hard cap plus a sentinel tool:

import { generateText, stepCountIs, hasToolCall } from 'ai';
// TRIAGE_INSTRUCTIONS is assumed to hold the system prompt from the generateText example
const { text, steps } = await generateText({
  model: openai('gpt-5.5'),
  system: TRIAGE_INSTRUCTIONS,
  tools: supportTools,
  stopWhen: [stepCountIs(10), hasToolCall('escalateToHuman')],
  prompt:
    'Customer cus_2201 on the starter plan reports a duplicate $190 charge on invoice inv_9104.',
});

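escalateToHuman works well as a sentinel: the loop stops right after the step in which the model calls it, without waiting for a final text-only step. The tool's execute still runs in that step, and text can be empty, so read the outcome from the last step's tool results.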
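Because a sentinel stop skips the closing text step, the escalation details live in the steps array rather than in text. A minimal sketch of pulling them out, assuming each step exposes toolResults entries with toolName and output fields (findLastToolOutput is a hypothetical helper and the steps below are fake data):

```javascript
// Walk the steps backwards and return the output of the most recent
// call to the named tool, or undefined if it was never called.
function findLastToolOutput(steps, toolName) {
  for (let i = steps.length - 1; i >= 0; i -= 1) {
    const match = (steps[i].toolResults ?? []).find(
      (result) => result.toolName === toolName,
    );
    if (match) return match.output;
  }
  return undefined;
}

// Fake steps shaped like a run that ended on the sentinel tool.
const sentinelSteps = [
  { toolResults: [{ toolName: 'getCustomer', output: { found: true } }] },
  {
    toolResults: [
      {
        toolName: 'escalateToHuman',
        output: { escalated: true, queue: 'billing-urgent' },
      },
    ],
  },
];

const escalation = findLastToolOutput(sentinelSteps, 'escalateToHuman');
console.log(escalation?.queue); // billing-urgent
```

A caller can branch on whether escalation is defined to decide between showing the model's text and handing off to the human queue.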

Inspecting `steps` and usage

The steps array on the result contains per-step tool calls, tool results, finish reason, and usage. Use it for debugging and cost tracking:

const { text, steps, totalUsage } = await generateText({
  model: openai('gpt-5.5'),
  tools: supportTools,
  stopWhen: stepCountIs(8),
  prompt: '...',
});
for (const [index, step] of steps.entries()) {
  console.log(`step ${index + 1}`);
  console.log('  toolCalls:', step.toolCalls?.map((c) => c.toolName));
  console.log('  usage:', step.usage);
}
console.log('totalUsage:', totalUsage);

With streamText, pass onStepFinish to log each step as it completes.
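For cost tracking, the per-step usage can be folded into a single summary. The sketch below assumes each step exposes usage.inputTokens and usage.outputTokens. summarizeUsage is a hypothetical helper, and the prices are placeholders, not real rates for any model.

```javascript
// Placeholder prices in USD per million tokens; replace with real rates.
const PRICE_PER_MILLION = { input: 1.0, output: 4.0 };

// Sum token usage across steps and estimate the cost of the run.
function summarizeUsage(steps, prices = PRICE_PER_MILLION) {
  const totals = steps.reduce(
    (acc, step) => ({
      inputTokens: acc.inputTokens + (step.usage?.inputTokens ?? 0),
      outputTokens: acc.outputTokens + (step.usage?.outputTokens ?? 0),
    }),
    { inputTokens: 0, outputTokens: 0 },
  );
  const estimatedCostUsd =
    (totals.inputTokens * prices.input + totals.outputTokens * prices.output) /
    1_000_000;
  return { ...totals, steps: steps.length, estimatedCostUsd };
}

// Fake usage numbers for a two-step run.
const usageSummary = summarizeUsage([
  { usage: { inputTokens: 1000, outputTokens: 200 } },
  { usage: { inputTokens: 1500, outputTokens: 300 } },
]);

console.log(usageSummary.inputTokens, usageSummary.outputTokens); // 2500 500
```

Input tokens usually grow with every step because earlier tool results are sent back to the model, so long chains cost more than the step count alone suggests.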

`ToolLoopAgent` - reusable agent definition

ToolLoopAgent wraps the same loop for reuse across scripts and API routes. It accepts the same loop settings as generateText (model, tools, stopWhen) and takes the system prompt as instructions.

import { ToolLoopAgent, stepCountIs } from 'ai';
const supportTriageAgent = new ToolLoopAgent({
  model: openai('gpt-5.5'),
  instructions: TRIAGE_INSTRUCTIONS,
  tools: supportTools,
  stopWhen: stepCountIs(8),
});
const result = await supportTriageAgent.generate({
  prompt:
    'Customer cus_1042 says they were charged twice for invoice inv_8891. What should we do?',
  onStepFinish: async ({ stepNumber, usage, toolCalls }) => {
    console.log(`step ${stepNumber + 1}`, {
      tokens: usage.totalTokens,
      tools: toolCalls?.map((call) => call.toolName),
    });
  },
});
console.log(result.text);

Use .stream() for streaming. For Next.js chat UIs, see createAgentUIStreamResponse in the AI SDK agents docs.

Streaming with tools

streamText supports the same tools and stopWhen settings:

import { streamText, stepCountIs } from 'ai';
const result = streamText({
  model: openai('gpt-5.5'),
  system: TRIAGE_INSTRUCTIONS,
  tools: supportTools,
  stopWhen: stepCountIs(8),
  prompt: 'Customer cus_1042 says they were charged twice for invoice inv_8891.',
  onStepFinish: async ({ stepNumber, toolCalls }) => {
    console.error(`step ${stepNumber + 1}:`, toolCalls?.map((c) => c.toolName));
  },
});
for await (const part of result.textStream) {
  process.stdout.write(part);
}

Text streams incrementally. Tool calls run between text segments as the loop progresses.

Production notes

Always set stopWhen - do not rely on the default stepCountIs(20) in production without monitoring
Cost - each step is another model call; log steps or onStepFinish usage
Tool errors - return structured errors from execute instead of throwing when the model should retry or escalate
Instructions - keep policy rules in system / instructions, not only in the user prompt
Same patterns elsewhere - PR review (listPRs → getChecks → submitReview) or job-fit scoring use the same loop mechanics with different tools

LLM integration with Vercel AI SDK

June 7, 2026

Large language models (LLMs) understand and generate text from prompts. The Vercel AI SDK is a provider-agnostic layer over LLM APIs - core functions are generateText, streamText, and embed. This post uses the OpenAI provider and mirrors the patterns from the OpenAI Responses API post.

For the lower-level openai npm package, see the Chat Completions API and Responses API posts. For multi-tool agents with stopWhen and stepCountIs, see the Building AI agents with Vercel AI SDK post. For the same triage scenario with the OpenAI Agents SDK, see the dedicated post.

Prerequisites

OpenAI account
Generated API key
Enabled billing
Node.js version 26
ai, @ai-sdk/openai, and zod installed (npm i ai @ai-sdk/openai zod)
For Markdown output: marked, dompurify, and jsdom (npm i marked dompurify jsdom)

Client setup

Create an OpenAI provider with your API key (read from the environment in production).

import { createOpenAI } from '@ai-sdk/openai';
const openai = createOpenAI({ apiKey: process.env.OPENAI_API_KEY });

For OpenRouter, use the dedicated @openrouter/ai-sdk-provider package - see the OpenRouter integration post. The createOpenAI provider can also target other hosts that implement a compatible API by setting baseURL and apiKey:

const openai = createOpenAI({
  apiKey: process.env.LLM_API_KEY,
  baseURL: 'https://your-gateway.example/v1',
});

Many third-party gateways support Chat Completions only. The examples below use openai(model) (Responses API path). If your provider does not support it, switch to openai.chat(model) and skip the web search example.

Basic integration

Pass a string as prompt and read text from the result.

import { generateText } from 'ai';
import { createOpenAI } from '@ai-sdk/openai';
const openai = createOpenAI({ apiKey: process.env.OPENAI_API_KEY });
const { text } = await generateText({
  model: openai('gpt-5.5'),
  prompt: 'Write a one-sentence bedtime story about a unicorn.',
});
console.log(text);

System prompt

Use the system parameter for stable behavior (tone, format, role). It takes precedence over casual wording in the user message.

const { text } = await generateText({
  model: openai('gpt-5.5'),
  system: 'Reply in one short sentence. Use plain language.',
  prompt: 'Explain what an LLM is.',
});
console.log(text);

Few-shot prompting

Pass prior turns in a messages array with user and assistant roles, then the new user message. Keep task rules in system.

const { text } = await generateText({
  model: openai('gpt-5.5'),
  system:
    'Classify sentiment as exactly one word: positive, negative, or neutral.',
  messages: [
    { role: 'user', content: 'I love this!' },
    { role: 'assistant', content: 'positive' },
    { role: 'user', content: 'This is awful.' },
    { role: 'assistant', content: 'negative' },
    { role: 'user', content: 'It is fine I guess.' },
  ],
});
console.log(text);
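When the examples come from a dataset rather than being typed inline, the messages array can be generated. A small sketch (buildFewShotMessages is a hypothetical helper, and the { user, assistant } example shape is an assumption):

```javascript
// Turn labeled examples into alternating user/assistant turns,
// then append the new input as the final user message.
function buildFewShotMessages(examples, input) {
  const turns = examples.flatMap(({ user, assistant }) => [
    { role: 'user', content: user },
    { role: 'assistant', content: assistant },
  ]);
  return [...turns, { role: 'user', content: input }];
}

const fewShotMessages = buildFewShotMessages(
  [
    { user: 'I love this!', assistant: 'positive' },
    { user: 'This is awful.', assistant: 'negative' },
  ],
  'It is fine I guess.',
);

console.log(fewShotMessages.length); // 5
```

The result has the same shape as the messages array in the example above and can be passed in its place.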

Streaming

Use streamText and iterate over textStream for incremental text.

import { streamText } from 'ai';
const result = streamText({
  model: openai('gpt-5.5'),
  prompt: 'List three colors.',
});
for await (const part of result.textStream) {
  process.stdout.write(part);
}

Structured output with JSON schema

Constrain the model to JSON matching your schema via Output.object() and a Zod schema. The SDK validates the result.

import { generateText, Output } from 'ai';
import { z } from 'zod';
const { output } = await generateText({
  model: openai('gpt-5.5'),
  prompt: 'The film Inception was directed by Christopher Nolan.',
  output: Output.object({
    schema: z.object({
      title: z.string(),
      director: z.string(),
    }),
    schemaName: 'movie_summary',
  }),
});
console.log(output.title, output.director);

Markdown output to HTML

Ask for Markdown in system, then convert text to HTML and sanitize before rendering (for example with innerHTML in the browser or when storing HTML).

import { marked } from 'marked';
import { JSDOM } from 'jsdom';
import DOMPurify from 'dompurify';
const purify = DOMPurify(new JSDOM('').window);
const { text } = await generateText({
  model: openai('gpt-5.5'),
  system: 'Reply in Markdown only. Use a heading and a short bullet list.',
  prompt: 'Explain what an LLM is in three bullet points.',
});
const markdown = text;
const html = marked.parse(markdown);
const safeHtml = purify.sanitize(html);

Always run DOMPurify.sanitize on model-generated HTML. The model can emit unsafe markup. Sanitization strips scripts and other dangerous content.

Web search tool

Enable the built-in web search tool when the answer should use current information from the web.

const result = await generateText({
  model: openai('gpt-5.5'),
  tools: { web_search: openai.tools.webSearch() },
  prompt: 'What was a major tech headline this week? Cite sources briefly.',
});
console.log(result.text);

Web search adds latency and tool usage cost. Use a model that supports tools.

Embeddings

Embeddings are numeric vectors that represent the semantic meaning of text. Use them for semantic search, clustering, and RAG.

Pass a single string to embed and read the vector from embedding.

import { embed } from 'ai';
const { embedding } = await embed({
  model: openai.embedding('text-embedding-3-small'),
  value: 'How do I connect pgvector to PostgreSQL?',
});
console.log(embedding.length);

Pass multiple strings in a values array with embedMany. Results are in the same order as the input.

import { embedMany } from 'ai';
const chunks = [
  'pgvector adds vector similarity search to PostgreSQL.',
  'LangChain helps split long documents into retrieval-friendly chunks.',
  'RAG retrieves context first, then asks an LLM to answer.',
];
const { embeddings } = await embedMany({
  model: openai.embedding('text-embedding-3-small'),
  values: chunks,
});
console.log(embeddings.length); // 3
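Semantic search then comes down to comparing the query vector with each stored vector. The sketch below hand-rolls cosine similarity and ranks items by score. The 3-dimensional vectors are toy values chosen for illustration (real embeddings have hundreds or thousands of dimensions), and rankBySimilarity is a hypothetical helper.

```javascript
// Cosine similarity: dot product divided by the product of vector lengths.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) throw new Error('Vectors must have the same length');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i += 1) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denominator = Math.sqrt(normA) * Math.sqrt(normB);
  return denominator === 0 ? 0 : dot / denominator;
}

// Score every item against the query and sort from most to least similar.
function rankBySimilarity(queryEmbedding, items) {
  return items
    .map((item) => ({
      text: item.text,
      score: cosineSimilarity(queryEmbedding, item.embedding),
    }))
    .sort((x, y) => y.score - x.score);
}

// Toy vectors standing in for real embeddings.
const ranked = rankBySimilarity(
  [1, 0, 0],
  [
    { text: 'RAG retrieves context first.', embedding: [0, 1, 0] },
    { text: 'pgvector adds vector search.', embedding: [0.9, 0.1, 0] },
  ],
);

console.log(ranked[0].text); // pgvector adds vector search.
```

In a database-backed setup this comparison is normally pushed down to the vector index instead of being run in application code.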

For a full RAG flow with pgvector, see the RAG with OpenAI embeddings post. For LangChain basics (LCEL, Runnables, when to use LangChain), see the LangChain overview post.

2023

Deploying Next.js apps to Vercel

February 3, 2023

This post covers the main notes for deploying to Vercel and domain setup. Prerequisites Next.js app bootstrapped Deployment Add a new…
