
Conversation memory for LangChain agents

June 18, 2026

This post extends the support triage agent from Building AI agents with LangChain into a multi-turn flow: turn 1 looks up the customer and invoice; turn 2 creates the ticket without the user repeating IDs. It is post #5 in the LangChain series, following the overview, loaders/chunking, RAG, and agents posts.

Prerequisites

  • OpenAI account
  • Generated API key
  • Enabled billing
  • Node.js version 26
  • Packages from the agents post, plus the checkpoint package:
npm i langchain @langchain/openai @langchain/core @langchain/langgraph-checkpoint zod
  • OPENAI_API_KEY set in the environment

Mental model

Three related concepts:

  • Checkpointer - short-term session memory. Saves messages and graph state after each step so the next invoke on the same thread can resume.
  • thread_id - conversation key passed in configurable. Same ID = same history; different ID = isolated session.
  • Store - long-term memory across threads (user preferences, facts learned over time). LangGraph stores are separate from checkpointers; this post focuses on checkpointers only.
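
To make the relationship between a checkpointer and thread_id concrete, here is a toy in-memory sketch. It is not LangGraph's implementation; ToyCheckpointer and its load/save methods are hypothetical names used only to show that history is keyed by thread_id.

```javascript
// Toy model of a checkpointer: history is stored per thread_id.
// ToyCheckpointer is a hypothetical illustration, not a LangGraph class.
class ToyCheckpointer {
  constructor() {
    this.threads = new Map(); // thread_id -> array of saved messages
  }

  // Return a copy of the saved history for a thread (empty for a new thread).
  load(threadId) {
    return [...(this.threads.get(threadId) ?? [])];
  }

  // Append new messages to the thread's history and return the full history.
  save(threadId, newMessages) {
    const history = [...this.load(threadId), ...newMessages];
    this.threads.set(threadId, history);
    return history;
  }
}

const toyCheckpointer = new ToyCheckpointer();
toyCheckpointer.save('support-cus-1042', [
  { role: 'user', content: 'Look up cus_1042 and inv_8891.' },
  { role: 'assistant', content: 'Found the customer and the invoice.' },
]);

// Same thread_id: the next turn sees the earlier messages.
const sameThread = toyCheckpointer.save('support-cus-1042', [
  { role: 'user', content: 'Create the ticket we discussed.' },
]);
console.log(sameThread.length); // 3

// Different thread_id: the history starts empty.
console.log(toyCheckpointer.load('support-cus-2201').length); // 0
```

A real checkpointer also saves graph state after each step, but the lookup-by-key idea is the same.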

Typical support flow with memory:

  1. Turn 1 - rep asks to look up cus_1042 and inv_8891; agent calls lookup tools and summarizes findings.
  2. Turn 2 - rep says "create the ticket we discussed"; agent recalls prior tool results and calls create_support_ticket.

MemorySaver

For demos and tests, use MemorySaver - an in-memory checkpointer that persists state for the lifetime of the process:

import { MemorySaver } from '@langchain/langgraph-checkpoint';
const checkpointer = new MemorySaver();

State is lost when the Node process exits. That is fine for local scripts; production apps need a durable backend (see below).

Attach a checkpointer to createAgent

Pass the checkpointer when creating the agent. Reuse the same triage tools and instructions from the agents post:

import { createAgent } from 'langchain';
import { MemorySaver } from '@langchain/langgraph-checkpoint';
const agent = createAgent({
model: 'gpt-5.5',
tools: supportTools,
systemPrompt: TRIAGE_INSTRUCTIONS,
checkpointer: new MemorySaver(),
});

The agent loop is unchanged - the checkpointer hooks into LangGraph beneath createAgent.

First turn - lookup

Pass a stable thread_id in the invoke config:

const threadConfig = { configurable: { thread_id: 'support-cus-1042' } };
const turn1 = await agent.invoke(
{
messages: [
{
role: 'user',
content:
'Look up customer cus_1042 and invoice inv_8891 for a possible duplicate charge. Summarize what you find. Do not create a ticket yet.',
},
],
},
threadConfig,
);
console.log(turn1.messages.at(-1)?.content);

The agent calls get_customer, get_invoice, and search_knowledge_base. LangGraph saves the full message history (including tool results) to the checkpointer.

Second turn - follow-up without IDs

Send only the new user message on the same thread_id. Prior context is restored automatically:

const turn2 = await agent.invoke(
{
messages: [
{
role: 'user',
content: 'Create the support ticket we discussed.',
},
],
},
threadConfig,
);
console.log(turn2.messages.at(-1)?.content);

The agent should call create_support_ticket using customer and invoice details from turn 1 - the user does not repeat cus_1042 or inv_8891.

Read the final answer from turn2.messages, as in the agents post:

const lastAi = [...turn2.messages]
.reverse()
.find((message) => message.type === 'ai');
console.log(lastAi?.content);

Thread isolation

Different thread_id values do not share history. Two support reps working different cases should use separate thread IDs:

await agent.invoke(
{ messages: [{ role: 'user', content: 'Look up cus_1042.' }] },
{ configurable: { thread_id: 'rep-alice-case-1' } },
);
await agent.invoke(
{ messages: [{ role: 'user', content: 'Create the ticket we discussed.' }] },
{ configurable: { thread_id: 'rep-bob-case-2' } },
);

The second invoke on rep-bob-case-2 has no knowledge of Alice's lookup - Bob's thread starts empty.
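
One way to keep thread IDs stable and distinct is to derive them from the rep and case identifiers. The sketch below assumes a hypothetical helper named threadConfigFor; only the { configurable: { thread_id } } shape comes from the examples above.

```javascript
// Hypothetical helper: builds the invoke config for one rep working one case.
// Only the { configurable: { thread_id } } shape is taken from the post.
function threadConfigFor(rep, caseId) {
  if (!rep || !caseId) {
    throw new Error('rep and caseId are both required for a stable thread_id');
  }
  // Normalize so "Alice" and " alice" map to the same thread.
  const slug = (value) =>
    String(value).trim().toLowerCase().replace(/[^a-z0-9]+/g, '-');
  return {
    configurable: { thread_id: `rep-${slug(rep)}-case-${slug(caseId)}` },
  };
}

const aliceConfig = threadConfigFor('Alice', 1);
const bobConfig = threadConfigFor('Bob', 2);

console.log(aliceConfig.configurable.thread_id); // rep-alice-case-1
console.log(bobConfig.configurable.thread_id); // rep-bob-case-2
```

Building the ID in one place avoids two call sites accidentally sharing or splitting a conversation because of a typo.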

Production checkpointers

MemorySaver is process-local and not suitable for production. LangGraph provides durable checkpointers in separate integration packages, such as @langchain/langgraph-checkpoint-postgres and @langchain/langgraph-checkpoint-sqlite. Swap the checkpointer implementation; the thread_id API stays the same.

Pick a backend that matches your deployment: Postgres for multi-instance apps, SQLite for single-node services.

Demo

See the langchain-agent-memory-nodejs-demo folder for multi-turn triage and thread-isolation scripts.

Building AI agents with LangChain

June 17, 2026

LangChain agents are built on LangGraph: the model calls tools in a loop until it returns a final answer. The high-level entry point is createAgent - pass a model, tools defined with tool(), and an optional systemPrompt.

This post builds the same support triage agent as the Vercel AI SDK agents and OpenAI Agents SDK posts so you can compare SDKs on one scenario. It follows the LangChain overview for Node.js and fits as post #4 in the LangChain series (after loaders/chunking and the RAG with pgvector pipeline).

Prerequisites

  • OpenAI account
  • Generated API key
  • Enabled billing
  • Node.js version 26
  • langchain, @langchain/openai, @langchain/core, and zod installed:
npm i langchain @langchain/openai @langchain/core zod
  • OPENAI_API_KEY set in the environment

Mental model - turns and the agent loop

A turn is one model generation. In that turn the model either:

  • returns final text (the run ends), or
  • returns tool calls (LangChain executes them and starts another turn with the results)

Typical flow for the support triage agent: user question → model calls lookup tools (get_customer, get_invoice, search_knowledge_base) → model creates a ticket or escalates → final answer.

A single turn can include multiple parallel tool calls. Set recursionLimit on invoke or stream to cap how many graph steps run (each model generation and tool batch counts toward the limit).
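
As a rough sizing aid, the step count for a run can be estimated from the number of tool rounds. This sketch assumes exactly one graph step per model generation and one per tool batch, as described above; graphs with middleware may add steps, so treat the result as a lower bound. estimateGraphSteps is a hypothetical helper.

```javascript
// Estimate graph steps for a run with `toolRounds` rounds of tool calls.
// Assumption: each round is one model step plus one tool-batch step,
// followed by a final model step that returns the answer.
function estimateGraphSteps(toolRounds) {
  if (!Number.isInteger(toolRounds) || toolRounds < 0) {
    throw new RangeError('toolRounds must be a non-negative integer');
  }
  return toolRounds * 2 + 1;
}

// Triage flow run sequentially: get_customer, get_invoice,
// search_knowledge_base, then create_support_ticket = 4 tool rounds.
const sequentialSteps = estimateGraphSteps(4);
console.log(sequentialSteps); // 9

// If the three lookups run in parallel in one turn, only 2 tool rounds remain.
const parallelSteps = estimateGraphSteps(2);
console.log(parallelSteps); // 5

// Both estimates fit under a recursionLimit of 15 with some headroom.
console.log(sequentialSteps <= 15 && parallelSteps <= 15); // true
```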

Defining tools

Use tool() from langchain with a Zod schema, plus name and description so the model knows when to call each tool:

import { tool } from 'langchain';
import { z } from 'zod';
const getInvoice = tool(
async ({ invoiceId }) => {
const invoice = invoices.find((item) => item.id === invoiceId);
if (!invoice) {
return { found: false, invoiceId, error: 'Invoice not found' };
}
return { found: true, invoice };
},
{
name: 'get_invoice',
description: 'Look up an invoice by ID, including payment IDs and status',
schema: z.object({
invoiceId: z.string().describe('Invoice ID, e.g. inv_8891'),
}),
},
);

LangChain uses schema (not Vercel's inputSchema or OpenAI Agents' parameters). The handler receives validated input as the first argument.

createAgent

Wire the model, tools, and triage instructions:

import { createAgent } from 'langchain';
const agent = createAgent({
model: 'gpt-5.5',
tools: [getInvoice],
systemPrompt: `You are a billing support triage agent.
Look up records before recommending refunds or creating tickets.`,
});

model can be a provider string ('gpt-5.5', 'openai:gpt-5.5') or a chat model instance from @langchain/openai.

Invoke

Pass a messages array and read the final answer from result.messages:

const result = await agent.invoke({
messages: [
{
role: 'user',
content: 'What is the status of invoice inv_8891? Reply in one sentence.',
},
],
});
const lastAi = [...result.messages]
.reverse()
.find((message) => message.type === 'ai');
console.log(lastAi?.content);

The last AI message is the agent's final reply after any tool calls complete.
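
Message content is not always a plain string; it can also be an array of content blocks. The helper below is a minimal sketch that handles both shapes, using plain objects as stand-ins for LangChain message instances. lastAiText is a hypothetical name, and the { type: 'text', text } block shape is an assumption about the common text block format.

```javascript
// Return the text of the last AI message, or undefined if there is none.
// Messages here are plain objects standing in for LangChain messages.
function lastAiText(messages) {
  const lastAi = [...messages]
    .reverse()
    .find((message) => message.type === 'ai');
  if (!lastAi) return undefined;
  if (typeof lastAi.content === 'string') return lastAi.content;
  if (!Array.isArray(lastAi.content)) return undefined;
  // Assumed block shape: { type: 'text', text: '...' }; other blocks are skipped.
  return lastAi.content
    .filter((block) => block.type === 'text')
    .map((block) => block.text)
    .join('');
}

const sampleMessages = [
  { type: 'human', content: 'What is the status of invoice inv_8891?' },
  { type: 'ai', content: '' }, // tool-call message with empty text
  { type: 'tool', content: '{"found":true}' },
  { type: 'ai', content: [{ type: 'text', text: 'Invoice inv_8891 is paid.' }] },
];

console.log(lastAiText(sampleMessages)); // Invoice inv_8891 is paid.
console.log(lastAiText([{ type: 'human', content: 'Hi' }])); // undefined
```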

Support triage scenario

Example prompt:

Customer cus_1042 says they were charged twice for invoice inv_8891. What should we do?

A realistic chain:

  1. get_customer - plan tier, open ticket count
  2. get_invoice - amount, status, payment IDs
  3. search_knowledge_base - duplicate-charge and refund policy
  4. create_support_ticket or escalate_to_human - write action or escalation

The demo uses in-memory fixtures (customers, invoices, knowledge-base articles) so scripts run without a database.

Multi-tool agent

Register all triage tools on one agent:

import { createAgent } from 'langchain';
import {
getCustomer,
getInvoice,
searchKnowledgeBase,
createSupportTicket,
escalateToHuman,
TRIAGE_INSTRUCTIONS,
} from './tools/index.js';
const agent = createAgent({
model: 'gpt-5.5',
tools: [
getCustomer,
getInvoice,
searchKnowledgeBase,
createSupportTicket,
escalateToHuman,
],
systemPrompt: TRIAGE_INSTRUCTIONS,
});
const result = await agent.invoke(
{
messages: [
{
role: 'user',
content:
'Customer cus_1042 says they were charged twice for invoice inv_8891. What should we do?',
},
],
},
// recursionLimit is a run config option, so it belongs in the second argument
{ recursionLimit: 15 },
);
const answer = [...result.messages]
.reverse()
.find((message) => message.type === 'ai');
console.log(answer?.content);

Inspect result.messages for the full trace: human input, AI tool-call messages, tool results, and the final AI reply.
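
A quick way to read that trace is to print one line per message. The sketch below uses plain objects in place of real messages and assumes AI tool-call messages carry a tool_calls array whose entries have a name field; summarizeTrace is a hypothetical helper.

```javascript
// Build a one-line-per-message summary of an agent trace.
// Assumption: AI messages that request tools expose `tool_calls: [{ name }]`.
function summarizeTrace(messages) {
  return messages.map((message, index) => {
    const toolNames = (message.tool_calls ?? []).map((call) => call.name);
    const detail =
      toolNames.length > 0
        ? `calls ${toolNames.join(', ')}`
        : String(message.content).slice(0, 40);
    return `${index + 1}. ${message.type}: ${detail}`;
  });
}

// Plain objects standing in for result.messages.
const trace = [
  { type: 'human', content: 'Customer cus_1042 was charged twice.' },
  {
    type: 'ai',
    content: '',
    tool_calls: [{ name: 'get_customer' }, { name: 'get_invoice' }],
  },
  { type: 'tool', content: '{"found":true}' },
  { type: 'tool', content: '{"found":true}' },
  { type: 'ai', content: 'Both records exist; opening a ticket.' },
];

for (const line of summarizeTrace(trace)) {
  console.log(line);
}
// 1. human: Customer cus_1042 was charged twice.
// 2. ai: calls get_customer, get_invoice
// 3. tool: {"found":true}
// 4. tool: {"found":true}
// 5. ai: Both records exist; opening a ticket.
```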

Streaming

agent.stream() yields state updates as the graph runs. Use streamMode: 'values' to receive the full message list after each step:

const stream = await agent.stream(
{
messages: [
{
role: 'user',
content:
'Customer cus_1042 says they were charged twice for invoice inv_8891. What should we do?',
},
],
},
{ streamMode: 'values', recursionLimit: 15 },
);
let finalMessages = [];
for await (const state of stream) {
if (state.messages) {
finalMessages = state.messages;
}
}
const answer = [...finalMessages]
.reverse()
.find((message) => message.type === 'ai');
console.log(answer?.content);

For token-level streaming, use streamMode: 'messages' or streamEvents (see LangGraph streaming).

When to pick LangChain

| | LangChain createAgent | Vercel AI SDK | OpenAI Agents SDK |
| --- | --- | --- | --- |
| Best for | RAG + LCEL + agents in one stack | TypeScript apps already on AI SDK | OpenAI-first agent primitives |
| Tool definition | tool() + Zod schema | tool() + inputSchema | tool() + Zod parameters |
| Run API | agent.invoke / agent.stream | generateText + stopWhen | run() + maxTurns |
| Handoffs / guardrails | Middleware (advanced) | Limited | Built-in |
| Memory | LangGraph checkpointers | Bring your own | Session helpers |

Pick LangChain when document loaders, retrievers, and agents should share one ecosystem. Pick Vercel AI SDK or OpenAI Agents SDK when you want a focused agent layer without the broader LangChain surface.

Demo

See the langchain-agents-nodejs-demo folder for runnable scripts: single-tool lookup, full triage, and streaming.

LangChain overview for Node.js

June 15, 2026

LangChain.js is a framework for LLM applications in TypeScript and Node.js. It standardizes how you wire prompts, models, tools, document loaders, embeddings, and retrievers into reusable pipelines and agents.

LangChain, Deep Agents, LangGraph, and LangSmith

| Project | Role |
| --- | --- |
| LangChain | High-level APIs: LCEL chains, createAgent, loaders, retrievers |
| Deep Agents | Batteries-included agent harness: planning, subagents, filesystem, context management |
| LangGraph | Low-level orchestration; LangChain agents run on LangGraph under the hood |
| LangSmith | Tracing, debugging, and evaluation for LangChain and LangGraph apps |

Use Deep Agents for complex multi-step tasks out of the box. Use LangChain's createAgent when you want a minimal harness you compose with middleware. Reach for LangGraph when you need custom stateful workflows, branching, or fine-grained control over the agent loop.

Packages

Install the core packages first (install guide):

npm i langchain @langchain/core @langchain/openai zod

What each package provides (provider-specific integrations live in separate packages):

  • langchain - createAgent, tool, and high-level chain helpers
  • zod - tool input schemas when defining tools with tool()
  • @langchain/core - prompts, output parsers, Runnable interface, LCEL
  • @langchain/openai - ChatOpenAI, OpenAIEmbeddings
  • @langchain/textsplitters - document chunking (used in the RAG post)
  • Standalone integration packages for other providers and tools (see the integrations page)

For raw API access, see the Chat Completions and OpenAI Responses API posts. For provider-agnostic text and agents, see the Vercel AI SDK and OpenAI Agents SDK posts.

When to use LangChain

| Tool | Best for |
| --- | --- |
| Raw openai package | Minimal calls, full control, least abstraction |
| Vercel AI SDK | Provider-agnostic generateText, streaming, embeddings, tool loops |
| OpenAI Agents SDK | Official agent loop, handoffs, guardrails |
| LangChain | Document ingestion, retrievers, LCEL chains, createAgent, swappable vector stores |

Reach for LangChain when RAG or multi-step LLM pipelines grow beyond a few manual API calls.

Prerequisites

  • OpenAI account
  • Generated API key
  • Enabled billing
  • Node.js version 26
  • langchain, @langchain/core, @langchain/openai, and zod installed
  • OPENAI_API_KEY set in the environment

Core concepts

Document - a chunk of text with optional metadata. Loaders produce Document instances; splitters break long sources into retrieval-friendly pieces.

import { Document } from '@langchain/core/documents';
const doc = new Document({
pageContent: 'LangChain helps compose LLM pipelines.',
metadata: { source: 'intro' }
});

Runnable - any component with .invoke(), .stream(), or .batch(). Prompts, models, parsers, and composed chains are all Runnables.

LCEL (LangChain Expression Language) - chain Runnables with .pipe(). Data flows left to right: prompt → model → parser. The same .invoke(), .stream(), and .batch() interface applies to every Runnable in the chain.

import { ChatPromptTemplate } from '@langchain/core/prompts';
import { StringOutputParser } from '@langchain/core/output_parsers';
import { ChatOpenAI } from '@langchain/openai';
const prompt = ChatPromptTemplate.fromMessages([
['system', 'Answer in one sentence.'],
['human', '{question}']
]);
const model = new ChatOpenAI({ model: 'gpt-5.5' });
const chain = prompt.pipe(model).pipe(new StringOutputParser());
const answer = await chain.invoke({ question: 'What is LangChain?' });
console.log(answer);
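
The left-to-right flow can be illustrated with a toy pipe implementation in plain JavaScript. This is a conceptual sketch only, not how LangChain implements Runnables; toyRunnable and the three stages are hypothetical, and the "model" is a stub that never calls an API.

```javascript
// Toy Runnable: wraps a function and supports invoke, batch, and pipe.
// Conceptual sketch only; LangChain's real Runnables are async and richer.
function toyRunnable(fn) {
  return {
    invoke: (input) => fn(input),
    batch: (inputs) => inputs.map((input) => fn(input)),
    // pipe returns a new runnable that feeds this output into `next`.
    pipe: (next) => toyRunnable((input) => next.invoke(fn(input))),
  };
}

// Three stub stages standing in for prompt -> model -> parser.
const toyPrompt = toyRunnable(
  ({ question }) => `Answer in one sentence: ${question}`,
);
const toyModel = toyRunnable((promptText) => ({
  content: `[stub reply to "${promptText}"]`,
}));
const toyParser = toyRunnable((message) => message.content);

const toyChain = toyPrompt.pipe(toyModel).pipe(toyParser);

console.log(toyChain.invoke({ question: 'What is LCEL?' }));
// [stub reply to "Answer in one sentence: What is LCEL?"]

console.log(toyChain.batch([{ question: 'A?' }, { question: 'B?' }]).length); // 2
```

Because every stage exposes the same interface, the composed chain can be invoked or batched exactly like a single stage.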

Agents - LangChain's current high-level agent API is createAgent. Pass a model string or chat model, optional tools (with zod schemas), and an optional checkpointer for conversation memory (@langchain/langgraph). For tools and the support triage scenario, see the agents post.

import { createAgent } from 'langchain';
const agent = createAgent({
model: 'gpt-5.5',
tools: []
});
const result = await agent.invoke({
messages: [{ role: 'user', content: 'What is LangChain?' }]
});

Structured output - return typed JSON instead of free text. In LCEL chains, call .withStructuredOutput() on a chat model with a Zod schema:

import { z } from 'zod';
import { ChatPromptTemplate } from '@langchain/core/prompts';
import { ChatOpenAI } from '@langchain/openai';
const schema = z.object({
answer: z.string(),
confidence: z.number(),
});
const prompt = ChatPromptTemplate.fromMessages([
['system', 'Answer briefly and rate your confidence from 0 to 1.'],
['human', '{question}'],
]);
const model = new ChatOpenAI({ model: 'gpt-5.5' }).withStructuredOutput(schema);
const result = await prompt.pipe(model).invoke({ question: 'What is LangChain?' });
console.log(result);

On agents, pass the same schema as responseFormat and read result.structuredResponse:

import { createAgent } from 'langchain';
import { z } from 'zod';
const schema = z.object({ answer: z.string(), confidence: z.number() });
const agent = createAgent({
model: 'gpt-5.5',
tools: [],
responseFormat: schema,
});
const result = await agent.invoke({
messages: [{ role: 'user', content: 'What is LangChain?' }],
});
console.log(result.structuredResponse);

What LangChain can do

  • Load and split documents - file and directory loaders, text splitters (see the loaders and chunking post); PDF, HTML, CSV via integration packages
  • Embeddings and vector stores - OpenAI embeddings with pgvector, Pinecone, Chroma, and others
  • Retrievers and RAG chains - fetch relevant context, then call a model (see the RAG with pgvector post)
  • Conversation memory - short-term memory via @langchain/langgraph checkpointers and thread_id (see the agent memory post); long-term memory via stores
  • Tools and agents - createAgent with tools and middleware; for production agents you may also prefer the Vercel AI SDK or the OpenAI Agents SDK (see their agents posts)
  • Structured output - Zod schemas via .withStructuredOutput() on a chat model or responseFormat on createAgent; read parsed objects from the chain result or result.structuredResponse
  • Observability - trace runs with LangSmith (LANGSMITH_TRACING=true); optional LangSmith Engine monitors traces and flags issues

Streaming and batch

The same LCEL chain supports streaming and batch invocation:

for await (const chunk of await chain.stream({ question: 'What is LCEL?' })) {
process.stdout.write(chunk);
}
const answers = await chain.batch([
{ question: 'What is a Runnable?' },
{ question: 'What is a retriever?' }
]);

Demo

Runnable LCEL scripts for this post live in the langchain-overview-nodejs-demo folder. Get access via code demos.

Building AI agents with OpenAI Agents SDK

June 12, 2026

The OpenAI Agents SDK (@openai/agents) is OpenAI's official framework for agentic apps in TypeScript. It provides a small set of primitives: Agent, tools, handoffs, guardrails, and a run loop managed by run().

This post builds the same support triage agent as the Building AI agents with Vercel AI SDK post - look up customers and invoices, search a knowledge base, then create a ticket or escalate - but uses the OpenAI Agents SDK instead of the Vercel tool loop.

For lower-level API access, see the OpenAI Responses API post. For the Vercel AI SDK alternative (generateText, stopWhen, stepCountIs), see the Vercel AI SDK agents post. For the same scenario with LangChain (createAgent, tool(), agent.invoke), see Building AI agents with LangChain.

Prerequisites

  • OpenAI account
  • Generated API key
  • Enabled billing
  • Node.js version 26
  • @openai/agents and zod installed (npm i @openai/agents zod)
  • OPENAI_API_KEY set in the environment

Mental model - turns and the agent loop

A turn is one model generation. In that turn the model either:

  • returns final output (the run ends), or
  • returns tool calls (the SDK executes them and starts another turn with the results), or
  • requests a handoff to another agent (control switches, history is preserved, loop continues)

Typical flow for the support triage agent: user question → model calls lookup tools (get_customer, get_invoice, search_knowledge_base) → model creates a ticket or escalates → final answer.

maxTurns: 8 means “stop after 8 turns” (eight model generations), not eight individual tool calls. A single turn can include multiple parallel tool calls.

When you omit maxTurns, the SDK defaults to 10 as a safety cap.
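
The difference between turns and tool calls can be shown with a toy loop that replays a scripted sequence of model outputs. This is a simulation for illustration, not the SDK's run loop; runScripted and the script format are hypothetical.

```javascript
// Replay scripted model outputs and count turns separately from tool calls.
// Each script entry is one turn: either { toolCalls: [...] } or { final: '...' }.
function runScripted(script, maxTurns) {
  let turns = 0;
  let toolCalls = 0;
  for (const output of script) {
    if (turns >= maxTurns) {
      throw new Error(`Max turns (${maxTurns}) exceeded`);
    }
    turns += 1;
    if (output.final !== undefined) {
      return { turns, toolCalls, finalOutput: output.final };
    }
    toolCalls += output.toolCalls.length; // parallel calls share one turn
  }
  throw new Error('Script ended without a final output');
}

const script = [
  { toolCalls: ['get_customer', 'get_invoice', 'search_knowledge_base'] },
  { toolCalls: ['create_support_ticket'] },
  { final: 'Ticket created for the duplicate charge.' },
];

const outcome = runScripted(script, 8);
console.log(outcome.turns, outcome.toolCalls); // 3 4

// A cap of 2 stops the run before the final answer.
let capError;
try {
  runScripted(script, 2);
} catch (error) {
  capError = error.message;
}
console.log(capError); // Max turns (2) exceeded
```

Four tool calls fit in three turns here because the three lookups are issued in parallel within the first turn.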

Support triage scenario

Example prompt:

Customer cus_1042 says they were charged twice for invoice inv_8891. What should we do?

A realistic chain:

  1. get_customer - plan tier, open ticket count
  2. get_invoice - amount, status, payment IDs
  3. search_knowledge_base - duplicate-charge and refund policy
  4. create_support_ticket or escalate_to_human - write action or escalation

The demo uses in-memory fixtures (customers, invoices, knowledge-base articles) so scripts run without a database.

Defining multiple tools

Register tools with tool() and Zod parameters. Clear description values help the model pick the right tool.

import { tool } from '@openai/agents';
import { z } from 'zod';
const getCustomer = tool({
name: 'get_customer',
description: 'Look up a customer account by ID',
parameters: z.object({
customerId: z.string().describe('Customer ID, e.g. cus_1042'),
}),
execute: async ({ customerId }) => {
const customer = customers.find((item) => item.id === customerId);
if (!customer) {
return { found: false, customerId, error: 'Customer not found' };
}
return { found: true, customer };
},
});
const getInvoice = tool({
name: 'get_invoice',
description: 'Look up an invoice by ID, including payment IDs and status',
parameters: z.object({
invoiceId: z.string().describe('Invoice ID, e.g. inv_8891'),
}),
execute: async ({ invoiceId }) => {
const invoice = invoices.find((item) => item.id === invoiceId);
if (!invoice) {
return { found: false, invoiceId, error: 'Invoice not found' };
}
return { found: true, invoice };
},
});
const searchKnowledgeBase = tool({
name: 'search_knowledge_base',
description: 'Search internal support articles by keyword',
parameters: z.object({
query: z.string().describe('Search terms, e.g. duplicate charge refund'),
}),
execute: async ({ query }) => {
// keyword match against mocked articles
return { query, articles: matches };
},
});
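
The keyword match itself is elided above. One possible sketch follows, assuming a hypothetical kbArticles fixture with title and body fields; the demo's actual fixture shape and matching logic may differ.

```javascript
// Hypothetical fixture; the demo's real articles are not shown in the post.
const kbArticles = [
  {
    id: 'kb_1',
    title: 'Duplicate charge policy',
    body: 'Refund the second payment within 5 days.',
  },
  {
    id: 'kb_2',
    title: 'Plan upgrades',
    body: 'Upgrades are prorated to the billing date.',
  },
];

// Return articles where at least one query term appears in the title or body.
function matchArticles(query, articles) {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  return articles.filter((article) => {
    const haystack = `${article.title} ${article.body}`.toLowerCase();
    return terms.some((term) => haystack.includes(term));
  });
}

const matches = matchArticles('duplicate charge refund', kbArticles);
console.log(matches.map((article) => article.id)); // [ 'kb_1' ]

// No terms match: the tool can return an empty list instead of throwing.
console.log(matchArticles('password reset', kbArticles).length); // 0
```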

Add write tools for outcomes:

const createSupportTicket = tool({
name: 'create_support_ticket',
description: 'Create a support ticket after gathering customer and policy context',
parameters: z.object({
customerId: z.string(),
subject: z.string().min(3),
priority: z.enum(['low', 'medium', 'high']),
summary: z.string().min(10),
}),
execute: async (input) => {
const ticket = createTicket(input);
return { created: true, ticket };
},
});
const escalateToHuman = tool({
name: 'escalate_to_human',
description: 'Escalate when policy requires manual review',
parameters: z.object({
customerId: z.string(),
reason: z.string().min(10),
urgency: z.enum(['normal', 'high']),
}),
execute: async (input) => ({
escalated: true,
queue: input.urgency === 'high' ? 'billing-urgent' : 'billing-standard',
...input,
}),
});

Return structured objects from execute. The SDK serializes them as tool results for the next turn. Return explicit errors (for example { found: false, error: '...' }) so the model can recover instead of throwing.
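
One way to apply this consistently is to wrap each handler so unexpected exceptions also become structured results. The wrapper below is a sketch in plain JavaScript; withStructuredErrors and flakyLookup are hypothetical and not part of @openai/agents.

```javascript
// Wrap a tool handler so thrown errors become { ok: false, error } results.
// withStructuredErrors is a hypothetical helper, not an SDK function.
function withStructuredErrors(handler) {
  return async (input) => {
    try {
      return await handler(input);
    } catch (error) {
      const message = error instanceof Error ? error.message : String(error);
      return { ok: false, error: message };
    }
  };
}

// Example handler that throws when the backing service is unavailable.
const flakyLookup = async ({ customerId }) => {
  if (customerId === 'cus_down') {
    throw new Error('Customer service unavailable');
  }
  return { found: true, customer: { id: customerId } };
};

const safeLookup = withStructuredErrors(flakyLookup);

safeLookup({ customerId: 'cus_down' }).then((result) => {
  console.log(result); // { ok: false, error: 'Customer service unavailable' }
});
```

The wrapped function can then be passed as execute, so the model receives an error it can reason about rather than the run failing.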

Running an agent

Define an Agent with instructions and tools, then call run():

import { Agent, run } from '@openai/agents';
const agent = new Agent({
name: 'Support Triage',
model: 'gpt-5.5',
instructions: `You are a billing support triage agent.
- Look up customer and invoice before recommending refunds.
- Search the knowledge base for policy guidance.
- Create a ticket when you can resolve within policy.
- Call escalate_to_human when manual review is required.`,
tools: [
getCustomer,
getInvoice,
searchKnowledgeBase,
createSupportTicket,
escalateToHuman,
],
});
const result = await run(
agent,
'Customer cus_1042 says they were charged twice for invoice inv_8891. What should we do?',
{ maxTurns: 8 },
);
console.log(result.finalOutput);

Use a model that supports tool calling.

maxTurns - cap the number of turns

Setting maxTurns: n stops the run once it reaches n turns. Use it on every production agent to prevent runaway loops and unbounded API cost. When the cap is exceeded, the SDK throws MaxTurnsExceededError.

| Use case | Suggested cap |
| --- | --- |
| Single tool, then answer | 2 |
| Chat with occasional tool use | 3–5 |
| Task agents (triage, research) | 8–15 |
| Long autonomous workflows | 15–20 (with monitoring) |

Tight vs relaxed cap on the same prompt:

import { Agent, run } from '@openai/agents';
// Stops after 3 turns even if the model still wants more context
const tight = await run(agent, prompt, { maxTurns: 3 });
// Allows a fuller investigation chain
const relaxed = await run(agent, prompt, { maxTurns: 8 });

Inspecting runs

The newItems array on the result contains tool calls, tool outputs, and messages from the run. Use it for debugging:

const result = await run(agent, prompt, { maxTurns: 8 });
for (const item of result.newItems) {
if (item.type === 'tool_call_item') {
console.log('tool:', item.rawItem.name, item.rawItem.arguments);
}
if (item.type === 'tool_call_output_item') {
console.log('output:', item.output);
}
}
console.log('lastAgent:', result.lastAgent.name);
console.log('answer:', result.finalOutput);

The SDK emits traces automatically. Set workflowName on a custom Runner to group related runs in the OpenAI Traces dashboard.

Handoffs

For multi-agent workflows, define specialist agents and wire them with Agent.create() and handoffs. The triage agent in this post stays single-agent, but handoffs are the SDK's way to delegate between agents (similar to routing a case to a billing specialist):

import { Agent } from '@openai/agents';
const billingAgent = new Agent({
name: 'Billing Specialist',
instructions: 'Handle refund and duplicate-charge cases.',
tools: [getInvoice, searchKnowledgeBase, createSupportTicket],
});
const triageAgent = Agent.create({
name: 'Triage',
instructions: 'Route billing cases to the billing specialist when needed.',
handoffs: [billingAgent],
});

After a run, check result.lastAgent to see which agent produced the final output.

Streaming

Pass stream: true to receive events as the run progresses:

import { Agent, run } from '@openai/agents';
const stream = await run(agent, prompt, { maxTurns: 8, stream: true });
process.stdout.write('Answer: ');
for await (const event of stream) {
if (event.type === 'raw_model_stream_event' && event.data.type === 'output_text_delta') {
process.stdout.write(event.data.delta);
}
if (event.type === 'run_item_stream_event' && event.name === 'tool_called') {
console.error(`\nTool: ${event.item.rawItem.name}`);
}
}
await stream.completed;
console.log('\nDone:', stream.finalOutput);

Text streams incrementally. Tool calls appear as run_item_stream_event events between text segments.

Production notes

  • Always set maxTurns - do not rely on the default cap without monitoring
  • Cost - each turn is another model call; inspect newItems or stream events for tool usage
  • Tool errors - return structured errors from execute instead of throwing when the model should retry or escalate
  • Instructions - keep policy rules in instructions, not only in the user prompt
  • Tracing - use the OpenAI Traces dashboard to debug multi-turn runs
  • Alternatives - hosted tools (web search, code interpreter), MCP servers, and sandbox agents are covered in the official docs

Demo

Runnable scripts for each section live in the openai-agents-sdk-demo folder. Get access via code demos.

Building AI agents with Vercel AI SDK

June 9, 2026

The Vercel AI SDK treats agents as tool-calling loops: the model generates text or invokes tools, the SDK runs those tools, and the loop continues until the model answers or a stop condition fires.

This post builds a support triage agent that looks up customers and invoices, searches an internal knowledge base, and either opens a ticket or escalates to a human. It builds on the LLM integration with Vercel AI SDK post and focuses on multiple tools, stopWhen, and stepCountIs.

For external tools exposed over MCP instead of SDK-native tool() handlers, see the MCP server with Node.js post. For the same triage scenario with the official OpenAI Agents SDK (@openai/agents, run(), maxTurns), see the dedicated post. For the LangChain stack (createAgent, tool(), LangGraph loop), see Building AI agents with LangChain.

Prerequisites

  • OpenAI account
  • Generated API key
  • Enabled billing
  • Node.js version 26
  • ai, @ai-sdk/openai, and zod installed (npm i ai @ai-sdk/openai zod)
  • Client setup from the Vercel AI SDK integration post

Mental model - steps and the tool loop

A step is one model generation. In that step the model either:

  • returns text (the loop ends), or
  • returns tool calls (the SDK executes them and starts another step with the results)

Typical flow for the support triage agent: user question → model calls lookup tools (getCustomer, getInvoice, searchKnowledgeBase) → model creates a ticket or escalates → final answer. stopWhen can end the loop before or after the write tools run.

stepCountIs(5) means "stop after 5 steps" (five model generations), not five individual tool calls. A single step can include multiple parallel tool calls.

When you pass tools without stopWhen, the SDK defaults to stepCountIs(20) as a safety cap.

Support triage scenario

Example prompt:

Customer cus_1042 says they were charged twice for invoice inv_8891. What should we do?

A realistic chain:

  1. getCustomer - plan tier, open ticket count
  2. getInvoice - amount, status, payment IDs
  3. searchKnowledgeBase - duplicate-charge and refund policy
  4. createSupportTicket or escalateToHuman - write action or sentinel stop

The demo uses in-memory fixtures (customers, invoices, knowledge-base articles) so scripts run without a database.

Defining multiple tools

Register tools with tool() and Zod inputSchema. Clear description values help the model pick the right tool.

import { tool } from 'ai';
import { z } from 'zod';
const getCustomer = tool({
description: 'Look up a customer account by ID',
inputSchema: z.object({
customerId: z.string().describe('Customer ID, e.g. cus_1042'),
}),
execute: async ({ customerId }) => {
const customer = customers.find((item) => item.id === customerId);
if (!customer) {
return { found: false, customerId, error: 'Customer not found' };
}
return { found: true, customer };
},
});
const getInvoice = tool({
description: 'Look up an invoice by ID, including payment IDs and status',
inputSchema: z.object({
invoiceId: z.string().describe('Invoice ID, e.g. inv_8891'),
}),
execute: async ({ invoiceId }) => {
const invoice = invoices.find((item) => item.id === invoiceId);
if (!invoice) {
return { found: false, invoiceId, error: 'Invoice not found' };
}
return { found: true, invoice };
},
});
const searchKnowledgeBase = tool({
description: 'Search internal support articles by keyword',
inputSchema: z.object({
query: z.string().describe('Search terms, e.g. duplicate charge refund'),
}),
execute: async ({ query }) => {
// keyword match against mocked articles
return { query, articles: matches };
},
});

Add write tools for outcomes:

const createSupportTicket = tool({
description: 'Create a support ticket after gathering customer and policy context',
inputSchema: z.object({
customerId: z.string(),
subject: z.string().min(3),
priority: z.enum(['low', 'medium', 'high']),
summary: z.string().min(10),
}),
execute: async (input) => {
const ticket = createTicket(input);
return { created: true, ticket };
},
});
const escalateToHuman = tool({
description: 'Escalate when policy requires manual review',
inputSchema: z.object({
customerId: z.string(),
reason: z.string().min(10),
urgency: z.enum(['normal', 'high']),
}),
execute: async (input) => ({
escalated: true,
queue: input.urgency === 'high' ? 'billing-urgent' : 'billing-standard',
...input,
}),
});

Return structured objects from execute. The SDK serializes them as tool results for the next step. Return explicit errors (for example { found: false, error: '...' }) so the model can recover instead of throwing.

Multi-step triage with generateText

Pass all tools and a system prompt with triage rules:

import { generateText, stepCountIs } from 'ai';
const { text, steps } = await generateText({
model: openai('gpt-5.5'),
system: `You are a billing support triage agent.
- Look up customer and invoice before recommending refunds.
- Search the knowledge base for policy guidance.
- Create a ticket when you can resolve within policy.
- Call escalateToHuman when manual review is required.`,
tools: {
getCustomer,
getInvoice,
searchKnowledgeBase,
createSupportTicket,
escalateToHuman,
},
stopWhen: stepCountIs(8),
prompt:
'Customer cus_1042 says they were charged twice for invoice inv_8891. What should we do?',
});
console.log('steps:', steps.length);
console.log(text);

Use a model that supports tool calling (same requirement as web search in the Vercel AI SDK post).

stopWhen - when the loop stops

stopWhen defines stopping conditions for the tool loop. Conditions are evaluated only when the last step contains tool results.

  • A single condition stops when that condition returns true
  • An array stops when any condition returns true (OR logic)
  • Without stopWhen, the SDK applies stepCountIs(20)

The loop also ends naturally when the model returns text without further tool calls.

stepCountIs - cap the number of steps

stepCountIs(n) stops once steps.length reaches n. Use it on every production agent to prevent runaway loops and unbounded API cost.

Use case | Suggested cap
Single tool, then answer | 2 (tool step + text step)
Chat with occasional tool use | 3-5
Task agents (triage, research) | 8-15
Long autonomous workflows | 15-20 (with monitoring)

Tight vs relaxed cap on the same prompt:

import { generateText, stepCountIs } from 'ai';
// Stops after 3 steps even if the model still wants more context
const capped = await generateText({
model: openai('gpt-5.5'),
tools: supportTools,
stopWhen: stepCountIs(3),
prompt: '...',
});
// Allows a fuller investigation chain
const relaxed = await generateText({
model: openai('gpt-5.5'),
tools: supportTools,
stopWhen: stepCountIs(8),
prompt: '...',
});

Combining hasToolCall with stepCountIs

hasToolCall('toolName') stops when the model invokes a specific tool in the latest step. Pair it with stepCountIs for a hard cap plus a sentinel tool:

import { generateText, stepCountIs, hasToolCall } from 'ai';
const { text, steps } = await generateText({
model: openai('gpt-5.5'),
system: TRIAGE_INSTRUCTIONS,
tools: supportTools,
stopWhen: [stepCountIs(10), hasToolCall('escalateToHuman')],
prompt:
'Customer cus_2201 on the starter plan reports a duplicate $190 charge on invoice inv_9104.',
});

escalateToHuman works well as a sentinel: the loop stops as soon as the model decides the case needs a human, without waiting for a final text-only step.
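
When a sentinel tool ends the run there may be no final text step, so calling code usually needs to know why the loop stopped. A small sketch, assuming only the steps[].toolCalls[].toolName shape used elsewhere in this post; endedWithToolCall is a hypothetical helper name.

```javascript
// Returns true when the final step of a run called the given tool.
function endedWithToolCall(steps, toolName) {
  const lastStep = steps.at(-1);
  return (lastStep?.toolCalls ?? []).some((call) => call.toolName === toolName);
}
```

After generateText returns, endedWithToolCall(steps, 'escalateToHuman') tells the caller to hand the case to a human queue instead of showing text to the user.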

Inspecting steps and usage

The steps array on the result contains per-step tool calls, tool results, finish reason, and usage. Use it for debugging and cost tracking:

const { text, steps, totalUsage } = await generateText({
  model: openai('gpt-5.5'),
  tools: supportTools,
  stopWhen: stepCountIs(8),
  prompt: '...',
});
for (const [index, step] of steps.entries()) {
  console.log(`step ${index + 1}`);
  console.log(' toolCalls:', step.toolCalls?.map((c) => c.toolName));
  console.log(' usage:', step.usage);
}
console.log('totalUsage:', totalUsage);
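
For cost tracking it can help to reduce the steps array to one log-friendly object. A sketch under the assumption that each step exposes toolCalls[].toolName and usage.totalTokens, as in the examples in this post; summarizeSteps is a hypothetical helper, not an SDK export.

```javascript
// Aggregates per-step data into one summary object for logging.
function summarizeSteps(steps) {
  const toolCounts = {};
  let totalTokens = 0;
  for (const step of steps) {
    // Missing usage is treated as zero rather than failing the summary.
    totalTokens += step.usage?.totalTokens ?? 0;
    for (const call of step.toolCalls ?? []) {
      toolCounts[call.toolName] = (toolCounts[call.toolName] ?? 0) + 1;
    }
  }
  return { stepCount: steps.length, totalTokens, toolCounts };
}
```

A repeated tool name in toolCounts is often the first sign that the model is looping on the same lookup.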

With streamText, pass onStepFinish to log each step as it completes.

ToolLoopAgent - reusable agent definition

ToolLoopAgent wraps the same loop for reuse across scripts and API routes. It accepts the same loop settings as generateText (tools, stopWhen), with instructions taking the place of system.

import { ToolLoopAgent, stepCountIs } from 'ai';

const supportTriageAgent = new ToolLoopAgent({
  model: openai('gpt-5.5'),
  instructions: TRIAGE_INSTRUCTIONS,
  tools: supportTools,
  stopWhen: stepCountIs(8),
});
const result = await supportTriageAgent.generate({
  prompt:
    'Customer cus_1042 says they were charged twice for invoice inv_8891. What should we do?',
  onStepFinish: async ({ stepNumber, usage, toolCalls }) => {
    console.log(`step ${stepNumber + 1}`, {
      tokens: usage.totalTokens,
      tools: toolCalls?.map((call) => call.toolName),
    });
  },
});
console.log(result.text);

Use .stream() for streaming. For Next.js chat UIs, see createAgentUIStreamResponse in the AI SDK agents docs.

Streaming with tools

streamText supports the same tools and stopWhen settings:

import { streamText, stepCountIs } from 'ai';
const result = streamText({
model: openai('gpt-5.5'),
system: TRIAGE_INSTRUCTIONS,
tools: supportTools,
stopWhen: stepCountIs(8),
prompt: 'Customer cus_1042 says they were charged twice for invoice inv_8891.',
onStepFinish: async ({ stepNumber, toolCalls }) => {
console.error(`step ${stepNumber + 1}:`, toolCalls?.map((c) => c.toolName));
},
});
for await (const part of result.textStream) {
process.stdout.write(part);
}

Text streams incrementally. Tool calls run between text segments as the loop progresses.

Production notes

  • Always set stopWhen - do not rely on the SDK's default step cap in production without monitoring
  • Cost - each step is another model call; log steps or onStepFinish usage
  • Tool errors - return structured errors from execute instead of throwing when the model should retry or escalate
  • Instructions - keep policy rules in system / instructions, not only in the user prompt
  • Same patterns elsewhere - PR review (listPRs → getChecks → submitReview) or job-fit scoring use the same loop mechanics with different tools

Demo

Runnable scripts for each section live in the vercel-ai-sdk-agents-demo folder. Get access via code demos.

LLM integration with OpenRouter

June 8, 2026

OpenRouter is a unified API gateway to hundreds of language models from providers such as OpenAI, Anthropic, Google, and Meta. You use one API key and one billing surface, and swap models by changing a provider/model slug. OpenRouter exposes a Chat Completions-compatible HTTP API.

This post shows three Node.js integration paths: the official @openrouter/sdk, the openai package with baseURL, and the Vercel AI SDK with @openrouter/ai-sdk-provider.

For deeper patterns on each stack, see the Chat Completions API, OpenAI Responses API (OpenAI direct only), and Vercel AI SDK posts.

Prerequisites

  • OpenRouter account
  • API key
  • Credits or billing enabled as needed
  • Node.js version 26
  • Install packages for the path you use:
    • @openrouter/sdk (npm i @openrouter/sdk)
    • openai (npm i openai)
    • ai and @openrouter/ai-sdk-provider (npm i ai @openrouter/ai-sdk-provider)

Configuration

Read credentials from the environment in production.

Variable | Purpose
OPENROUTER_API_KEY | Bearer token from OpenRouter settings
OPENROUTER_MODEL | Default model slug, for example openai/gpt-5.5
OPENROUTER_SITE_URL | Optional site URL sent as HTTP-Referer for rankings on openrouter.ai
OPENROUTER_SITE_TITLE | Optional app name sent as X-OpenRouter-Title

Model IDs use the provider/model format, for example openai/gpt-5.5, anthropic/claude-opus-4.8, or google/gemini-3.1-flash-lite. Browse the full catalog at openrouter.ai/models.

The examples below use openai/gpt-5.5, matching the model in the other LLM posts in this series. Override it with OPENROUTER_MODEL when you want a different model.
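
Validating configuration once at startup gives clearer failures than a rejected API call. A minimal sketch: loadOpenRouterConfig is a hypothetical helper, and the slug check assumes every model ID has exactly one slash, which matches the examples above but may not cover every catalog entry.

```javascript
// Reads OpenRouter settings from an env-like object and validates the slug.
function loadOpenRouterConfig(env) {
  const apiKey = env.OPENROUTER_API_KEY;
  if (!apiKey) {
    throw new Error('OPENROUTER_API_KEY is not set');
  }
  const model = env.OPENROUTER_MODEL ?? 'openai/gpt-5.5';
  // Assumption: slugs look like provider/model with a single slash.
  if (!/^[^/\s]+\/[^/\s]+$/.test(model)) {
    throw new Error(`Invalid model slug: ${model}`);
  }
  return {
    apiKey,
    model,
    siteUrl: env.OPENROUTER_SITE_URL,
    siteTitle: env.OPENROUTER_SITE_TITLE,
  };
}
```

Call it with process.env at startup and pass the returned object to whichever client setup you use below.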

@openrouter/sdk

OpenRouter's official TypeScript SDK is type-safe and generated from the OpenAPI spec.

Client setup

import { OpenRouter } from '@openrouter/sdk';
const client = new OpenRouter({
apiKey: process.env.OPENROUTER_API_KEY,
httpReferer: process.env.OPENROUTER_SITE_URL,
appTitle: process.env.OPENROUTER_SITE_TITLE,
});

Basic integration

const response = await client.chat.send({
chatRequest: {
model: process.env.OPENROUTER_MODEL ?? 'openai/gpt-5.5',
messages: [
{ role: 'user', content: 'Write a one-sentence bedtime story about a unicorn.' },
],
},
});
console.log(response.choices[0].message.content);

System prompt

Add a system message before the user turn to set tone, format, and role.

const response = await client.chat.send({
chatRequest: {
model: process.env.OPENROUTER_MODEL ?? 'openai/gpt-5.5',
messages: [
{ role: 'system', content: 'Reply in one short sentence. Use plain language.' },
{ role: 'user', content: 'Explain what an LLM is.' },
],
},
});
console.log(response.choices[0].message.content);

Streaming

Set stream: true and read incremental text from choices[0].delta.content.

const stream = await client.chat.send({
chatRequest: {
model: process.env.OPENROUTER_MODEL ?? 'openai/gpt-5.5',
messages: [{ role: 'user', content: 'List three colors.' }],
stream: true,
},
});
process.stdout.write('[stream] ');
for await (const chunk of stream) {
const delta = chunk.choices[0]?.delta?.content;
if (delta) {
process.stdout.write(delta);
}
}
process.stdout.write('\n');
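
If you need the full reply after streaming (for logging or storage), the deltas can be accumulated in the same loop. A sketch that assumes only the chunk shape shown above; collectStreamText and fakeStream are hypothetical names, and fakeStream exists purely so the helper can run without a network call.

```javascript
// Concatenates choices[0].delta.content from an async iterable of chunks.
async function collectStreamText(stream) {
  let text = '';
  for await (const chunk of stream) {
    // Some chunks carry no content (for example a final chunk), so default to ''.
    text += chunk.choices?.[0]?.delta?.content ?? '';
  }
  return text;
}

// Hypothetical fake stream for local testing; real chunks come from the API.
async function* fakeStream(parts) {
  for (const content of parts) {
    yield { choices: [{ delta: { content } }] };
  }
}
```

The same helper works for the openai package streaming example further down, since both read deltas from the same path.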

Model switching

Change only the model string to route the same code to a different provider.

const models = ['openai/gpt-5.5', 'google/gemini-3.1-flash-lite'];
for (const model of models) {
const response = await client.chat.send({
chatRequest: {
model,
messages: [{ role: 'user', content: 'Reply with exactly one word: ok.' }],
},
});
console.log(model, '->', response.choices[0].message.content);
}

openai package

If you already use the OpenAI SDK, point it at OpenRouter with baseURL. The request shape matches the Chat Completions API.

Client setup

import OpenAI from 'openai';
const client = new OpenAI({
apiKey: process.env.OPENROUTER_API_KEY,
baseURL: 'https://openrouter.ai/api/v1',
defaultHeaders: {
'HTTP-Referer': process.env.OPENROUTER_SITE_URL,
'X-OpenRouter-Title': process.env.OPENROUTER_SITE_TITLE,
},
});
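
Both attribution variables are optional, so they may be undefined at runtime. One way to keep the header object limited to values that are actually set is a small builder; buildAttributionHeaders is a hypothetical helper, and the header names are the ones from the configuration table above.

```javascript
// Builds the optional attribution headers, skipping unset variables.
function buildAttributionHeaders(env) {
  const headers = {};
  if (env.OPENROUTER_SITE_URL) {
    headers['HTTP-Referer'] = env.OPENROUTER_SITE_URL;
  }
  if (env.OPENROUTER_SITE_TITLE) {
    headers['X-OpenRouter-Title'] = env.OPENROUTER_SITE_TITLE;
  }
  return headers;
}
```

Pass the result as defaultHeaders: buildAttributionHeaders(process.env) in the client setup.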

Basic integration

const completion = await client.chat.completions.create({
model: process.env.OPENROUTER_MODEL ?? 'openai/gpt-5.5',
messages: [
{ role: 'user', content: 'Write a one-sentence bedtime story about a unicorn.' },
],
});
console.log(completion.choices[0].message.content);

System prompt

const completion = await client.chat.completions.create({
model: process.env.OPENROUTER_MODEL ?? 'openai/gpt-5.5',
messages: [
{ role: 'system', content: 'Reply in one short sentence. Use plain language.' },
{ role: 'user', content: 'Explain what an LLM is.' },
],
});
console.log(completion.choices[0].message.content);

Streaming

const stream = await client.chat.completions.create({
model: process.env.OPENROUTER_MODEL ?? 'openai/gpt-5.5',
messages: [{ role: 'user', content: 'List three colors.' }],
stream: true,
});
process.stdout.write('[stream] ');
for await (const chunk of stream) {
const delta = chunk.choices[0]?.delta?.content;
if (delta) {
process.stdout.write(delta);
}
}
process.stdout.write('\n');

For JSON schema output, Markdown-to-HTML, and few-shot prompting, reuse the patterns from the Chat Completions post with the OpenRouter client and model slug above.

Vercel AI SDK

The @openrouter/ai-sdk-provider package exposes OpenRouter models to generateText, streamText, and related helpers from the ai package. See the OpenRouter Vercel AI SDK guide for the full integration reference.

Client setup

import { createOpenRouter } from '@openrouter/ai-sdk-provider';
const openrouter = createOpenRouter({
apiKey: process.env.OPENROUTER_API_KEY,
appUrl: process.env.OPENROUTER_SITE_URL,
appName: process.env.OPENROUTER_SITE_TITLE,
});

The returned provider is callable. Pass a model slug directly: openrouter('openai/gpt-5.5').

Basic integration

import { generateText } from 'ai';
const { text } = await generateText({
model: openrouter(process.env.OPENROUTER_MODEL ?? 'openai/gpt-5.5'),
prompt: 'Write a one-sentence bedtime story about a unicorn.',
});
console.log(text);

System prompt

const { text } = await generateText({
model: openrouter(process.env.OPENROUTER_MODEL ?? 'openai/gpt-5.5'),
system: 'Reply in one short sentence. Use plain language.',
prompt: 'Explain what an LLM is.',
});
console.log(text);

Streaming

import { streamText } from 'ai';
const result = streamText({
model: openrouter(process.env.OPENROUTER_MODEL ?? 'openai/gpt-5.5'),
prompt: 'List three colors.',
});
process.stdout.write('[stream] ');
for await (const part of result.textStream) {
process.stdout.write(part);
}
process.stdout.write('\n');

For structured output, embeddings, and web search, see the Vercel AI SDK post. Those patterns apply when you call OpenAI directly; OpenRouter coverage depends on the model and endpoint.

Demo

Runnable scripts for each integration path live in the openrouter-demo folder. Get access via code demos.

LLM integration with Vercel AI SDK

June 7, 2026

Large language models (LLMs) understand and generate text from prompts. The Vercel AI SDK is a provider-agnostic layer over LLM APIs - core functions are generateText, streamText, and embed. This post uses the OpenAI provider and mirrors the patterns from the OpenAI Responses API post.

For the lower-level openai npm package, see the Chat Completions API and Responses API posts. For multi-tool agents with stopWhen and stepCountIs, see the Building AI agents with Vercel AI SDK post. For the same triage scenario with the OpenAI Agents SDK, see the dedicated post.

Prerequisites

  • OpenAI account
  • Generated API key
  • Enabled billing
  • Node.js version 26
  • ai, @ai-sdk/openai, and zod installed (npm i ai @ai-sdk/openai zod)
  • For Markdown output: marked, dompurify, and jsdom (npm i marked dompurify jsdom)

Client setup

Create an OpenAI provider with your API key (read from the environment in production).

import { createOpenAI } from '@ai-sdk/openai';
const openai = createOpenAI({ apiKey: process.env.OPENAI_API_KEY });

For OpenRouter, use the dedicated @openrouter/ai-sdk-provider package - see the OpenRouter integration post. The same provider can target other hosts that implement a compatible API by setting baseURL and apiKey:

const openai = createOpenAI({
apiKey: process.env.LLM_API_KEY,
baseURL: 'https://your-gateway.example/v1',
});

Many third-party gateways support Chat Completions only. The examples below use openai(model) (Responses API path). If your provider does not support it, switch to openai.chat(model) and skip the web search example.

Basic integration

Pass a string as prompt and read text from the result.

import { generateText } from 'ai';
import { createOpenAI } from '@ai-sdk/openai';
const openai = createOpenAI({ apiKey: process.env.OPENAI_API_KEY });
const { text } = await generateText({
model: openai('gpt-5.5'),
prompt: 'Write a one-sentence bedtime story about a unicorn.',
});
console.log(text);

System prompt

Use the system parameter for stable behavior (tone, format, role). It takes precedence over casual wording in the user message.

const { text } = await generateText({
model: openai('gpt-5.5'),
system: 'Reply in one short sentence. Use plain language.',
prompt: 'Explain what an LLM is.',
});
console.log(text);

Few-shot prompting

Pass prior turns in a messages array with user and assistant roles, then the new user message. Keep task rules in system.

const { text } = await generateText({
model: openai('gpt-5.5'),
system:
'Classify sentiment as exactly one word: positive, negative, or neutral.',
messages: [
{ role: 'user', content: 'I love this!' },
{ role: 'assistant', content: 'positive' },
{ role: 'user', content: 'This is awful.' },
{ role: 'assistant', content: 'negative' },
{ role: 'user', content: 'It is fine I guess.' },
],
});
console.log(text);
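
When the examples live in a dataset rather than in code, the messages array can be generated. A minimal sketch; buildFewShotMessages and the { input, label } example shape are assumptions made for illustration.

```javascript
// Turns labeled examples plus a new input into a few-shot messages array.
function buildFewShotMessages(examples, input) {
  const messages = examples.flatMap(({ input: text, label }) => [
    { role: 'user', content: text },
    { role: 'assistant', content: label },
  ]);
  // The new user message always goes last so the model answers it.
  messages.push({ role: 'user', content: input });
  return messages;
}
```

Calling it with the two labeled examples above and 'It is fine I guess.' produces the same five-message array that is written out by hand in the previous block.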

Streaming

Use streamText and iterate over textStream for incremental text.

import { streamText } from 'ai';
const result = streamText({
model: openai('gpt-5.5'),
prompt: 'List three colors.',
});
for await (const part of result.textStream) {
process.stdout.write(part);
}

Structured output with JSON schema

Constrain the model to JSON matching your schema via Output.object() and a Zod schema. The SDK validates the result.

import { generateText, Output } from 'ai';
import { z } from 'zod';
const { output } = await generateText({
model: openai('gpt-5.5'),
prompt: 'The film Inception was directed by Christopher Nolan.',
output: Output.object({
schema: z.object({
title: z.string(),
director: z.string(),
}),
schemaName: 'movie_summary',
}),
});
console.log(output.title, output.director);

Markdown output to HTML

Ask for Markdown in system, then convert text to HTML and sanitize before rendering (for example with innerHTML in the browser or when storing HTML).

import { marked } from 'marked';
import { JSDOM } from 'jsdom';
import DOMPurify from 'dompurify';
const purify = DOMPurify(new JSDOM('').window);
const { text } = await generateText({
model: openai('gpt-5.5'),
system: 'Reply in Markdown only. Use a heading and a short bullet list.',
prompt: 'Explain what an LLM is in three bullet points.',
});
const markdown = text;
const html = marked.parse(markdown);
const safeHtml = purify.sanitize(html);

Always sanitize HTML produced from model output (purify.sanitize in the example above). The model can emit unsafe markup, including raw HTML inside Markdown. Sanitization strips scripts and other dangerous content.

Web search tool

Enable the built-in web search tool when the answer should use current information from the web.

const result = await generateText({
model: openai('gpt-5.5'),
tools: { web_search: openai.tools.webSearch() },
prompt: 'What was a major tech headline this week? Cite sources briefly.',
});
console.log(result.text);

Web search adds latency and tool usage cost. Use a model that supports tools.

Embeddings

Embeddings are numeric vectors that represent the semantic meaning of text. Use them for semantic search, clustering, and RAG.

Pass a single string to embed and read the vector from embedding.

import { embed } from 'ai';
const { embedding } = await embed({
model: openai.embedding('text-embedding-3-small'),
value: 'How do I connect pgvector to PostgreSQL?',
});
console.log(embedding.length);

Pass multiple strings in a values array with embedMany. Results are in the same order as the input.

import { embedMany } from 'ai';
const chunks = [
'pgvector adds vector similarity search to PostgreSQL.',
'LangChain helps split long documents into retrieval-friendly chunks.',
'RAG retrieves context first, then asks an LLM to answer.',
];
const { embeddings } = await embedMany({
model: openai.embedding('text-embedding-3-small'),
values: chunks,
});
console.log(embeddings.length); // 3
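
Semantic search over these vectors usually comes down to ranking by cosine similarity. A self-contained sketch in plain JavaScript; cosineSimilarity and rankBySimilarity are hypothetical helper names written here for illustration, and a vector database would normally do this ranking for you at scale.

```javascript
// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat its similarity as 0.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Ranks stored embeddings against a query embedding, best match first.
function rankBySimilarity(queryEmbedding, embeddings) {
  return embeddings
    .map((embedding, index) => ({ index, score: cosineSimilarity(queryEmbedding, embedding) }))
    .sort((x, y) => y.score - x.score);
}
```

Embed the question with embed, pass its vector and the embedMany result to rankBySimilarity, and use the returned index values to pick the matching entries from chunks.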

For a full RAG flow with pgvector, see the RAG with OpenAI embeddings post. For LangChain basics (LCEL, Runnables, when to use LangChain), see the LangChain overview post.

Demo

Runnable scripts for each section live in the vercel-ai-sdk-demo folder. Get access via code demos.

LLM integration with OpenAI Responses API

May 31, 2026

Large language models (LLMs) understand and generate text from prompts. OpenAI exposes models through the Responses API. The official openai npm package is the practical way to call it from Node.js. This post covers common patterns beyond a single prompt string.

For the Chat Completions API (messages, choices[0].message.content), see the dedicated post. For the same patterns with the Vercel AI SDK (generateText, streamText), see the dedicated post.

Prerequisites

  • OpenAI account
  • Generated API key
  • Enabled billing
  • Node.js version 26
  • openai package installed (npm i openai)
  • For Markdown output: marked, dompurify, and jsdom (npm i marked dompurify jsdom)

Client setup

Create a client with your API key (read from the environment in production).

import OpenAI from 'openai';
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

The same SDK can target other hosts that implement a compatible API by setting baseURL and apiKey:

const client = new OpenAI({
apiKey: process.env.LLM_API_KEY,
baseURL: 'https://your-gateway.example/v1',
});

Azure OpenAI uses AzureOpenAI instead. Many third-party gateways support Chat Completions only; the examples below use client.responses.*, so confirm your provider supports the Responses API (especially for tools like web search).

Basic integration

Pass a string as input and read output_text from the response.

const response = await client.responses.create({
model: 'gpt-5.5',
input: 'Write a one-sentence bedtime story about a unicorn.',
});
console.log(response.output_text);

System prompt

Use top-level instructions for stable behavior (tone, format, role). They take precedence over casual wording in the user message.

const response = await client.responses.create({
model: 'gpt-5.5',
instructions: 'Reply in one short sentence. Use plain language.',
input: 'Explain what an LLM is.',
});
console.log(response.output_text);

Few-shot prompting

Pass prior turns as an input array with user and assistant roles, then the new user message. Keep task rules in instructions.

const response = await client.responses.create({
model: 'gpt-5.5',
instructions:
'Classify sentiment as exactly one word: positive, negative, or neutral.',
input: [
{ role: 'user', content: 'I love this!' },
{ role: 'assistant', content: 'positive' },
{ role: 'user', content: 'This is awful.' },
{ role: 'assistant', content: 'negative' },
{ role: 'user', content: 'It is fine I guess.' },
],
});
console.log(response.output_text);

Streaming

Set stream: true and handle response.output_text.delta events for incremental text.

const stream = await client.responses.create({
model: 'gpt-5.5',
input: 'List three colors.',
stream: true,
});
for await (const event of stream) {
if (event.type === 'response.output_text.delta') {
process.stdout.write(event.delta);
}
}

Structured output with JSON schema

Constrain the model to JSON matching your schema via text.format. With strict: true, the output should match the schema.

const response = await client.responses.create({
  model: 'gpt-5.5',
  input: 'The film Inception was directed by Christopher Nolan.',
  text: {
    format: {
      type: 'json_schema',
      name: 'movie_summary',
      strict: true,
      schema: {
        type: 'object',
        properties: {
          title: { type: 'string' },
          director: { type: 'string' },
        },
        required: ['title', 'director'],
        additionalProperties: false,
      },
    },
  },
});
const data = JSON.parse(response.output_text);
console.log(data.title, data.director);

For typed parsing with Zod, you can use client.responses.parse() instead of JSON.parse.
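
If you stay with JSON.parse, a guard around it is cheap insurance, since output_text can still be empty or cut off (for example when a response is truncated). A sketch with a hypothetical parseMovieSummary helper that returns a result object instead of throwing:

```javascript
// Parses the model's JSON output and checks the two expected string fields.
function parseMovieSummary(outputText) {
  let data;
  try {
    data = JSON.parse(outputText);
  } catch {
    return { ok: false, error: 'Output is not valid JSON' };
  }
  if (typeof data?.title !== 'string' || typeof data?.director !== 'string') {
    return { ok: false, error: 'Output is missing title or director' };
  }
  return { ok: true, data };
}
```

Callers check result.ok before reading result.data, which keeps malformed output from crashing the request handler.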

Markdown output to HTML

Ask for Markdown in instructions, then convert output_text to HTML and sanitize before rendering (for example with innerHTML in the browser or when storing HTML).

import { marked } from 'marked';
import { JSDOM } from 'jsdom';
import DOMPurify from 'dompurify';
const purify = DOMPurify(new JSDOM('').window);
const response = await client.responses.create({
model: 'gpt-5.5',
instructions: 'Reply in Markdown only. Use a heading and a short bullet list.',
input: 'Explain what an LLM is in three bullet points.',
});
const markdown = response.output_text;
const html = marked.parse(markdown);
const safeHtml = purify.sanitize(html);

Always sanitize HTML produced from model output (purify.sanitize in the example above). The model can emit unsafe markup, including raw HTML inside Markdown; sanitization strips scripts and other dangerous content.

Web search tool

Enable the built-in web search tool when the answer should use current information from the web.

const response = await client.responses.create({
model: 'gpt-5.5',
tools: [{ type: 'web_search' }],
include: ['web_search_call.action.sources'],
input: 'What was a major tech headline this week? Cite sources briefly.',
});
console.log(response.output_text);

Web search adds latency and tool usage cost. Use a model that supports tools.

Demo

Runnable scripts for each section live in the openai-responses-api-demo folder. Get access via code demos.