LLM integration with OpenAI Responses API
May 31, 2026
Large language models (LLMs) understand and generate text from prompts. OpenAI exposes models through the Responses API. The official openai npm package is the practical way to call it from Node.js. This post covers common patterns beyond a single prompt string.
Prerequisites
- OpenAI account
- Generated API key
- Enabled billing
- Node.js version 26
- openai package installed (npm i openai)
Client setup
Create a client with your API key (read from the environment in production).
import OpenAI from 'openai';

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
The same SDK can target other hosts that implement a compatible API by setting baseURL and apiKey:
const client = new OpenAI({
  apiKey: process.env.LLM_API_KEY,
  baseURL: 'https://your-gateway.example/v1',
});
Azure OpenAI uses the SDK's AzureOpenAI client class instead. Many third-party gateways support Chat Completions only; the examples below use client.responses.*, so confirm your provider supports the Responses API (especially for tools like web search).
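One way to keep the default host and a gateway behind a single code path is to build the constructor options from environment variables. The sketch below is only an illustration: buildClientOptions is a hypothetical helper, and LLM_BASE_URL is an assumed variable name that the SDK does not read on its own.

```javascript
// Hypothetical helper: builds options for `new OpenAI(...)` from an env-like object.
// LLM_BASE_URL is an assumed variable name for this sketch, not an SDK convention.
function buildClientOptions(env) {
  const apiKey = env.LLM_API_KEY ?? env.OPENAI_API_KEY;
  if (!apiKey) {
    throw new Error('Set LLM_API_KEY or OPENAI_API_KEY before creating the client.');
  }
  const options = { apiKey };
  // Only override the host when a gateway URL is configured.
  if (env.LLM_BASE_URL) {
    options.baseURL = env.LLM_BASE_URL;
  }
  return options;
}

// Usage (requires the openai package):
// const client = new OpenAI(buildClientOptions(process.env));
```

Failing early on a missing key tends to give a clearer error than a rejected request later on.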
Basic integration
Pass a string as input and read output_text from the response.
const response = await client.responses.create({
  model: 'gpt-5.5',
  input: 'Write a one-sentence bedtime story about a unicorn.',
});
console.log(response.output_text);
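output_text is a convenience that joins the text parts of the response. To make that concrete, here is a rough sketch of gathering the same text by hand from response.output. The helper name collectOutputText is hypothetical, and the item shapes are a simplified assumption about the payload (message items holding output_text content parts); the fake response stands in for a real API result.

```javascript
// Hypothetical helper: gathers text from message items in `response.output`.
// Assumes the simplified shape { type: 'message', content: [{ type: 'output_text', text }] }.
function collectOutputText(response) {
  const parts = [];
  for (const item of response.output ?? []) {
    if (item.type !== 'message') continue; // skip tool calls, reasoning items, etc.
    for (const part of item.content ?? []) {
      if (part.type === 'output_text') parts.push(part.text);
    }
  }
  return parts.join('');
}

// Stand-in for an API response so the sketch runs offline.
const fakeResponse = {
  output: [
    { type: 'reasoning', summary: [] },
    {
      type: 'message',
      role: 'assistant',
      content: [
        { type: 'output_text', text: 'Hello, ' },
        { type: 'output_text', text: 'world.' },
      ],
    },
  ],
};
console.log(collectOutputText(fakeResponse)); // Hello, world.
```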
System prompt
Use the top-level instructions parameter for behavior that should stay stable across requests (tone, format, role). The model gives these instructions priority over what the input says.
const response = await client.responses.create({
  model: 'gpt-5.5',
  instructions: 'Reply in one short sentence. Use plain language.',
  input: 'Explain what an LLM is.',
});
console.log(response.output_text);
Few-shot prompting
Pass prior turns as an input array with user and assistant roles, then the new user message. Keep task rules in instructions.
const response = await client.responses.create({
  model: 'gpt-5.5',
  instructions:
    'Classify sentiment as exactly one word: positive, negative, or neutral.',
  input: [
    { role: 'user', content: 'I love this!' },
    { role: 'assistant', content: 'positive' },
    { role: 'user', content: 'This is awful.' },
    { role: 'assistant', content: 'negative' },
    { role: 'user', content: 'It is fine I guess.' },
  ],
});
console.log(response.output_text);
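When the examples live in data rather than in code, the input array can be generated. A minimal sketch, assuming a hypothetical helper buildFewShotInput and an assumed example shape of { text, label }:

```javascript
// Hypothetical helper: turns labeled examples into alternating user/assistant
// turns and appends the new user message last.
function buildFewShotInput(examples, query) {
  const turns = examples.flatMap(({ text, label }) => [
    { role: 'user', content: text },
    { role: 'assistant', content: label },
  ]);
  return [...turns, { role: 'user', content: query }];
}

const fewShotInput = buildFewShotInput(
  [
    { text: 'I love this!', label: 'positive' },
    { text: 'This is awful.', label: 'negative' },
  ],
  'It is fine I guess.',
);
console.log(fewShotInput.length); // 5
```

The result can be passed as input to client.responses.create, with the task rules still kept in instructions.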
Streaming
Set stream: true and handle response.output_text.delta events for incremental text.
const stream = await client.responses.create({
  model: 'gpt-5.5',
  input: 'List three colors.',
  stream: true,
});
for await (const event of stream) {
  if (event.type === 'response.output_text.delta') {
    process.stdout.write(event.delta);
  }
}
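Often you want both the incremental display and the final text. The sketch below accumulates deltas while forwarding each one to a callback. collectStreamText and fakeStream are hypothetical names; the fake generator stands in for the SDK stream so the example runs without a network call.

```javascript
// Hypothetical helper: consumes a stream of events and returns the full text.
// `onDelta` is an optional callback for showing text as it arrives.
async function collectStreamText(stream, onDelta = () => {}) {
  let text = '';
  for await (const event of stream) {
    if (event.type === 'response.output_text.delta') {
      text += event.delta;
      onDelta(event.delta);
    }
  }
  return text;
}

// Stand-in for the SDK stream; only the delta events carry text.
async function* fakeStream() {
  yield { type: 'response.created' };
  yield { type: 'response.output_text.delta', delta: 'red, ' };
  yield { type: 'response.output_text.delta', delta: 'green, blue' };
  yield { type: 'response.completed' };
}

collectStreamText(fakeStream()).then((text) => console.log(text)); // red, green, blue
```

With a real stream, passing (delta) => process.stdout.write(delta) as onDelta would reproduce the loop above while keeping the full text for later use.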
Structured output with JSON schema
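Constrain the model to JSON matching your schema via text.format. With strict: true, the output is constrained to the schema; strict mode requires every property to be listed in required and additionalProperties to be false, as in the example below.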
const response = await client.responses.create({
  model: 'gpt-5.5',
  input: 'The film Inception was directed by Christopher Nolan.',
  text: {
    format: {
      type: 'json_schema',
      name: 'movie_summary',
      strict: true,
      schema: {
        type: 'object',
        properties: {
          title: { type: 'string' },
          director: { type: 'string' },
        },
        required: ['title', 'director'],
        additionalProperties: false,
      },
    },
  },
});
const data = JSON.parse(response.output_text);
console.log(data.title, data.director);
For typed parsing with Zod, you can use client.responses.parse() instead of JSON.parse.
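If you stay with plain JSON.parse, a small guard can turn malformed or truncated output into a readable error instead of an unhandled exception. This is a sketch only; parseMovieSummary is a hypothetical helper that checks the two fields required by the movie_summary schema above, and it does not replace full schema validation.

```javascript
// Hypothetical helper: parses the model's text and checks the two fields
// that the movie_summary schema requires.
function parseMovieSummary(outputText) {
  let data;
  try {
    data = JSON.parse(outputText);
  } catch (err) {
    throw new Error(`Model output was not valid JSON: ${err.message}`);
  }
  for (const key of ['title', 'director']) {
    // Optional chaining also covers a parsed value of null.
    if (typeof data?.[key] !== 'string') {
      throw new Error(`Missing or non-string field: ${key}`);
    }
  }
  return { title: data.title, director: data.director };
}

const summary = parseMovieSummary(
  '{"title":"Inception","director":"Christopher Nolan"}',
);
console.log(summary.title, summary.director); // Inception Christopher Nolan
```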
Web search tool
Enable the built-in web search tool when the answer should use current information from the web.
const response = await client.responses.create({
  model: 'gpt-5.5',
  tools: [{ type: 'web_search' }],
  include: ['web_search_call.action.sources'],
  input: 'What was a major tech headline this week? Cite sources briefly.',
});
console.log(response.output_text);
Web search adds latency and tool usage cost. Use a model that supports tools.
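To show sources separately from the answer text, the citations attached to the output can be collected into a list. The sketch below assumes annotations shaped like { type: 'url_citation', url, title } on output_text parts, which is a simplified view of the payload; extractCitations is a hypothetical helper, and the fake response with example.com URLs is placeholder data, not real search output.

```javascript
// Hypothetical helper: collects unique cited URLs from url_citation annotations.
// Assumes annotations shaped like { type: 'url_citation', url, title }.
function extractCitations(response) {
  const seen = new Map();
  for (const item of response.output ?? []) {
    if (item.type !== 'message') continue; // skip web_search_call items
    for (const part of item.content ?? []) {
      for (const note of part.annotations ?? []) {
        if (note.type === 'url_citation' && !seen.has(note.url)) {
          seen.set(note.url, { url: note.url, title: note.title });
        }
      }
    }
  }
  return [...seen.values()];
}

// Placeholder response so the sketch runs offline.
const fakeSearchResponse = {
  output: [
    { type: 'web_search_call', status: 'completed' },
    {
      type: 'message',
      content: [
        {
          type: 'output_text',
          text: 'Example headline.',
          annotations: [
            { type: 'url_citation', url: 'https://example.com/a', title: 'A' },
            { type: 'url_citation', url: 'https://example.com/a', title: 'A' },
            { type: 'url_citation', url: 'https://example.com/b', title: 'B' },
          ],
        },
      ],
    },
  ],
};
console.log(extractCitations(fakeSearchResponse).length); // 2
```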