
RAG with OpenAI Embeddings, pgvector and LangChain

June 2, 2026

Retrieval-Augmented Generation (RAG) is a practical pattern: store knowledge as embeddings, retrieve the most relevant chunks with semantic search, then generate an answer grounded in that context.

This guide shows a simple end-to-end flow with OpenAI embeddings, PostgreSQL + pgvector, and LangChain chunking.

Architecture at a glance

Prerequisites

  • OpenAI account
  • Generated API key
  • Enabled billing
  • Node.js version 26
  • PostgreSQL with pgvector extension enabled
  • npm packages: openai, langchain, pg, pgvector

What are embeddings?

Embeddings are numeric vectors that represent the semantic meaning of text. Similar text should produce vectors that are close in vector space.

In practice:

  • Convert document chunks to vectors and store them in pgvector
  • Convert a user question to a vector
  • Run a nearest-neighbor search to find the most relevant chunks
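To make "close in vector space" concrete, here is a minimal sketch that compares toy vectors with cosine similarity. The three-dimensional vectors and their labels are made-up values for illustration only; real embeddings from text-embedding-3-small are much longer.

```javascript
// Toy 3-dimensional vectors (hypothetical values, not real embeddings).
const vectors = {
  'pgvector setup': [0.9, 0.1, 0.0],
  'installing pgvector': [0.8, 0.2, 0.1],
  'baking bread': [0.0, 0.2, 0.9],
};

// Cosine similarity: dot product divided by the product of the vector lengths.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i += 1) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

const related = cosineSimilarity(vectors['pgvector setup'], vectors['installing pgvector']);
const unrelated = cosineSimilarity(vectors['pgvector setup'], vectors['baking bread']);
console.log(related > unrelated); // true
```

Texts about the same topic score close to 1, while unrelated texts score close to 0.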

OpenAI client setup

import OpenAI from 'openai';
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

Embedding one input element

Use a single string when embedding a user query.

const response = await client.embeddings.create({
  model: 'text-embedding-3-small',
  input: 'How do I connect pgvector to PostgreSQL?',
});
const queryEmbedding = response.data[0].embedding;
console.log(queryEmbedding.length);

Embedding multiple input elements

Use an array to embed multiple chunks in one API call.

const chunks = [
  'pgvector adds vector similarity search to PostgreSQL.',
  'LangChain helps split long documents into retrieval-friendly chunks.',
  'RAG retrieves context first, then asks an LLM to answer.',
];
// Named batchResponse so it does not clash with `response` from the
// single-input example when both snippets live in the same file.
const batchResponse = await client.embeddings.create({
  model: 'text-embedding-3-small',
  input: chunks,
});
// Pair each returned embedding with the chunk at the same position.
const rows = batchResponse.data.map((item, index) => ({
  text: chunks[index],
  embedding: item.embedding,
}));
console.log(rows.length); // 3
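The mapping above pairs chunks and embeddings by array position. Each item in the embeddings response also carries an index field that identifies the input it belongs to, so a slightly more defensive sketch can pair by that field instead. The helper name pairByIndex and the sample items below are hypothetical.

```javascript
// Pair each embedding with its source text using the `index` field of the response item.
function pairByIndex(texts, items) {
  return items.map((item) => {
    if (!Number.isInteger(item.index) || item.index < 0 || item.index >= texts.length) {
      throw new Error(`Unexpected embedding index: ${item.index}`);
    }
    return { text: texts[item.index], embedding: item.embedding };
  });
}

// Hypothetical response items, deliberately out of order, with tiny fake vectors.
const sampleTexts = ['first chunk', 'second chunk'];
const sampleItems = [
  { index: 1, embedding: [0.3, 0.4] },
  { index: 0, embedding: [0.1, 0.2] },
];
const paired = pairByIndex(sampleTexts, sampleItems);
console.log(paired[0].text); // 'second chunk'
```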

Chunking documents with LangChain

Chunking makes retrieval more precise. Instead of embedding one large document, split it into smaller overlapping parts. Start with chunkSize: 800 and chunkOverlap: 120, then adjust based on your document style and answer quality.

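// In recent LangChain releases the splitters ship in the separate
// @langchain/textsplitters package; adjust the import if this path is not found.
import { RecursiveCharacterTextSplitter } from 'langchain/text_splitter';

const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 800, // maximum characters per chunk
  chunkOverlap: 120, // characters shared between neighboring chunks
});
const docs = await splitter.createDocuments([
  `RAG combines retrieval and generation. Store chunks as vectors and fetch similar chunks at query time.`,
]);
console.log(docs.map((doc) => doc.pageContent));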
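To see how chunkSize and chunkOverlap interact, here is a simplified fixed-window splitter. It is only an illustration and not LangChain's algorithm, which also tries to break on separators such as paragraphs and sentences; the function name splitWithOverlap is hypothetical.

```javascript
// Simplified fixed-size splitter (NOT LangChain's algorithm): each chunk starts
// `chunkSize - chunkOverlap` characters after the previous one.
function splitWithOverlap(text, chunkSize, chunkOverlap) {
  if (chunkOverlap >= chunkSize) {
    throw new Error('chunkOverlap must be smaller than chunkSize');
  }
  const step = chunkSize - chunkOverlap;
  const chunks = [];
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // this chunk reached the end of the text
  }
  return chunks;
}

const parts = splitWithOverlap('abcdefghij', 4, 2);
console.log(parts); // [ 'abcd', 'cdef', 'efgh', 'ghij' ]
```

Each chunk repeats the last two characters of the previous one, which is what keeps a sentence that straddles a boundary retrievable from either side.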

Store embeddings in pgvector

Create a table with a vector column whose size matches the embedding model: text-embedding-3-small outputs 1536 dimensions by default.

CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE IF NOT EXISTS rag_chunks (
  id BIGSERIAL PRIMARY KEY,
  content TEXT NOT NULL,
  embedding VECTOR(1536) NOT NULL,
  source TEXT,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

Insert chunk vectors from Node.js:

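import pg from 'pg';
import pgvector from 'pgvector/pg';

const pool = new pg.Pool({ connectionString: process.env.DATABASE_URL });
// With a pool, register the vector type on each new connection so that
// vector columns are parsed into arrays when they are read back.
pool.on('connect', async (client) => {
  await pgvector.registerTypes(client);
});

// `rows` comes from the multi-input embedding example: [{ text, embedding }, ...]
for (const row of rows) {
  await pool.query(
    `INSERT INTO rag_chunks (content, embedding, source)
     VALUES ($1, $2, $3)`,
    [row.text, pgvector.toSql(row.embedding), 'notes.md']
  );
}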
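Because the column is declared as VECTOR(1536), an embedding of any other length will be rejected by the database. A small pre-insert check, sketched below, surfaces that mistake earlier with a clearer message. The helper name assertEmbeddingShape is hypothetical and the sample vector is a stand-in, not a real embedding.

```javascript
const EXPECTED_DIMENSIONS = 1536; // must match VECTOR(1536) in the table definition

// Returns the embedding unchanged if it looks safe to insert, otherwise throws.
function assertEmbeddingShape(embedding, expected = EXPECTED_DIMENSIONS) {
  if (!Array.isArray(embedding)) {
    throw new TypeError('Embedding must be an array of numbers');
  }
  if (embedding.length !== expected) {
    throw new RangeError(`Expected ${expected} dimensions, got ${embedding.length}`);
  }
  if (!embedding.every((value) => Number.isFinite(value))) {
    throw new TypeError('Embedding contains a non-finite value');
  }
  return embedding;
}

// Stand-in vector of the right length (not a real embedding).
const fakeEmbedding = new Array(EXPECTED_DIMENSIONS).fill(0.01);
console.log(assertEmbeddingShape(fakeEmbedding).length); // 1536
```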

Semantic search in pgvector

Embed the user question, then retrieve the nearest chunks by cosine distance, which pgvector exposes as the <=> operator. Cosine distance is 1 minus cosine similarity, so a lower distance means a closer semantic match. Top-k is the number of nearest chunks returned (in this query, k = 4, set by LIMIT 4). You can also apply a distance threshold (for example 0.4) to discard weak matches. A cutoff somewhere between 0.35 and 0.45 is a reasonable starting point for cosine distance; tune it with real questions from your domain.

const searchResult = await pool.query(
  `SELECT id, content, source, embedding <=> $1::vector AS distance
   FROM rag_chunks
   ORDER BY embedding <=> $1::vector
   LIMIT 4`,
  [pgvector.toSql(queryEmbedding)]
);
const contextChunks = searchResult.rows.map((row) => row.content);
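As a mental model for what the ORDER BY ... LIMIT query computes, here is an in-memory sketch of the same ranking: compute the cosine distance from the query vector to every row, sort ascending, keep the first k. The sample rows and two-dimensional vectors are made up, and a real database avoids this full scan when a vector index is available.

```javascript
// Cosine distance = 1 - cosine similarity.
function cosineDistance(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i += 1) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Brute-force top-k: score every row, sort by distance, keep the k nearest.
function topK(queryVector, rows, k) {
  return rows
    .map((row) => ({ ...row, distance: cosineDistance(row.embedding, queryVector) }))
    .sort((x, y) => x.distance - y.distance)
    .slice(0, k);
}

// Hypothetical rows with toy 2-dimensional embeddings.
const sampleRows = [
  { id: 1, content: 'about vectors', embedding: [1, 0] },
  { id: 2, content: 'about cooking', embedding: [0, 1] },
  { id: 3, content: 'mostly vectors', embedding: [0.9, 0.1] },
];
const nearest = topK([1, 0], sampleRows, 2);
console.log(nearest.map((row) => row.id)); // [ 1, 3 ]
```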

Threshold filtering example:

const DISTANCE_THRESHOLD = 0.4;
const filteredChunks = searchResult.rows
  .filter((row) => Number(row.distance) <= DISTANCE_THRESHOLD)
  .map((row) => row.content);

If no chunks pass the threshold, skip answer generation and return a fallback message:

if (filteredChunks.length === 0) {
  console.log('I do not have enough context to answer this.');
  process.exit(0);
}
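Exiting the process works for a one-off script. Inside a long-running service, one option is to return the fallback to the caller instead. The sketch below folds the threshold filter and the fallback into one function; the name selectContext, the result shape, and the sample rows are assumptions for illustration.

```javascript
const FALLBACK_MESSAGE = 'I do not have enough context to answer this.';

// Returns { ok: true, chunks } when at least one row passes the threshold,
// or { ok: false, message } so the caller can respond without calling the LLM.
function selectContext(rows, threshold) {
  const chunks = rows
    .filter((row) => Number(row.distance) <= threshold)
    .map((row) => row.content);
  return chunks.length > 0
    ? { ok: true, chunks }
    : { ok: false, message: FALLBACK_MESSAGE };
}

// Hypothetical query result rows.
const sampleResultRows = [
  { content: 'close match', distance: 0.21 },
  { content: 'weak match', distance: 0.62 },
];
console.log(selectContext(sampleResultRows, 0.4)); // { ok: true, chunks: [ 'close match' ] }
console.log(selectContext(sampleResultRows, 0.1).ok); // false
```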

Generate answer from retrieved context

Use retrieved chunks as grounded context for the final model call.

// Build the prompt from the chunks that passed the distance threshold.
const context = filteredChunks.join('\n\n---\n\n');
const answer = await client.responses.create({
  model: 'gpt-5.5',
  instructions:
    'Answer only from the provided context. If context is insufficient, respond with: I do not have enough context to answer this.',
  input: `Context:\n${context}\n\nQuestion: How does pgvector semantic search work?`,
});
console.log(answer.output_text);
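The question is hardcoded in the example above. A small helper can assemble the input string for any question and number the chunks so they are easier to tell apart in the prompt. This is one possible layout, not a required format; buildInput is a hypothetical name.

```javascript
// Builds the `input` string: numbered context chunks followed by the question.
function buildInput(chunks, question) {
  const context = chunks
    .map((chunk, i) => `[${i + 1}] ${chunk.trim()}`)
    .join('\n\n---\n\n');
  return `Context:\n${context}\n\nQuestion: ${question.trim()}`;
}

const input = buildInput(
  ['pgvector adds vector similarity search to PostgreSQL.', 'RAG retrieves context first.'],
  'How does pgvector semantic search work?'
);
console.log(input);
```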

Demo

Runnable scripts for this post live in the rag-openai-embeddings-pgvector-demo folder in the private demos repository. Get access via code demos.

2023

Postgres and Redis containers with Docker Compose

February 26, 2023

Docker Compose makes it easy to spin up database containers without installing the databases locally. This post covers the setup for Postgres and Redis images.

Prerequisites

  • Docker Compose installed

Configuration

The following configuration spins up Postgres and Redis containers with UI tools (Pgweb and Redis Commander).

Connection strings for Postgres and Redis are postgres://username:password@localhost:5435/database-name and redis://localhost:6379, respectively.

Pgweb and Redis Commander are available at http://localhost:8085 and http://localhost:8081, respectively.

# docker-compose.yml
version: '3.8'
services:
  postgres:
    image: postgres:alpine
    environment:
      POSTGRES_DB: database-name
      POSTGRES_PASSWORD: password
      POSTGRES_USER: username
    ports:
      - 5435:5432
    restart: on-failure:3
  pgweb:
    image: sosedoff/pgweb
    depends_on:
      - postgres
    environment:
      PGWEB_DATABASE_URL: postgres://username:password@postgres:5432/database-name?sslmode=disable
    ports:
      - 8085:8081
    restart: on-failure:3
  redis:
    image: redis:latest
    command: redis-server
    volumes:
      - redis:/var/lib/redis
      - redis-config:/usr/local/etc/redis/redis.conf
    ports:
      - 6379:6379
    networks:
      - redis-network
  redis-commander:
    image: rediscommander/redis-commander:latest
    environment:
      - REDIS_HOSTS=local:redis:6379
      - HTTP_USER=root
      - HTTP_PASSWORD=qwerty
    ports:
      - 8081:8081
    networks:
      - redis-network
    depends_on:
      - redis
volumes:
  redis:
  redis-config:
networks:
  redis-network:
    driver: bridge

Run the following command to spin up the containers.

docker-compose up
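
Once the containers are up, the connection strings quoted earlier can be sanity-checked from Node.js before they are handed to a database client. The sketch below only parses them with the built-in URL class and does not open a connection; describeConnection is a hypothetical helper.

```javascript
// Parse a connection string into its parts using the built-in URL class.
function describeConnection(connectionString) {
  const url = new URL(connectionString);
  return {
    protocol: url.protocol.replace(':', ''),
    host: url.hostname,
    port: Number(url.port),
    user: decodeURIComponent(url.username),
    database: url.pathname.replace(/^\//, ''),
  };
}

const postgresInfo = describeConnection('postgres://username:password@localhost:5435/database-name');
const redisInfo = describeConnection('redis://localhost:6379');
console.log(postgresInfo.port, postgresInfo.database); // 5435 database-name
console.log(redisInfo.host, redisInfo.port); // localhost 6379
```

The Postgres port is 5435 on the host because the Compose file maps it to 5432 inside the container.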

2021

Postgres and Redis container services for e2e tests in GitHub Actions

September 8, 2021

End-to-end tests should use a real database connection, and provisioning a container service for the Postgres database can be automated with GitHub Actions. The connection string for the newly created database can be passed as an environment variable in the step that runs the e2e tests. The same goes for the Redis instance.

# ...
jobs:
  build:
    # Container must run in Linux-based operating systems
    runs-on: ubuntu-latest
    # Image from Docker Hub
    container: node:20.9.0-alpine3.17
    # ...
    strategy:
      matrix:
        # ...
        database-name:
          - e2e-testing-db
        database-user:
          - username
        database-password:
          - password
        database-host:
          - postgres
        database-port:
          - 5432
        redis-host:
          - redis
        redis-port:
          - 6379
    services:
      postgres:
        image: postgres:latest
        env:
          POSTGRES_DB: ${{ matrix.database-name }}
          POSTGRES_USER: ${{ matrix.database-user }}
          POSTGRES_PASSWORD: ${{ matrix.database-password }}
        ports:
          - ${{ matrix.database-port }}:${{ matrix.database-port }}
        # Set health checks to wait until postgres has started
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
      redis:
        image: redis
        # Set health checks to wait until redis has started
        options: >-
          --health-cmd "redis-cli ping"
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
    steps:
      # ...
      - run: npm run test:e2e
        env:
          DATABASE_URL: postgres://${{ matrix.database-user }}:${{ matrix.database-password }}@${{ matrix.database-host }}:${{ matrix.database-port }}/${{ matrix.database-name }}
          REDIS_URL: redis://${{ matrix.redis-host }}:${{ matrix.redis-port }}
# ...
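
On the test side, the variables set in that step can be read once during setup, failing fast with a clear message if the workflow did not provide them. A minimal sketch follows; requireEnv is a hypothetical helper, and the sample object mirrors the values the matrix above would produce.

```javascript
// Read a required variable from an env-like object (defaults to process.env).
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (value === undefined || value === '') {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Hypothetical values mirroring what the workflow step would inject.
const ciEnv = {
  DATABASE_URL: 'postgres://username:password@postgres:5432/e2e-testing-db',
  REDIS_URL: 'redis://redis:6379',
};
console.log(requireEnv('DATABASE_URL', ciEnv));
console.log(requireEnv('REDIS_URL', ciEnv));
```

In the real test setup the second argument would be omitted so the values come from process.env.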
2020