graphile-pgvector
Graphile + pgvector Integration
Integrate pgvector with PostGraphile v5 (Grafast/graphile-build-pg) so that vector(n) columns are surfaced in GraphQL as a Vector scalar, and vector similarity search is available as first-class GraphQL query fields.
When to Apply
Use this skill when:
- Adding
vector(n)columns to tables and need them visible in the PostGraphile schema - Exposing vector similarity search (cosine, L2, inner product) as GraphQL queries
- Building AI/RAG features on top of a Constructive or PostGraphile v5 stack
- Writing integration tests for pgvector-backed GraphQL APIs
Two Moving Parts
| Component | Package | What it does |
|---|---|---|
VectorCodecPlugin |
graphile-settings |
Registers a codec for the vector PG type → Vector GraphQL scalar; makes embedding columns visible |
PgVectorPlugin / PgVectorPreset |
postgraphile-plugin-pgvector |
Adds vectorSearch<Table> root query fields with query, limit, offset, metric args |
Both must be in the preset for end-to-end vector support.
VectorCodecPlugin — Expose vector Columns as GraphQL Scalar
Without this plugin, PostGraphile silently ignores vector(n) columns. The codec teaches graphile-build-pg how to parse and serialize the type.
Wire Format
PostgreSQL sends vectors as text: [0.1,0.2,...,0.768]
JavaScript representation: number[]
GraphQL scalar: Vector (serialized as [Float])
Source Location
graphile-settings/src/plugins/vector-codec.ts — included automatically in ConstructivePreset.
Pure Helper Exports (Testable Standalone)
import { fromPgVector, toPgVector, vectorScalarConfig } from 'graphile-settings/plugins/vector-codec';
// Parse PG text → JS
fromPgVector('[0.1,0.2,0.3]'); // => [0.1, 0.2, 0.3]
// Serialize JS → PG text
toPgVector([0.1, 0.2, 0.3]); // => '[0.1,0.2,0.3]'
Plugin Structure
export const VectorCodecPlugin: GraphileConfig.Plugin = {
name: 'VectorCodecPlugin',
version: '1.0.0',
// Gather phase: hook into codec discovery; register codec when type.typname === 'vector'
gather: {
hooks: {
async pgCodecs_findPgCodec(info, event) {
if (event.pgCodec) return;
const { pgType: type, serviceName } = event;
if (type.typname !== 'vector') return;
// ... resolve namespace, build codec with fromPg/toPg, set event.pgCodec
},
},
},
// Schema phase: register Vector scalar and map the codec to it
schema: {
hooks: {
init: {
before: ['PgCodecs'],
callback(_, build) {
build.registerScalarType('Vector', {}, () => vectorScalarConfig, '...');
for (const codec of Object.values(build.input.pgRegistry.pgCodecs)) {
if ((codec as any).name === 'vector') {
setGraphQLTypeForPgCodec(codec as any, 'input', 'Vector');
setGraphQLTypeForPgCodec(codec as any, 'output', 'Vector');
}
}
return _;
},
},
},
},
};
Key Hook: pgCodecs_findPgCodec
This gather hook fires for every unknown PG type. Return early if event.pgCodec is already set (already handled). Check type.typname === 'vector' and assign a codec with fromPg/toPg functions.
Key Hook: init (before PgCodecs)
Must run before PgCodecs processes the registry. Registers the Vector scalar and maps it to the codec for both input and output GraphQL roles.
PgVectorPlugin — Vector Search GraphQL Fields
Adds root query fields for similarity search across configured tables.
Installation
pnpm add postgraphile-plugin-pgvector
Usage — Preset (Recommended)
import { PgVectorPreset } from 'postgraphile-plugin-pgvector';
const preset: GraphileConfig.Preset = {
extends: [
ConstructivePreset, // includes VectorCodecPlugin
PgVectorPreset({
collections: [
{
schema: 'public',
table: 'documents',
embeddingColumn: 'embedding',
graphqlFieldName: 'vectorSearchDocument', // optional; auto-generated if omitted
maxQueryDim: 768, // optional defensive validation
},
{
schema: 'public',
table: 'contacts',
embeddingColumn: 'embedding',
},
],
defaultMetric: 'COSINE', // COSINE | L2 | IP
maxLimit: 100,
}),
],
};
Usage — Plugin Directly
import { PgVectorPlugin } from 'postgraphile-plugin-pgvector';
const plugin = PgVectorPlugin({
collections: [{ schema: 'public', table: 'documents', embeddingColumn: 'embedding' }],
});
Generated GraphQL API
type Query {
vectorSearchDocument(
query: [Float!]! # embedding vector (must match column dimensions)
limit: Int # default 10, capped at maxLimit
offset: Int # pagination
metric: VectorMetric # COSINE (default), L2, IP
): [DocumentVectorSearchResult!]!
}
enum VectorMetric { COSINE L2 IP }
type DocumentVectorSearchResult {
distance: Float! # distance score; lower = more similar for COSINE/L2
id: Int!
title: String!
# ... all other table columns
}
GraphQL Query Example
query SearchDocuments($vector: [Float!]!) {
vectorSearchDocument(query: $vector, limit: 5, metric: COSINE) {
distance
id
title
content
}
}
Distance Metrics
| Metric | Operator | Range | Notes |
|---|---|---|---|
COSINE |
<=> |
0–2 | 0 = identical; recommended for normalized embeddings |
L2 |
<-> |
0–∞ | Euclidean distance |
IP |
<#> |
−∞–0 | Negative inner product; higher (less negative) = more similar |
Codegen — Vector Type → number[]
When using @constructive-io/graphql-codegen, add the Vector scalar mapping so generated types use number[] instead of unknown:
// graphql-codegen.config.ts
export default createConfig({
targets: [{
// ...
scalars: {
Vector: 'number[]',
},
}],
});
With @constructive-io/graphql-codegen >= 4.6.0, Vector: 'number[]' is built-in to SCALAR_TS_MAP.
For 4.2.0, apply a pnpm patch adding scalars?: Record<string, string> to the config type defs.
SQL Search Functions (Alternative: Direct SQL)
For route-level RAG (bypassing GraphQL), define SQL functions instead:
CREATE OR REPLACE FUNCTION your_schema.search_contacts(
query_embedding vector(768),
match_count int DEFAULT 10
)
RETURNS TABLE (
id uuid,
entity_id text,
first_name text,
last_name text,
email text,
headline text,
similarity float
)
LANGUAGE sql STABLE
AS $$
SELECT
id, entity_id, first_name, last_name, email, headline,
1 - (embedding <=> query_embedding) AS similarity
FROM your_schema.contacts
WHERE embedding IS NOT NULL
ORDER BY embedding <=> query_embedding
LIMIT match_count;
$$;
These are surfaced automatically in the GraphQL schema (PostGraphile computed columns / functions) if the naming and return type conventions match. The VectorCodecPlugin is still required so that the vector argument type resolves correctly.
Integration Tests
Use graphile-test's getConnections with real DB. No mocks.
// __tests__/pgvector.test.ts
import { join } from 'path';
import { getConnections, seed } from 'graphile-test';
import { PgVectorPreset } from 'postgraphile-plugin-pgvector';
import { ConstructivePreset } from 'graphile-settings';
describe('PgVectorPlugin', () => {
let db: PgTestClient;
let teardown: () => Promise<void>;
let query: QueryFn;
beforeAll(async () => {
const testPreset = {
extends: [
ConstructivePreset,
PgVectorPreset({
collections: [{
schema: 'pgvector_test',
table: 'documents',
embeddingColumn: 'embedding',
graphqlFieldName: 'vectorSearchDocument',
}],
}),
],
};
const connections = await getConnections({
schemas: ['pgvector_test'],
preset: testPreset,
useRoot: true,
}, [
seed.sqlfile([join(__dirname, './setup.sql')])
]);
({ db, teardown, query } = connections);
await db.client.query('BEGIN');
});
afterAll(async () => {
try { await db.client.query('ROLLBACK'); } catch {}
await teardown();
});
beforeEach(() => db.beforeEach());
afterEach(() => db.afterEach());
it('returns vector search results ordered by distance', async () => {
// Insert test doc with pre-computed embedding
const embedding = Array(768).fill(0.1);
await db.client.query(
`INSERT INTO pgvector_test.documents (title, content, embedding)
VALUES ($1, $2, $3::vector)`,
['Test Doc', 'Hello world', `[${embedding.join(',')}]`]
);
const result = await query<{ vectorSearchDocument: Array<{ title: string; distance: number }> }>(`
query {
vectorSearchDocument(query: [${embedding.join(',')}], limit: 5, metric: COSINE) {
title
distance
}
}
`);
expect(result.data?.vectorSearchDocument).toHaveLength(1);
expect(result.data?.vectorSearchDocument[0].title).toBe('Test Doc');
expect(result.data?.vectorSearchDocument[0].distance).toBeCloseTo(0, 5);
});
});
Setup SQL
-- __tests__/setup.sql
CREATE SCHEMA IF NOT EXISTS pgvector_test;
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE pgvector_test.documents (
id SERIAL PRIMARY KEY,
title TEXT NOT NULL,
content TEXT,
embedding vector(768),
created_at TIMESTAMPTZ DEFAULT now()
);
CREATE INDEX ON pgvector_test.documents USING hnsw (embedding vector_cosine_ops);
Key graphile-test Pattern Notes
getConnections({ schemas, preset, useRoot: true }, [seed.sqlfile([...])])— no mocksgraphile-testhasMinimalPresetbaked in; pass your plugin preset asextends- SQL fixture must
CREATE EXTENSION IF NOT EXISTS vector;before creating vector columns - Use
BEGIN/ROLLBACKfor transaction-scoped test isolation (notbeforeEachschema drops)
Codec-Only Usage (No Search Fields)
If you only need vector columns visible in the schema (not search fields), just include VectorCodecPlugin in your preset:
import { VectorCodecPlugin } from 'graphile-settings/plugins/vector-codec';
// or use ConstructivePreset which includes it automatically
const preset = {
plugins: [VectorCodecPlugin],
};
After adding: embedding: number[] | null appears in all generated types, mutations accept Vector scalar, and SQL functions with vector args are surfaced in GraphQL.
Common Pitfalls
| Issue | Cause | Fix |
|---|---|---|
embedding field missing from schema |
vector type not recognized |
Add VectorCodecPlugin (or ConstructivePreset) |
Unknown type "Vector" in codegen |
Scalar not mapped | Add scalars: { Vector: 'number[]' } to codegen config |
SQL function with vector arg not surfaced |
Same as above | VectorCodecPlugin must be present |
VectorMetricEnum duplicate type error |
Multiple PgVectorPreset instances |
Create the enum once outside the loop or deduplicate by name |
withPgClient not in context |
Missing makePgService in preset |
Ensure pgServices is configured in your PostGraphile preset |
getResource returning null in init hook |
Table not in schemas list |
Add the schema to schemas in getConnections or PostGraphile config |
dimension mismatch error from pgvector |
Query vector length ≠ column dimension | Set maxQueryDim in collection config; validate before calling |
References
- Related skill:
pgvector-setup— DB schema with HNSW indexes - Related skill:
pgvector-embeddings— generating and storing embeddings - Related skill:
rag-pipeline— full RAG pipeline from embed to LLM - Related skill:
agentic-kit-rag— wiring RAG into agentic-kit chat routes - pgvector docs
- PostGraphile v5 plugin guide
- Grafast step plan API
More from constructive-io/constructive-skills
drizzle-orm
Drizzle ORM patterns for PostgreSQL schema design and queries. Use when asked to "design Drizzle schema", "write Drizzle queries", "set up Drizzle ORM", or when building type-safe database layers.
21planning-blueprinting
In-repo planning and specification system for software projects. Use when asked to "create a plan", "write a spec", "document a proposal", "blueprint a feature", or when doing architectural planning work.
20pgsql-parser-testing
Test the pgsql-parser repository (SQL parser/deparser). Use when working in the pgsql-parser repo, fixing deparser issues, running parser tests, or validating SQL round-trips. Scoped specifically to the constructive-io/pgsql-parser repository.
18constructive-graphql-codegen
Generate type-safe React Query hooks, Prisma-like ORM client, or inquirerer-based CLI from GraphQL endpoints, schema files/directories, databases, or PGPM modules using @constructive-io/graphql-codegen. Also generates documentation (README, AGENTS.md, skills/, mcp.json). Use when asked to "generate GraphQL hooks", "generate ORM", "generate CLI", "set up codegen", "generate docs", "generate skills", "export schema", or when implementing data fetching for a PostGraphile backend.
17constructive-server-config
Configure and run the Constructive GraphQL server (cnc server), GraphiQL explorer (cnc explorer), and code generation (cnc codegen). Use when asked to "start the server", "run cnc server", "start GraphQL API", "run GraphiQL", "configure API routing", "generate types", or when working with the Constructive CLI and PostGraphile.
17constructive-boilerplate-nextjs-app
Set up and develop with the Constructive App frontend boilerplate — a Next.js application with authentication, organization management, invites, members, and a GraphQL SDK. Use when scaffolding a new Constructive frontend application from the boilerplate.
17