Skip to content

Infrastructure

Substrate's infrastructure layer is deliberately spartan: one Postgres instance (with AGE + pgvector), one Keycloak instance, and two llama.cpp workers for local AI. Prod TLS is handled upstream by home-stack's nginx-proxy-manager — substrate bundles no reverse proxy of its own.


Overview

Component Technology Host port Purpose
Primary DB PostgreSQL 16 5432 Relational, graph, embeddings, SSE
Graph extension Apache AGE Cypher inside Postgres
Vector extension pgvector 896-dim embeddings
Identity Keycloak 26 8080 OIDC, JWT
AI inference lazy-lamacpp 8101 (embeddings), 8102 (dense) Local LLM serving
DB admin pgadmin 4 5050 DB introspection

PostgreSQL

Role

PostgreSQL is the single source of truth for all Substrate data: - Relational metadata (sources, syncs, schedules, issues) - Vector embeddings (pgvector) - Graph topology (Apache AGE) - SSE replay buffer (sse_events table)

Databases

The Postgres instance hosts two logical databases:

Database Owner Purpose
substrate_graph substrate_graph All substrate data (graph, chunks, embeddings, SSE)
keycloak keycloak Keycloak's own state

There is no substrate_ingestion database. Ingestion and graph share substrate_graph.

Extensions

CREATE EXTENSION IF NOT EXISTS age;      -- Cypher
CREATE EXTENSION IF NOT EXISTS vector;   -- pgvector

Connection pools

Every service uses asyncpg pools against substrate_graph:

  • Graph service — default pool sizing
  • Ingestion service — sized for heavy background writes, configured via asyncpg defaults; uses UNWIND … batching via graph_writer.py::write_age_nodes/edges (batches of 500 with per-row fallback) to avoid saturating the pool
  • Gateway — a small pool used only for the SSE replay path (sse_endpoint.py), which does SELECT … FROM sse_events WHERE id > $last, then LISTEN substrate_sse

Every pool registers an init callback that runs LOAD 'age' and sets server_settings={"search_path":"ag_catalog,public"} so Cypher queries work on any pooled connection.

For high-scale prod, PgBouncer in front of Postgres is viable but not currently part of the default compose.


Apache AGE

Role

Apache AGE adds Cypher graph queries to Postgres as an extension. Substrate uses it instead of running a separate Neo4j server.

Graph

SELECT * FROM ag_catalog.create_graph('substrate');

The graph is named substrate and holds :File vertices plus depends_on and defines edges. See docs/architecture/data-model.md for the schema.

Cypher execution

SELECT * FROM cypher('substrate', $$
  MATCH (a:File)-[r:depends_on]->(b:File)
  WHERE r.sync_id IN ['uuid1', 'uuid2']
  RETURN a.file_id, b.file_id, r.weight
$$) AS (result agtype);

All pool connections run LOAD 'age' on init (and search_path is set via server_settings, not per-query, because RESET ALL on pool release wipes in-session SETs).

AGE expression indexes (migration V5) make MATCH (f:File {file_id: '...'}) lookups logarithmic against the File vertex table.


pgvector

Role

Stores 896-dimensional embeddings produced by jina-code-embeddings-0.5b.

Columns

Table Column Type
file_embeddings embedding vector(896)
content_chunks embedding vector(896)

Search query

SELECT id, name, file_path, embedding <=> $1 AS distance
FROM file_embeddings
WHERE type = 'source'
ORDER BY embedding <=> $1
LIMIT 10;

The <=> operator computes cosine distance. The graph service uses this for /api/graph/search; the ingestion service only writes — no reads of embedding columns on the ingestion side.

Dim migrations are tracked in services/graph/migrations/postgres/ (V4 → V7 → V8 → V9 → V10, currently at 896-dim). A startup guard (services/graph/src/startup.py::check_embedding_dim) verifies the column's declared dimension matches EMBEDDING_DIM and fails the graph service at boot on mismatch.


Keycloak

Role

Identity provider for OIDC authentication and JWT issuance.

Realm

  • Realm: substrate (imported from ops/infra/keycloak/substrate-realm.json, which is rendered from the committed template by scripts/render-realm.py)
  • Frontend client: substrate-frontend — public, PKCE-S256
  • Gateway client: substrate-gateway — confidential, service-accounts enabled (secret comes from KC_GATEWAY_CLIENT_SECRET in the active env file)
  • Issuer: ${KC_HOSTNAME}/realms/substrate (e.g. http://localhost:8080/realms/substrate in dev, https://auth.<domain>/realms/substrate in prod)
  • JWKS endpoint: ${KC_HOSTNAME}/realms/substrate/protocol/openid-connect/certs

Command mode

  • Dev: start-dev --import-realm with KC_HOSTNAME_STRICT=false
  • Prod: start --import-realm with KC_HOSTNAME_STRICT=true and KC_PROXY_HEADERS=xforwarded so NPM-forwarded X-Forwarded-Proto: https is honored

Token characteristics

  • Algorithm: RS256
  • JWKS cached in the gateway for 5 minutes with background refresh
  • Audience verification is disabled in the gateway (verify_aud=False) — issuer + signature + expiry are enforced

lazy-lamacpp (local AI inference)

Runs on the host via systemd-user units, not inside compose. Substrate's containers reach it via host.docker.internal — the one justified use of that host alias.

Models currently served

Role Model Port Notes
embeddings jina-code-embeddings-0.5b Q8_0 8101 896-dim, 32 k context
dense Qwen3.5-2B Q8_0 8102 60 k context, used for enriched summaries

Additional model roles (sparse, reranker, coding) are defined under ops/llm/lazy-lamacpp/config/models/ but are on-demand only — the embeddings + dense pair is required concurrently.

Starting / stopping / status

cd ops/llm/lazy-lamacpp
make start MODEL=embeddings
make start MODEL=dense
make status MODEL=embeddings
make status-all
make stop MODEL=embeddings

The top-level Substrate Makefile does not re-export these targets — manage lazy-lamacpp directly from its own Makefile.

API compatibility

Both ports expose OpenAI-compatible endpoints:

  • POST /v1/embeddings
  • POST /v1/chat/completions

VRAM budget

Both workers must fit simultaneously in the host's 4 GB VRAM (Quadro P1000 Mobile):

  • Embeddings Q8_0 weights ≈ 600 MiB + KV cache with Q8_0 quantization ≈ 500 MiB → ~1.1 GB
  • Dense Q8_0 weights ≈ 1.9 GB + 60 k-token Q8_0 KV cache ≈ 1.1 GB → ~2.85 GB
  • Combined ≈ 4 GB with ~25 MiB headroom

See ops/llm/lazy-lamacpp/AGENTS.md for the full accounting and the rationale behind simultaneous GPU residency.


pgadmin

Deployed in both modes. Container listens on 80, published on host 5050. Servers pre-registered via ops/infra/pgadmin/servers.json:

  • substrate_graph — the main substrate DB
  • keycloak — Keycloak's DB
  • postgres (superuser) — full admin

In prod, home-stack's NPM exposes this at pgadmin.<domain> (typically behind an IP allowlist at the NPM layer).


Resource requirements

Development

Component CPU Memory Storage
PostgreSQL 2 cores 2 GB 20 GB
Keycloak 1 core 1 GB 5 GB
lazy-lamacpp 2 cores 4 GB (VRAM shared 4 GB) 10 GB

Production

Component CPU Memory Storage
PostgreSQL 4 cores 8 GB 200 GB SSD
Keycloak 2 cores 2 GB 20 GB
lazy-lamacpp 4 cores 8 GB (VRAM 6+ GB) 20 GB

Health checks

# Postgres
pg_isready -U postgres -h localhost

# Keycloak
curl http://localhost:8080/health/ready    # uses port 9000 inside the container;
                                           # compose.yaml's healthcheck uses a
                                           # raw TCP probe to bypass strict hostname

# lazy-lamacpp
curl http://localhost:8101/v1/models
curl http://localhost:8102/v1/models

# Full substrate sweep
make doctor

Backup strategy

# Single-DB substrate backup
pg_dump -h localhost -U substrate_graph substrate_graph > substrate_graph.sql

# Keycloak state
pg_dump -h localhost -U keycloak keycloak > keycloak.sql

For prod, prefer WAL archiving + point-in-time recovery at the Postgres layer, managed by whatever runs the home-stack Postgres container.


Security

Network

  • All inter-service traffic rides the substrate_internal Docker bridge
  • Only the debug ports (3535 / 8080 / 5050 / 8180 / 8181 / 8182 / 5432) are published on the host
  • Prod: TLS terminated at home-stack's NPM; substrate sees plain HTTP on internal ports with X-Forwarded-Proto: https headers forwarded by NPM

Data at rest

  • PostgreSQL: filesystem-level encryption recommended (LUKS / dm-crypt on the host)
  • No sensitive data in lazy-lamacpp model caches (models are public GGUFs)
  • .env.local and .env.prod are gitignored and live on local disk only

Access control

  • Separate DB users per logical database (substrate_graph, keycloak)
  • Minimal privileges (no superuser access from applications)
  • Keycloak realm import driven by a gitignored rendered file (template + template variables live in git, secrets do not)