Example Configurations
Complete self-hosted configuration examples for common deployment setups
These examples show complete .env and models.yaml configurations for common self-hosted setups. Each example assumes you have already run engrammic selfhost and have a ~/.engrammic/ directory with the compose file and a .env.
For the full variable reference, see Configuration. For Ollama-specific wiring, see Running Models Locally.
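Before adapting any of the examples below, it can help to confirm that your .env defines the variables marked "Required core". The following is a minimal sketch in POSIX shell, assuming the file lives at ~/.engrammic/.env as described above. The function name missing_env_keys and the ENV_FILE variable are hypothetical helpers for illustration, not part of the engrammic CLI.

```shell
#!/bin/sh
# Hypothetical helper: print each key that has no non-empty KEY=value line
# in the given env file. Not part of engrammic itself.
missing_env_keys() {
  env_file="$1"
  shift
  for key in "$@"; do
    # Match "KEY=" followed by at least one character at the start of a line.
    if ! grep -Eq "^${key}=.+" "$env_file" 2>/dev/null; then
      echo "$key"
    fi
  done
}

# Keys labelled "Required core" in the examples on this page.
# ENV_FILE is an assumed override; the default path comes from the text above.
missing_env_keys "${ENV_FILE:-$HOME/.engrammic/.env}" ENGRAMMIC_LICENSE_KEY POSTGRES_PASSWORD
```

The call prints nothing when both keys are set. Any key it prints is missing or empty, and a missing file reports every key.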
Ollama Only (Fully Local)
No cloud API keys. Both embeddings and LLM synthesis run through a local Ollama instance. Synthesis is active as long as Ollama is reachable.
This setup gives you full storage and recall with no data leaving your machine. Synthesis quality depends on the local model you choose. Reranking is not available without a cloud provider.
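Because synthesis depends on Ollama being reachable, a quick check from the host before starting the stack can save some debugging. This sketch assumes Ollama is running on the host at its default port (11434) and queries its /api/tags endpoint, which lists locally pulled models. The function name ollama_reachable and the OLLAMA_URL variable are hypothetical. Containers reach the same server through host.docker.internal, as in the .env below.

```shell
#!/bin/sh
# Hypothetical helper: succeed if an Ollama server answers at the given base URL.
ollama_reachable() {
  # /api/tags lists locally pulled models; -f makes curl fail on HTTP errors.
  curl -fsS --max-time 3 "$1/api/tags" >/dev/null 2>&1
}

# From the host, Ollama listens on localhost:11434 by default.
# OLLAMA_URL is an assumed override for this sketch only.
if ollama_reachable "${OLLAMA_URL:-http://localhost:11434}"; then
  echo "Ollama is reachable"
else
  echo "Ollama is not reachable; start it with 'ollama serve'"
fi
```

A successful check from the host does not guarantee that the container can resolve host.docker.internal, so treat it as a first step only.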
# Required core
ENGRAMMIC_LICENSE_KEY=ENGR_your_key_here
POSTGRES_PASSWORD=your-secure-password
# Embeddings via Ollama
EMBEDDING_MODEL=ollama/nomic-embed-text
EMBEDDING_DIMENSIONS=768
OLLAMA_API_BASE=http://host.docker.internal:11434
# Reranking — not available without a cloud provider
RERANKING__ENABLED=false
# LLM via Ollama (also needs models.yaml override)
OLLAMA_BASE_URL=http://host.docker.internal:11434
# Telemetry
TELEMETRY__ENABLED=false

# models.yaml
default_tier: self_hosted
tiers:
  self_hosted:
    reasoning:
      provider: ollama
      model: llama3.2
    fast:
      provider: ollama
      model: llama3.2
    query_expander:
      provider: ollama
      model: llama3.2

Pull the required models before starting:
ollama pull nomic-embed-text
ollama pull llama3.2

Place the models.yaml override at ~/.engrammic/config/models.yaml, then start the stack:
docker compose up -d

OpenAI
Embedding and LLM calls go through OpenAI. Embeddings use text-embedding-3-small. Reranking uses Cohere, because OpenAI does not offer a reranking API.
# Required core
ENGRAMMIC_LICENSE_KEY=ENGR_your_key_here
POSTGRES_PASSWORD=your-secure-password
# Embeddings via OpenAI
EMBEDDING_MODEL=openai/text-embedding-3-small
EMBEDDING_DIMENSIONS=1536
OPENAI_API_KEY=sk-...
# Reranking via Cohere
RERANKING__ENABLED=true
RERANKING__PROVIDER=cohere
RERANKING__MODEL=rerank-english-v3.0
COHERE_API_KEY=...
# Telemetry
TELEMETRY__ENABLED=false

# models.yaml
default_tier: self_hosted
tiers:
  self_hosted:
    reasoning:
      provider: openai
      model: gpt-4o
    fast:
      provider: openai
      model: gpt-4o-mini
    query_expander:
      provider: openai
      model: gpt-4o-mini

If you do not have a Cohere key, set RERANKING__ENABLED=false. Recall still works, but results are not reranked.
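If you start from the OpenAI example and later drop the Cohere key, the flag can be flipped without hand-editing the file. This is a minimal sketch, not an engrammic command. The function name disable_reranking is hypothetical, and it only rewrites an existing RERANKING__ENABLED line, leaving every other variable untouched.

```shell
#!/bin/sh
# Hypothetical helper: rewrite RERANKING__ENABLED to false in the given env file.
disable_reranking() {
  env_file="$1"
  tmp_file="$(mktemp)"
  # Write to a temp file first; in-place sed flags differ between GNU and BSD.
  # Copy the result back with cat so the env file keeps its permissions.
  sed 's/^RERANKING__ENABLED=.*/RERANKING__ENABLED=false/' "$env_file" > "$tmp_file" &&
    cat "$tmp_file" > "$env_file"
  status=$?
  rm -f "$tmp_file"
  return "$status"
}
```

For example, disable_reranking ~/.engrammic/.env followed by docker compose up -d should apply the change, assuming the compose file reads its settings from that .env.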
Mixed (Local Embeddings, Cloud LLM)
Local embeddings keep your stored content on-device, and the cloud LLM handles synthesis only. This is a practical balance between privacy and synthesis quality.
# Required core
ENGRAMMIC_LICENSE_KEY=ENGR_your_key_here
POSTGRES_PASSWORD=your-secure-password
# Embeddings via Ollama (local)
EMBEDDING_MODEL=ollama/nomic-embed-text
EMBEDDING_DIMENSIONS=768
OLLAMA_API_BASE=http://host.docker.internal:11434
# Reranking disabled (no cloud embedding = no cloud reranker)
RERANKING__ENABLED=false
# LLM via Anthropic (cloud)
ANTHROPIC_API_KEY=sk-ant-...
# Telemetry
TELEMETRY__ENABLED=false

# models.yaml
default_tier: self_hosted
tiers:
  self_hosted:
    reasoning:
      provider: anthropic
      model: claude-sonnet-4-5
    fast:
      provider: anthropic
      model: claude-haiku-4-5
    query_expander:
      provider: anthropic
      model: claude-haiku-4-5

Pull the embedding model before starting:
ollama pull nomic-embed-text

Vertex AI (GCP-Native)
Full GCP stack. Embeddings and reranking via Vertex AI, LLM via Gemini. Requires a GCP project with the Vertex AI API enabled and credentials available to the container.
# Required core
ENGRAMMIC_LICENSE_KEY=ENGR_your_key_here
POSTGRES_PASSWORD=your-secure-password
# Embeddings via Vertex AI
EMBEDDING_MODEL=vertex_ai/text-embedding-005
EMBEDDING_DIMENSIONS=768
VERTEX_PROJECT=my-gcp-project
VERTEX_LOCATION=us-central1
# Reranking via Vertex AI
RERANKING__ENABLED=true
RERANKING__PROVIDER=vertex
RERANKING__MODEL=semantic-ranker-default@latest
# Telemetry
TELEMETRY__ENABLED=false

# models.yaml
default_tier: self_hosted
tiers:
  self_hosted:
    reasoning:
      provider: vertex_ai
      model: gemini-2.0-flash-001
    fast:
      provider: vertex_ai
      model: gemini-2.0-flash-lite-001
    query_expander:
      provider: vertex_ai
      model: gemini-2.0-flash-lite-001

Vertex AI authentication uses Application Default Credentials (ADC). You must mount a service account key or configure Workload Identity so the container can reach the Vertex API. The simplest local approach is to run gcloud auth application-default login on the host and mount ~/.config/gcloud into the container.
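For the local approach, it is worth confirming that the credentials file exists on the host before mounting the directory. On Linux and macOS, gcloud auth application-default login writes application_default_credentials.json into the gcloud config directory (~/.config/gcloud unless CLOUDSDK_CONFIG points elsewhere). The sketch below only checks for that file. The function name adc_file_present is hypothetical, and how the directory is mounted depends on your compose file, which is not shown here.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a non-empty ADC file exists in the given
# gcloud config directory.
adc_file_present() {
  [ -s "$1/application_default_credentials.json" ]
}

# Default gcloud config directory on Linux and macOS; CLOUDSDK_CONFIG overrides it.
if adc_file_present "${CLOUDSDK_CONFIG:-$HOME/.config/gcloud}"; then
  echo "ADC file found"
else
  echo "No ADC file; run: gcloud auth application-default login"
fi
```

The check confirms only that the file is present, not that the credentials are valid or have access to the project named in VERTEX_PROJECT.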