Self-Hosting

Quick Start

The fastest way to get Engrammic running locally:

curl -fsSL https://get.engrammic.ai/install.sh | sh
engrammic selfhost

The wizard will:

Check Docker is running
Validate your license key
Configure ports and storage
Set up LLM integration (optional)
Start the services
Configure your code editor

Deployment Tiers

Choose a tier based on your hardware and whether you want fully local inference:

Tier	RAM	Models	Use Case
Lite	8GB	phi4-mini + bge-base-en-v1.5 (TEI)	Laptops, dev machines
Standard	24-32GB	gemma4:12b + nomic-embed-text-v2-moe + bge-reranker-v2-m3	Workstations
Pro	48-64GB	gemma4:26b + nomic-embed-text-v2-moe + bge-reranker-v2-m3	Servers, production
Cloud	--	Your API keys (OpenAI, Anthropic, etc.)	When you prefer cloud APIs

Standalone tiers (Lite/Standard/Pro) bundle Ollama and TEI so everything runs locally with no external API calls after initial model download. Cloud tier uses your existing API keys. The installer auto-detects your RAM and recommends a tier.
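The installer's actual detection logic is not documented here. As a rough sketch of how a RAM-based recommendation could work, the hypothetical `recommend_tier` function below maps total RAM in GB to a tier, using the lower bounds from the table above as assumed thresholds and falling back to Cloud below 8GB.

```shell
#!/bin/sh
# Hypothetical helper (not part of the engrammic CLI): map total RAM in GB
# to a tier. Thresholds are the lower bounds from the tier table; the real
# installer may use different cutoffs.
recommend_tier() {
  ram_gb=$1
  if [ "$ram_gb" -ge 48 ]; then
    echo "Pro"
  elif [ "$ram_gb" -ge 24 ]; then
    echo "Standard"
  elif [ "$ram_gb" -ge 8 ]; then
    echo "Lite"
  else
    echo "Cloud"
  fi
}

# Example on Linux: MemTotal is reported in kB and is usually slightly below
# the nominal size, so round to the nearest GB before comparing.
# ram_gb=$(awk '/^MemTotal:/ {print int($2 / 1048576 + 0.5)}' /proc/meminfo)
# recommend_tier "$ram_gb"
```

Treat the result as a starting point only; other workloads on the same machine reduce the memory available to the stack.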

Requirements

Docker 20.10+ with Compose v2
RAM per tier (see table above)
20GB disk space (standalone tiers need more for model storage)
License key from engrammic.ai/self-hosted

Standalone vs Cloud tier

Standalone (Lite/Standard/Pro): Models bundled via Ollama + TEI. No API keys needed. First startup downloads models (may take 10-30 min).
Cloud: You provide an embedding API key and, optionally, an LLM API key. Faster startup, but requires internet access.

See Running Models Locally for more on local model configuration.

CLI Reference

Installation Commands

Command	Description
`engrammic selfhost`	Interactive setup wizard
`engrammic install`	Configure MCP for detected editors
`engrammic uninstall`	Remove Engrammic from editor configs

Management Commands

Command	Description
`engrammic status`	Show current installation state
`engrammic upgrade`	Pull latest images and restart
`engrammic doctor`	Run diagnostic health checks
`engrammic scale`	Show container resource usage
`engrammic logs`	View service logs
`engrammic license`	View or update license key

Log Viewing

# Interactive service selection
engrammic logs

# Specific service
engrammic logs --service app

# Follow logs
engrammic logs --service app --follow

# Show more lines
engrammic logs --service dagster --lines 500

Available services: app, reaction-worker, dagster, dagster-daemon, memgraph, qdrant, redis, postgres

Configuration

After running engrammic selfhost, your configuration is stored in ~/.engrammic/:

~/.engrammic/
  docker-compose.yml   # Service definitions
  .env                 # Configuration
  README.md            # Quick reference

Environment Variables

Edit ~/.engrammic/.env to configure:

# License (required)
ENGRAMMIC_LICENSE_KEY=ENGR_...

# Database
POSTGRES_PASSWORD=your-secure-password

# Embedding model (required, otherwise remember/learn/recall fail)
EMBEDDING_MODEL=openai/text-embedding-3-small
OPENAI_API_KEY=sk-...

# Telemetry (opt out with false)
TELEMETRY__ENABLED=true

The embedding model is the required piece, and you set it here with EMBEDDING_MODEL plus its provider key. The generation LLM is optional and is configured differently (see below). See Configuration for the full variable reference and Running Models Locally to run either without cloud keys.

Choosing the generation LLM

The SAGE generation LLM provider and model live in config/models.yaml, not in .env. The default tier uses Google Vertex (Gemini). To use a different provider:

Override models.yaml through the host config directory (see Config Files).
Set the active tier's reasoning, fast, and query_expander entries to your provider and model.
Supply that provider's credentials in .env:

Provider in models.yaml	Credentials in .env
`openai`	`OPENAI_API_KEY=sk-...`
`anthropic`	`ANTHROPIC_API_KEY=sk-ant-...`
`gemini`	`GEMINI_API_KEY=...`
`vertex` (default)	`VERTEX_PROJECT=...`, `VERTEX_LOCATION=...` (uses the mounted gcloud credentials)
`ollama`	`OLLAMA_BASE_URL=http://host.docker.internal:11434`

Without usable LLM credentials, Engrammic runs in passive mode: storage and recall work, but SAGE synthesis (automatic deduplication, insight generation) is disabled.
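Before restarting, it can help to confirm that .env actually contains the credentials for the provider chosen in models.yaml. The sketch below encodes the provider table above in two hypothetical helper functions (`required_vars` and `check_llm_env` are illustrative names, not engrammic commands). It only checks that each variable has a non-empty, uncommented value; it does not validate the credentials themselves.

```shell
#!/bin/sh
# Hypothetical helper: list the .env variables the provider table pairs
# with each provider name.
required_vars() {
  case "$1" in
    openai)    echo "OPENAI_API_KEY" ;;
    anthropic) echo "ANTHROPIC_API_KEY" ;;
    gemini)    echo "GEMINI_API_KEY" ;;
    vertex)    echo "VERTEX_PROJECT VERTEX_LOCATION" ;;
    ollama)    echo "OLLAMA_BASE_URL" ;;
    *)         echo "unknown provider: $1" >&2; return 2 ;;
  esac
}

# Return 0 if every required variable is set to a non-empty value on an
# uncommented line of the env file; otherwise print what is missing.
check_llm_env() {
  provider=$1
  env_file=$2
  vars=$(required_vars "$provider") || return 2
  missing=0
  for var in $vars; do
    if ! grep -Eq "^${var}=.+" "$env_file"; then
      echo "missing: $var"
      missing=1
    fi
  done
  return "$missing"
}

# Usage: check_llm_env vertex ~/.engrammic/.env
```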

Custom Ports

During engrammic selfhost, you can configure custom ports:

MCP server: Default 8000 (your editor connects here)
Dagster UI: Default 3000 (SAGE pipeline dashboard)

After setup, edit docker-compose.yml to change internal service ports, then run docker compose up -d from ~/.engrammic to apply the change.

Upgrading

Pull the latest images and restart:

engrammic upgrade

This will:

Pull new images
Restart services
Clean up old images

When you run an outdated version, deprecation warnings appear in the service logs.

Backup and Restore

Manual Backup

cd ~/.engrammic

# Stop services
docker compose down

# Backup volumes
docker run --rm -v engrammic_postgres-data:/data -v "$(pwd)":/backup alpine \
  tar czf /backup/postgres-backup.tar.gz -C /data .

docker run --rm -v engrammic_qdrant-data:/data -v "$(pwd)":/backup alpine \
  tar czf /backup/qdrant-backup.tar.gz -C /data .

docker run --rm -v engrammic_memgraph-data:/data -v "$(pwd)":/backup alpine \
  tar czf /backup/memgraph-backup.tar.gz -C /data .

# Restart
docker compose up -d

Restore

cd ~/.engrammic
docker compose down

# Restore volumes (example for postgres; repeat for qdrant and memgraph)
docker run --rm -v engrammic_postgres-data:/data -v "$(pwd)":/backup alpine \
  sh -c "rm -rf /data/* && tar xzf /backup/postgres-backup.tar.gz -C /data"

docker compose up -d
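To restore all three data stores in one pass, the per-volume command can be wrapped in a loop. This is a minimal sketch, assuming the archive and volume names used in the Manual Backup commands; `archive_for` and `restore_all` are hypothetical helper names. Run it from ~/.engrammic while the services are stopped.

```shell
#!/bin/sh
# Archive name produced by the manual backup commands for a given service.
archive_for() {
  echo "$1-backup.tar.gz"
}

# Restore every volume whose archive exists in the current directory.
# Services without an archive are skipped with a message on stderr.
restore_all() {
  for svc in postgres qdrant memgraph; do
    archive=$(archive_for "$svc")
    if [ ! -f "$archive" ]; then
      echo "skipping $svc: $archive not found" >&2
      continue
    fi
    # Same command as the single-volume example, parameterized by service.
    docker run --rm -v "engrammic_${svc}-data:/data" -v "$(pwd)":/backup alpine \
      sh -c "rm -rf /data/* && tar xzf /backup/$archive -C /data"
  done
}

# Usage: docker compose down && restore_all && docker compose up -d
```

Skipping missing archives rather than failing lets you restore a single store by keeping only its archive in the directory.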

Troubleshooting

Health Check

curl http://localhost:8000/health

Expected response:

{
  "status": "healthy",
  "services": {
    "memgraph": "connected",
    "redis": "connected",
    "qdrant": "connected",
    "postgres": "connected"
  }
}
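On standalone tiers the first startup can take a while, so a script may need to wait for this endpoint rather than check it once. The sketch below polls until the response reports a healthy status, assuming the response shape shown above. The function names, retry count, and 5-second delay are illustrative choices, and grep is used instead of a JSON parser to avoid extra dependencies.

```shell
#!/bin/sh
# Return 0 if the JSON on stdin contains "status": "healthy".
is_healthy() {
  grep -Eq '"status"[[:space:]]*:[[:space:]]*"healthy"'
}

# Poll the health endpoint until it reports healthy or attempts run out.
# Arguments: URL (default http://localhost:8000/health), attempts (default 60).
wait_for_healthy() {
  url=${1:-http://localhost:8000/health}
  attempts=${2:-60}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if curl -fsS --max-time 5 "$url" 2>/dev/null | is_healthy; then
      echo "healthy"
      return 0
    fi
    i=$((i + 1))
    sleep 5
  done
  echo "timed out waiting for $url" >&2
  return 1
}

# Usage: wait_for_healthy http://localhost:8000/health 120
```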

Diagnostics

engrammic doctor

Checks:

Docker daemon status
Container health
Service connectivity
License validity

Common Issues

Services not starting: check the app service logs (run from ~/.engrammic):

docker compose logs app

Port already in use: Run engrammic selfhost again and choose a different port.

License expired: view the current key or enter a new one:

engrammic license

Out of memory: check container resource usage:

engrammic scale

Manual Setup

If you prefer manual configuration over the CLI wizard:

Standalone Tiers (Lite/Standard/Pro)

For fully local deployment with bundled models:

Create directory

mkdir -p ~/.engrammic && cd ~/.engrammic

Download tier-specific compose file

# Choose one:
# Lite (8GB RAM)
curl -fsSL https://raw.githubusercontent.com/engrammic-ai/context-service/main/docker/docker-compose.standalone-lite.yml -o docker-compose.yml

# Standard (24-32GB RAM)
curl -fsSL https://raw.githubusercontent.com/engrammic-ai/context-service/main/docker/docker-compose.standalone-standard.yml -o docker-compose.yml

# Pro (48-64GB RAM)
curl -fsSL https://raw.githubusercontent.com/engrammic-ai/context-service/main/docker/docker-compose.standalone-pro.yml -o docker-compose.yml

Create .env

cat > .env << 'EOF'
# License (required)
ENGRAMMIC_LICENSE_KEY=ENGR_your_key_here

# Database
POSTGRES_PASSWORD=your-secure-password

# Telemetry (opt out with false)
TELEMETRY__ENABLED=true
EOF

No embedding or LLM API keys are needed because the models are bundled.
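The your-secure-password value in the .env file is a placeholder. One way to replace it with a random value is sketched below; `gen_password` and `set_env_password` are hypothetical helpers, not engrammic commands. It is safest to do this before the first start, since many database images only apply the configured password when the data volume is first initialized (whether that holds for this stack is an assumption).

```shell
#!/bin/sh
# Generate 24 random bytes from the kernel and hex-encode them (48 characters).
gen_password() {
  head -c 24 /dev/urandom | od -An -tx1 | tr -d ' \n'
}

# Rewrite the POSTGRES_PASSWORD line of an env file with a fresh random value.
# A temp file is used instead of sed -i, whose syntax differs between GNU and BSD.
set_env_password() {
  env_file=$1
  pw=$(gen_password)
  tmp=$(mktemp)
  # The password is hex only, so it is safe inside the sed replacement.
  sed "s|^POSTGRES_PASSWORD=.*|POSTGRES_PASSWORD=$pw|" "$env_file" > "$tmp" &&
    mv "$tmp" "$env_file"
}

# Usage: set_env_password ~/.engrammic/.env
```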

Start services

docker compose up -d

First startup downloads models via Ollama and TEI. This takes 10-30 minutes depending on your connection. Monitor with:

docker compose logs -f ollama

Configure your editor

Add to your MCP config (e.g., ~/.claude/settings.json):

{
  "mcpServers": {
    "engrammic": {
      "type": "http",
      "url": "http://localhost:8000/mcp"
    }
  }
}

Cloud Tier

For deployment using your own API keys:

Create directory

mkdir -p ~/.engrammic && cd ~/.engrammic

Download compose file

curl -fsSL https://raw.githubusercontent.com/engrammic-ai/context-service/main/docker/docker-compose.selfhosted.yml -o docker-compose.yml

Create .env

cat > .env << 'EOF'
ENGRAMMIC_LICENSE_KEY=ENGR_your_key_here
POSTGRES_PASSWORD=your-secure-password

# Embedding model (required)
EMBEDDING_MODEL=openai/text-embedding-3-small
OPENAI_API_KEY=sk-...

TELEMETRY__ENABLED=true

# Optional: LLM for SAGE synthesis. The provider and model are set in models.yaml;
# supply that provider's key here. The OPENAI_API_KEY above also covers an OpenAI LLM.
# ANTHROPIC_API_KEY=sk-ant-...
EOF

Start services

docker compose up -d

Configure your editor

Add to your MCP config (e.g., ~/.claude/settings.json):

{
  "mcpServers": {
    "engrammic": {
      "type": "http",
      "url": "http://localhost:8000/mcp"
    }
  }
}

Architecture

The self-hosted stack includes:

Service	Purpose	Default Port
`app`	MCP server + REST API	8000
`dagster`	SAGE pipeline UI	3000
`memgraph`	Graph database	7687
`qdrant`	Vector database	6333
`redis`	Cache + queues	6379
`postgres`	Metadata store	5432

Standalone tiers add:

Service	Purpose	Tiers
`ollama`	LLM inference	Lite, Standard, Pro
`tei`	Embedding inference	Lite, Standard, Pro
`tei-reranker`	Reranking	Standard, Pro

Memory requirements by tier:

Cloud: ~5GB
Lite: ~8GB
Standard: ~24GB
Pro: ~48GB

Helpful Links

engrammic.ai/self-hosted - license keys and release notes
Configuration reference - full environment variable reference
Running Models Locally - Ollama and TEI setup, no cloud keys required
LiteLLM - embedding and LLM routing used internally
Ollama - run LLMs and embedding models locally
Hugging Face Text Embeddings Inference - high-throughput embedding sidecar
Qdrant documentation - vector database
Memgraph documentation - graph database
Dagster documentation - SAGE pipeline orchestration
