# Avalon Memory Crystal Server (amcs)

A Go MCP server for capturing and retrieving thoughts, memory, and project context. It exposes tools over Streamable HTTP and is backed by Postgres with pgvector for semantic search.

## What it does

- **Capture** thoughts with automatic embedding and metadata extraction
- **Search** thoughts semantically via vector similarity
- **Organise** thoughts into projects and retrieve full project context
- **Summarise** and recall memory across topics and time windows
- **Link** related thoughts and traverse relationships

## Stack

- Go — MCP server over Streamable HTTP
- Postgres + pgvector — storage and vector search
- LiteLLM — primary hosted AI provider (embeddings + metadata extraction)
- OpenRouter — default upstream behind LiteLLM
- Ollama — supported local or self-hosted OpenAI-compatible provider

## Tools

| Tool | Purpose |
|---|---|
| `capture_thought` | Store a thought with embedding and metadata |
| `search_thoughts` | Semantic similarity search |
| `list_thoughts` | Filter thoughts by type, topic, person, date |
| `thought_stats` | Counts and top topics/people |
| `get_thought` | Retrieve a thought by ID |
| `update_thought` | Patch content or metadata |
| `delete_thought` | Hard delete |
| `archive_thought` | Soft delete |
| `create_project` | Register a named project |
| `list_projects` | List projects with thought counts |
| `get_project_context` | Recent + semantic context for a project; uses explicit `project` or the active session project |
| `set_active_project` | Set session project scope; requires a stateful MCP session |
| `get_active_project` | Get current session project |
| `summarize_thoughts` | LLM prose summary over a filtered set |
| `recall_context` | Semantic + recency context block for injection |
| `link_thoughts` | Create a typed relationship between thoughts |
| `related_thoughts` | Explicit links + semantic neighbours |
| `upload_file` | Stage a file from a server-side path or base64 and get an `amcs://files/{id}` resource URI |
| `save_file` | Store a file (base64 or resource URI) and optionally link it to a thought |
| `load_file` | Retrieve a stored file by ID; returns metadata, base64 content, and an embedded MCP binary resource |
| `list_files` | Browse stored files by thought, project, or kind |
| `backfill_embeddings` | Generate missing embeddings for stored thoughts |
| `reparse_thought_metadata` | Re-extract and normalize metadata for stored thoughts |
| `retry_failed_metadata` | Retry metadata extraction for thoughts still pending or failed |
| `add_skill` | Store a reusable agent skill (behavioural instruction or capability prompt) |
| `remove_skill` | Delete an agent skill by id |
| `list_skills` | List all agent skills, optionally filtered by tag |
| `add_guardrail` | Store a reusable agent guardrail (constraint or safety rule) |
| `remove_guardrail` | Delete an agent guardrail by id |
| `list_guardrails` | List all agent guardrails, optionally filtered by tag or severity |
| `add_project_skill` | Link an agent skill to a project; pass `project` explicitly if your client does not preserve MCP sessions |
| `remove_project_skill` | Unlink an agent skill from a project; pass `project` explicitly if your client does not preserve MCP sessions |
| `list_project_skills` | List all skills linked to a project; pass `project` explicitly if your client does not preserve MCP sessions |
| `add_project_guardrail` | Link an agent guardrail to a project; pass `project` explicitly if your client does not preserve MCP sessions |
| `remove_project_guardrail` | Unlink an agent guardrail from a project; pass `project` explicitly if your client does not preserve MCP sessions |
| `list_project_guardrails` | List all guardrails linked to a project; pass `project` explicitly if your client does not preserve MCP sessions |
| `get_version_info` | Return the server build version information, including version, tag name, commit, and build date |
| `describe_tools` | List all available MCP tools with names, descriptions, categories, and model-authored usage notes; call this at the start of a session to orient yourself |
| `annotate_tool` | Persist your own usage notes for a specific tool; notes are returned by `describe_tools` in future sessions |

## Self-Documenting Tools

AMCS includes a built-in tool directory that models can read and annotate.

**`describe_tools`** returns every registered tool with its name, description, category, and any model-written notes. Call it with no arguments to get the full list, or filter by category:

```json
{ "category": "thoughts" }
```

Available categories: `system`, `thoughts`, `projects`, `files`, `admin`, `household`, `maintenance`, `calendar`, `meals`, `crm`, `skills`, `chat`, `meta`.

**`annotate_tool`** lets a model write persistent usage notes against a tool name. Notes survive across sessions and are returned by `describe_tools`:

```json
{ "tool_name": "capture_thought", "notes": "Always pass project explicitly — session state is not reliable in this client." }
```

Pass an empty string to clear notes. The intended workflow is:

1. At the start of a session, call `describe_tools` to discover tools and read accumulated notes.
2. As you learn something non-obvious about a tool — a gotcha, a workflow pattern, a required field ordering — call `annotate_tool` to record it.
3. Future sessions receive the annotation automatically via `describe_tools`.

## MCP Error Contract

AMCS returns structured JSON-RPC errors for common MCP failures. Clients should branch on both `error.code` and `error.data.type` instead of parsing the human-readable message.

### Stable error codes

| Code | `data.type` | Meaning |
|---|---|---|
| `-32602` | `invalid_arguments` | MCP argument/schema validation failed before the tool handler ran |
| `-32602` | `invalid_input` | Tool-level input validation failed inside the handler |
| `-32050` | `session_required` | Tool requires a stateful MCP session |
| `-32051` | `project_required` | No explicit `project` was provided and no active session project was available |
| `-32052` | `project_not_found` | The referenced project does not exist |
| `-32053` | `invalid_id` | A UUID-like identifier was malformed |
| `-32054` | `entity_not_found` | A referenced entity such as a thought or contact does not exist |

### Error data shape

AMCS may include these fields in `error.data`:

- `type` — stable machine-readable error type
- `field` — single argument name such as `name`, `project`, or `thought_id`
- `fields` — multiple argument names for one-of or mutually-exclusive validation
- `value` — offending value when safe to expose
- `detail` — validation detail such as `required`, `invalid`, `one_of_required`, `mutually_exclusive`, or a schema validation message
- `hint` — remediation guidance
- `entity` — entity name for generic not-found errors

Example schema-level error:

```json
{
  "code": -32602,
  "message": "invalid tool arguments",
  "data": {
    "type": "invalid_arguments",
    "field": "name",
    "detail": "validating root: required: missing properties: [\"name\"]",
    "hint": "check the name argument"
  }
}
```

Example tool-level error:

```json
{
  "code": -32051,
  "message": "project is required; pass project explicitly or call set_active_project in this MCP session first",
  "data": {
    "type": "project_required",
    "field": "project",
    "hint": "pass project explicitly or call set_active_project in this MCP session first"
  }
}
```

### Client example

Go client example handling AMCS MCP errors:

```go
result, err := session.CallTool(ctx, &mcp.CallToolParams{
	Name:      "get_project_context",
	Arguments: map[string]any{},
})
if err != nil {
	var rpcErr *jsonrpc.Error
	if errors.As(err, &rpcErr) {
		var data struct {
			Type  string `json:"type"`
			Field string `json:"field"`
			Hint  string `json:"hint"`
		}
		_ = json.Unmarshal(rpcErr.Data, &data)

		switch {
		case rpcErr.Code == -32051 && data.Type == "project_required":
			// Retry with an explicit project, or call set_active_project first.
		case rpcErr.Code == -32602 && data.Type == "invalid_arguments":
			// Ask the caller to fix the malformed arguments.
		}
	}
}
_ = result
```

## Build Versioning

AMCS embeds build metadata into the binary at build time.

- `version` is generated from the current git tag when building from a tagged commit
- `tag_name` is the repo tag name, for example `v1.0.1`
- `build_date` is the UTC build timestamp in RFC3339 format
- `commit` is the short git commit SHA

For untagged builds, `version` and `tag_name` fall back to `dev`.

Use `get_version_info` to retrieve the runtime build metadata:

```json
{
  "server_name": "amcs",
  "version": "v1.0.1",
  "tag_name": "v1.0.1",
  "commit": "abc1234",
  "build_date": "2026-03-31T14:22:10Z"
}
```

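This README does not say how the metadata gets into the binary. One common Go technique is to override package-level string variables with the linker's `-X` flag, and the sketch below assumes that approach. The variable names, the `unknown` defaults for `commit` and `build_date`, and the `currentVersion` helper are all assumptions for illustration; only the `dev` fallback and the JSON field names come from the text above.

```go
package main

// Defaults used when nothing is injected at link time. A release build could
// override them with, for example:
//
//	go build -ldflags "-X main.version=v1.0.1 -X main.tagName=v1.0.1 -X main.commit=abc1234"
//
// The import path "main" and these variable names are hypothetical.
var (
	version   = "dev"
	tagName   = "dev"
	commit    = "unknown"
	buildDate = "unknown"
)

// VersionInfo mirrors the documented get_version_info payload.
type VersionInfo struct {
	ServerName string `json:"server_name"`
	Version    string `json:"version"`
	TagName    string `json:"tag_name"`
	Commit     string `json:"commit"`
	BuildDate  string `json:"build_date"`
}

// currentVersion returns whatever values the linker left in the variables.
func currentVersion() VersionInfo {
	return VersionInfo{
		ServerName: "amcs",
		Version:    version,
		TagName:    tagName,
		Commit:     commit,
		BuildDate:  buildDate,
	}
}
```

Keeping the defaults in the variables means an untagged `go build` still produces a usable payload without any extra branching.
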
## Agent Skills and Guardrails

Skills and guardrails are reusable agent behaviour instructions and constraints that can be attached to projects.

**At the start of every project session, always call `list_project_skills` and `list_project_guardrails` first.** Use the returned skills and guardrails to guide agent behaviour for that project. Only generate or create new skills/guardrails if none are returned. If your MCP client does not preserve sessions across calls, pass `project` explicitly instead of relying on `set_active_project`.

### Skills

A skill is a reusable behavioural instruction or capability prompt — for example, "always respond in structured markdown" or "break complex tasks into numbered steps before starting".

```json
{ "name": "structured-output", "description": "Enforce markdown output format", "content": "Always structure responses using markdown headers and bullet points.", "tags": ["formatting"] }
```

### Guardrails

A guardrail is a constraint or safety rule — for example, "never delete files without explicit confirmation" or "do not expose secrets in output".

```json
{ "name": "no-silent-deletes", "description": "Require confirmation before deletes", "content": "Never delete, drop, or truncate data without first confirming with the user.", "severity": "high", "tags": ["safety"] }
```

Severity levels: `low`, `medium`, `high`, `critical`.

### Project linking

Link existing skills and guardrails to a project with `add_project_skill` and `add_project_guardrail` so they are automatically available when that project is active:

```json
{ "project": "my-project", "skill_id": "<uuid>" }
{ "project": "my-project", "guardrail_id": "<uuid>" }
```

## Configuration

Config is YAML-driven. Copy `configs/config.example.yaml` and set:

- `database.url` — Postgres connection string
- `auth.mode` — `api_keys` or `oauth_client_credentials`
- `auth.keys` — API keys for MCP access via `x-brain-key` or `Authorization: Bearer <key>` when `auth.mode=api_keys`
- `auth.oauth.clients` — client registry when `auth.mode=oauth_client_credentials`
- `ai.litellm.base_url` and `ai.litellm.api_key` — LiteLLM proxy
- `ai.ollama.base_url` and `ai.ollama.api_key` — Ollama local or remote server

`mcp.version` is build-generated and should not be set in config.

**OAuth Client Credentials flow** (`auth.mode=oauth_client_credentials`):

1. Obtain a token — `POST /oauth/token` (public endpoint; no API key or bearer token required):

   ```
   POST /oauth/token
   Content-Type: application/x-www-form-urlencoded
   Authorization: Basic base64(client_id:client_secret)

   grant_type=client_credentials
   ```

   Returns: `{"access_token": "...", "token_type": "bearer", "expires_in": 3600}`

2. Use the token on the MCP endpoint:

   ```
   Authorization: Bearer <access_token>
   ```

Alternatively, pass `client_id` and `client_secret` as body parameters instead of `Authorization: Basic`. Direct `Authorization: Basic` credential validation on the MCP endpoint is also supported as a fallback (no token required).

See `llm/plan.md` for an audited high-level status summary of the original implementation plan, and `llm/todo.md` for the audited backfill/fallback follow-up status.

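The token request in the OAuth Client Credentials flow can be built with Go's standard library alone. This is a minimal sketch under stated assumptions: `newTokenRequest` is a hypothetical helper name, and the base URL, client ID, and secret are placeholders supplied by the caller. It only constructs the request, so nothing is sent until the caller does so.

```go
package main

import (
	"net/http"
	"net/url"
	"strings"
)

// newTokenRequest builds the documented POST /oauth/token request:
// a form-encoded grant_type plus HTTP Basic client credentials.
func newTokenRequest(baseURL, clientID, clientSecret string) (*http.Request, error) {
	form := url.Values{}
	form.Set("grant_type", "client_credentials")

	req, err := http.NewRequest(http.MethodPost, baseURL+"/oauth/token", strings.NewReader(form.Encode()))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
	// SetBasicAuth writes "Authorization: Basic base64(client_id:client_secret)".
	req.SetBasicAuth(clientID, clientSecret)
	return req, nil
}
```

Send the request with `http.DefaultClient.Do(req)`, read `access_token` from the JSON response, and set `Authorization: Bearer <access_token>` on later requests to the MCP endpoint.
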
## Backfill

Run `backfill_embeddings` after switching embedding models or importing thoughts without vectors.

```json
{
  "project": "optional-project-name",
  "limit": 100,
  "include_archived": false,
  "older_than_days": 0,
  "dry_run": false
}
```

- `dry_run: true` — report counts without calling the embedding provider
- `limit` — max thoughts per call (default 100)
- Embeddings are generated in parallel (4 workers) and upserted; one failure does not abort the run

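The "parallel workers, failures do not abort the run" behaviour described for backfill is a standard bounded worker pool. The sketch below shows one way to write it; it is not the AMCS implementation. `embedFunc`, `runBackfill`, and `errProviderUnavailable` are hypothetical names, and `embedFunc` stands in for "embed one thought and upsert its vector".

```go
package main

import (
	"errors"
	"sync"
)

// errProviderUnavailable is a hypothetical sentinel used to simulate a failure.
var errProviderUnavailable = errors.New("embedding provider unavailable")

// embedFunc is a hypothetical stand-in for embedding and upserting one thought.
type embedFunc func(thoughtID string) error

// runBackfill fans thought IDs out to a fixed number of workers. A failed
// thought is recorded and the remaining thoughts are still processed.
func runBackfill(ids []string, workers int, embed embedFunc) (int, map[string]error) {
	if workers < 1 {
		workers = 1 // avoid blocking forever when no worker drains the channel
	}
	jobs := make(chan string)
	var (
		mu        sync.Mutex
		wg        sync.WaitGroup
		succeeded int
		failures  = make(map[string]error)
	)
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for id := range jobs {
				err := embed(id)
				mu.Lock()
				if err != nil {
					failures[id] = err
				} else {
					succeeded++
				}
				mu.Unlock()
			}
		}()
	}
	for _, id := range ids {
		jobs <- id
	}
	close(jobs)
	wg.Wait()
	return succeeded, failures
}
```

Calling `runBackfill(ids, 4, embed)` matches the worker count stated above; the returned map lets the caller report which thoughts still lack embeddings.
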
## Metadata Reparse

Run `reparse_thought_metadata` to fix stale or inconsistent metadata by re-extracting it from thought content.

```json
{
  "project": "optional-project-name",
  "limit": 100,
  "include_archived": false,
  "older_than_days": 0,
  "dry_run": false
}
```

- `dry_run: true` scans only and does not call metadata extraction or write updates
- If extraction fails for a thought, existing metadata is normalized and written only if it changes
- Metadata reparse runs in parallel (4 workers); one failure does not abort the run

## Failed Metadata Retry

`capture_thought` now stores the thought even when metadata extraction times out or fails. Those thoughts are marked with `metadata_status: "pending"` and retried in the background. Use `retry_failed_metadata` to sweep any thoughts still marked `pending` or `failed`.

```json
{
  "project": "optional-project-name",
  "limit": 100,
  "include_archived": false,
  "older_than_days": 1,
  "dry_run": false
}
```

- `dry_run: true` scans only and does not call metadata extraction or write updates
- Successful retries mark the thought metadata as `complete` and clear the last error
- Failed retries update the retry markers so the daily sweep can pick them up again later

## File Storage

Files can optionally be linked to a thought by passing `thought_id`, which also adds an attachment reference to that thought's metadata. AI clients should prefer `save_file` when the goal is to retain the artifact itself, rather than reading or summarizing the file first. Stored files and attachment metadata are not forwarded to the metadata extraction client.

### MCP tools

**Stage a file and get a URI** (`upload_file`) — preferred for large or binary files:

```json
{
  "name": "diagram.png",
  "content_path": "/absolute/path/to/diagram.png"
}
```

Or with base64 for small files (≤10 MB):

```json
{
  "name": "diagram.png",
  "content_base64": "<base64-payload>"
}
```

Returns `{"file": {...}, "uri": "amcs://files/<id>"}`. Pass `thought_id`/`project` to link immediately, or omit them and use the URI in a later `save_file` call.

**Link a staged file to a thought** (`save_file` with `content_uri`):

```json
{
  "name": "meeting-notes.pdf",
  "thought_id": "optional-thought-uuid",
  "content_uri": "amcs://files/<id-from-upload_file>"
}
```

**Save small files inline** (`save_file` with `content_base64`, ≤10 MB):

```json
{
  "name": "meeting-notes.pdf",
  "media_type": "application/pdf",
  "kind": "document",
  "thought_id": "optional-thought-uuid",
  "content_base64": "<base64-payload>"
}
```

`content_base64` and `content_uri` are mutually exclusive in both tools.

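The mutual-exclusion rule lines up with the `mutually_exclusive` and `one_of_required` detail values listed in the MCP Error Contract. The sketch below shows how a `save_file` argument check might produce such an error. It is an illustration, not the AMCS handler: `fileArgs`, `validationError`, and `validateContentSource` are hypothetical names, and treating "neither field set" as `one_of_required` is an assumption about `save_file` only.

```go
package main

// fileArgs holds the two content sources that must not be combined.
type fileArgs struct {
	ContentBase64 string
	ContentURI    string
}

// validationError is a hypothetical carrier for the documented error.data fields.
type validationError struct {
	Type   string
	Fields []string
	Detail string
}

func (e *validationError) Error() string { return e.Type + ": " + e.Detail }

// validateContentSource accepts exactly one of content_base64 or content_uri.
func validateContentSource(a fileArgs) error {
	fields := []string{"content_base64", "content_uri"}
	switch {
	case a.ContentBase64 != "" && a.ContentURI != "":
		return &validationError{Type: "invalid_input", Fields: fields, Detail: "mutually_exclusive"}
	case a.ContentBase64 == "" && a.ContentURI == "":
		return &validationError{Type: "invalid_input", Fields: fields, Detail: "one_of_required"}
	}
	return nil
}
```

Reporting both argument names in `fields` lets a client show which inputs conflict without parsing the message text.
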
**Load a file** — returns metadata, base64 content, and an embedded MCP binary resource (`amcs://files/{id}`). The `id` field accepts either the bare stored file UUID or the full `amcs://files/{id}` URI:

```json
{ "id": "stored-file-uuid" }
```

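Accepting both the bare ID and the full URI amounts to stripping an optional prefix before lookup. A minimal sketch follows, assuming nothing about the real handler: `normalizeFileID` and `fileURIPrefix` are hypothetical names, and the sketch does not validate that the remainder is a well-formed UUID.

```go
package main

import (
	"errors"
	"strings"
)

// fileURIPrefix is the documented resource URI scheme and path for stored files.
const fileURIPrefix = "amcs://files/"

// normalizeFileID returns the bare file ID for either accepted input form.
func normalizeFileID(id string) (string, error) {
	id = strings.TrimSpace(id)
	id = strings.TrimPrefix(id, fileURIPrefix)
	if id == "" || strings.Contains(id, "/") {
		return "", errors.New("invalid file id")
	}
	return id, nil
}
```

Both `normalizeFileID("stored-file-uuid")` and `normalizeFileID("amcs://files/stored-file-uuid")` yield the same bare ID, so downstream code only handles one form.
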
**List files** for a thought or project:

```json
{
  "thought_id": "optional-thought-uuid",
  "project": "optional-project-name",
  "kind": "optional-image-document-audio-file",
  "limit": 20
}
```

### MCP resources

Stored files are also exposed as MCP resources at `amcs://files/{id}`. MCP clients can read raw binary content directly via `resources/read` without going through `load_file`.

### HTTP upload and download

Direct HTTP access avoids base64 encoding entirely. The Go server caps `/files` uploads at 100 MB per request. Large uploads are also subject to available memory, Postgres limits, and any reverse proxy or load balancer in front of AMCS.

Multipart upload:

```bash
curl -X POST http://localhost:8080/files \
  -H "x-brain-key: <key>" \
  -F "file=@./diagram.png" \
  -F "project=amcs" \
  -F "kind=image"
```

Raw body upload:

```bash
curl -X POST "http://localhost:8080/files?project=amcs&name=meeting-notes.pdf" \
  -H "x-brain-key: <key>" \
  -H "Content-Type: application/pdf" \
  --data-binary @./meeting-notes.pdf
```

Binary download:

```bash
curl http://localhost:8080/files/<id> \
  -H "x-brain-key: <key>" \
  -o meeting-notes.pdf
```

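The raw body upload shown with curl translates directly to Go's `net/http`. This is a minimal sketch under stated assumptions: `newRawUploadRequest` is a hypothetical helper, and the base URL and key are placeholders. It only builds the request, using the same query parameters and headers as the curl example.

```go
package main

import (
	"io"
	"net/http"
	"net/url"
)

// newRawUploadRequest builds a POST /files request whose body is the raw file
// content, with project and name passed as query parameters.
func newRawUploadRequest(baseURL, key, project, name, mediaType string, body io.Reader) (*http.Request, error) {
	q := url.Values{}
	q.Set("project", project)
	q.Set("name", name)

	req, err := http.NewRequest(http.MethodPost, baseURL+"/files?"+q.Encode(), body)
	if err != nil {
		return nil, err
	}
	req.Header.Set("x-brain-key", key)
	req.Header.Set("Content-Type", mediaType)
	return req, nil
}
```

Passing a file opened with `os.Open` as `body` streams it without base64 encoding; send the request with `http.DefaultClient.Do(req)` and keep the file under the 100 MB cap.
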
**Automatic backfill** (optional, config-gated):

```yaml
backfill:
  enabled: true
  run_on_startup: true   # run once on server start
  interval: "15m"        # repeat every 15 minutes
  batch_size: 20
  max_per_run: 100
  include_archived: false
```

**Automatic metadata retry** (optional, config-gated):

```yaml
metadata_retry:
  enabled: true
  run_on_startup: true   # retry failed metadata once on server start
  interval: "24h"        # retry pending/failed metadata daily
  max_per_run: 100
  include_archived: false
```

**Search fallback**: when no embeddings exist for the active model in scope, `search_thoughts`, `recall_context`, `get_project_context`, `summarize_thoughts`, and `related_thoughts` automatically fall back to Postgres full-text search so results are never silently empty.

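The search fallback is a simple decision: check whether any embeddings exist for the active model in scope, and choose the search strategy accordingly. The sketch below illustrates that decision only. The `backend` struct and its function fields are hypothetical stand-ins for the real storage layer, and results are reduced to plain strings for brevity.

```go
package main

// backend is a hypothetical abstraction over the storage operations involved.
type backend struct {
	countEmbeddings func(model, project string) (int, error)
	vectorSearch    func(query, project string, limit int) ([]string, error)
	fullTextSearch  func(query, project string, limit int) ([]string, error)
}

// search returns the hits and reports whether the full-text fallback was used.
func search(b backend, model, query, project string, limit int) ([]string, bool, error) {
	n, err := b.countEmbeddings(model, project)
	if err != nil {
		return nil, false, err
	}
	if n == 0 {
		// No vectors for this model in scope: fall back to full-text search.
		hits, err := b.fullTextSearch(query, project, limit)
		return hits, true, err
	}
	hits, err := b.vectorSearch(query, project, limit)
	return hits, false, err
}
```

Returning the fallback flag alongside the hits lets a caller tell the user that results came from keyword matching and that running `backfill_embeddings` would restore semantic search.
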
## Client Setup

### Claude Code

```bash
# API key auth
claude mcp add --transport http amcs http://localhost:8080/mcp --header "x-brain-key: <key>"

# Bearer token auth
claude mcp add --transport http amcs http://localhost:8080/mcp --header "Authorization: Bearer <token>"
```

### OpenAI Codex

Add to `~/.codex/config.toml`:

```toml
[[mcp_servers]]
name = "amcs"
url = "http://localhost:8080/mcp"

[mcp_servers.headers]
x-brain-key = "<key>"
```

### OpenCode

```bash
# API key auth
opencode mcp add --name amcs --type remote --url http://localhost:8080/mcp --header "x-brain-key=<key>"

# Bearer token auth
opencode mcp add --name amcs --type remote --url http://localhost:8080/mcp --header "Authorization=Bearer <token>"
```

Or add directly to `opencode.json` / `~/.config/opencode/config.json`:

```json
{
  "mcp": {
    "amcs": {
      "type": "remote",
      "url": "http://localhost:8080/mcp",
      "headers": {
        "x-brain-key": "<key>"
      }
    }
  }
}
```

## Apache Proxy

If AMCS is deployed behind Apache HTTP Server, configure the proxy explicitly for larger uploads and longer-running requests.

Example virtual host settings for the current AMCS defaults:

```apache
<VirtualHost *:443>
    ServerName amcs.example.com

    ProxyPreserveHost On
    LimitRequestBody 104857600
    RequestReadTimeout handshake=0 header=20-40,MinRate=500 body=600,MinRate=500
    Timeout 600
    ProxyTimeout 600

    ProxyPass /mcp http://127.0.0.1:8080/mcp connectiontimeout=30 timeout=600
    ProxyPassReverse /mcp http://127.0.0.1:8080/mcp

    ProxyPass /files http://127.0.0.1:8080/files connectiontimeout=30 timeout=600
    ProxyPassReverse /files http://127.0.0.1:8080/files
</VirtualHost>
```

Recommended Apache settings:

- `LimitRequestBody 104857600` matches AMCS's 100 MB `/files` upload cap.
- `RequestReadTimeout ... body=600` gives clients up to 10 minutes to send larger request bodies.
- `ProxyTimeout 600` and `ProxyPass ... timeout=600` give Apache enough time to wait for the Go backend.
- If another proxy or load balancer sits in front of Apache, align its size and timeout settings too.

## Development

Run the SQL migrations against a local database with:

`DATABASE_URL=postgres://... make migrate`

### Backend + embedded UI build

The web UI now lives in the top-level `ui/` module and is embedded into the Go binary at build time with `go:embed`.

- `make build` — builds the Svelte/Tailwind frontend, then compiles the Go server
- `make test` — runs `svelte-check` for the frontend and `go test ./...` for the backend
- `make ui-build` — builds only the frontend into `internal/app/ui/dist`
- `make ui-dev` — starts the Vite dev server with hot reload on `http://localhost:5173`

### Local UI workflow

For the normal production-style local flow:

1. Start the backend: `./scripts/run-local.sh configs/dev.yaml`
2. Open `http://localhost:8080`

For frontend iteration with hot reload and no Go rebuilds:

1. Start the backend once: `go run ./cmd/amcs-server --config configs/dev.yaml`
2. In another shell start the UI dev server: `make ui-dev`
3. Open `http://localhost:5173`

The Vite dev server proxies backend routes such as `/api/status`, `/llm`, `/healthz`, `/readyz`, `/files`, `/mcp`, and the OAuth endpoints back to the Go server on `http://127.0.0.1:8080` by default. Override that target with `AMCS_UI_BACKEND` if needed.

The root page (`/`) is now the Svelte frontend. It preserves the existing landing-page content and status information by fetching data from `GET /api/status`.

LLM integration instructions are still served at `/llm`.

## Containers

The repo now includes a `Dockerfile` and Compose files for running the app with Postgres + pgvector.

1. Set a real LiteLLM key in your shell:
   `export AMCS_LITELLM_API_KEY=your-key`
2. Start the stack with your runtime:
   - Docker: `docker compose -f docker-compose.yml -f docker-compose.docker.yml up --build`
   - Podman: `podman compose -f docker-compose.yml up --build`
3. Call the service on `http://localhost:8080`

Notes:

- The app uses `configs/docker.yaml` inside the container.
- The local `./configs` directory is mounted into `/app/configs`, so config edits apply without rebuilding the image.
- `AMCS_LITELLM_BASE_URL` overrides the LiteLLM endpoint, so you can retarget it without editing YAML.
- `AMCS_OLLAMA_BASE_URL` overrides the Ollama endpoint for local or remote servers.
- The Compose stack uses a default bridge network named `amcs`.
- The base Compose file uses `host.containers.internal`, which is Podman-friendly.
- The Docker override file adds `host-gateway` aliases so Docker can resolve the same host endpoint.
- Database migrations `001` through `005` run automatically when the Postgres volume is created for the first time.
- `migrations/006_rls_and_grants.sql` is intentionally skipped during container bootstrap because it contains deployment-specific grants for a role named `amcs_user`.

## Ollama

Set `ai.provider: "ollama"` to use a local or self-hosted Ollama server through its OpenAI-compatible API.

Example:

```yaml
ai:
  provider: "ollama"
  embeddings:
    model: "nomic-embed-text"
    dimensions: 768
  metadata:
    model: "llama3.2"
    temperature: 0.1
  ollama:
    base_url: "http://localhost:11434/v1"
    api_key: "ollama"
    request_headers: {}
```

Notes:

- For remote Ollama servers, point `ai.ollama.base_url` at the remote `/v1` endpoint.
- The client always sends Bearer auth; Ollama ignores it locally, so `api_key: "ollama"` is a safe default.
- `ai.embeddings.dimensions` must match the embedding model you actually use, or startup will fail the database vector-dimension check.