Hein (Warky) 7f2b2b9fee feat(files): implement file storage functionality with save, load, and list operations

2026-03-30 22:24:18 +02:00

Avalon Memory Crystal Server (amcs)

A Go MCP server for capturing and retrieving thoughts, memory, and project context. Exposes tools over Streamable HTTP, backed by Postgres with pgvector for semantic search.

What it does

Capture thoughts with automatic embedding and metadata extraction
Search thoughts semantically via vector similarity
Organise thoughts into projects and retrieve full project context
Summarise and recall memory across topics and time windows
Link related thoughts and traverse relationships

Stack

Go — MCP server over Streamable HTTP
Postgres + pgvector — storage and vector search
LiteLLM — primary hosted AI provider (embeddings + metadata extraction)
OpenRouter — default upstream behind LiteLLM
Ollama — supported local or self-hosted OpenAI-compatible provider

Tools

Tool	Purpose
`capture_thought`	Store a thought with embedding and metadata
`search_thoughts`	Semantic similarity search
`list_thoughts`	Filter thoughts by type, topic, person, date
`thought_stats`	Counts and top topics/people
`get_thought`	Retrieve a thought by ID
`update_thought`	Patch content or metadata
`delete_thought`	Hard delete
`archive_thought`	Soft delete
`create_project`	Register a named project
`list_projects`	List projects with thought counts
`get_project_context`	Recent + semantic context for a project
`set_active_project`	Set session project scope
`get_active_project`	Get current session project
`summarize_thoughts`	LLM prose summary over a filtered set
`recall_context`	Semantic + recency context block for injection
`link_thoughts`	Create a typed relationship between thoughts
`related_thoughts`	Explicit links + semantic neighbours
`save_file`	Store a base64-encoded image, document, audio file, or other binary and optionally link it to a thought
`load_file`	Retrieve a stored file by ID as base64 plus metadata
`list_files`	Browse stored files by thought, project, or kind
`backfill_embeddings`	Generate missing embeddings for stored thoughts
`reparse_thought_metadata`	Re-extract and normalize metadata for stored thoughts

Configuration

Config is YAML-driven. Copy configs/config.example.yaml and set:

database.url — Postgres connection string
auth.mode — api_keys or oauth_client_credentials
auth.keys — API keys for MCP access via x-brain-key or Authorization: Bearer <key> when auth.mode=api_keys
auth.oauth.clients — client registry when auth.mode=oauth_client_credentials

OAuth Client Credentials flow (auth.mode=oauth_client_credentials):

Obtain a token with POST /oauth/token. The endpoint is public: it needs no API key or bearer token, only the client credentials shown in the request:

POST /oauth/token
Content-Type: application/x-www-form-urlencoded
Authorization: Basic base64(client_id:client_secret)

grant_type=client_credentials

Returns: {"access_token": "...", "token_type": "bearer", "expires_in": 3600}

Use the token on the MCP endpoint:
```
Authorization: Bearer <access_token>
```

Alternatively, pass client_id and client_secret as body parameters instead of Authorization: Basic. Direct Authorization: Basic credential validation on the MCP endpoint is also supported as a fallback (no token required).

ai.litellm.base_url and ai.litellm.api_key — LiteLLM proxy
ai.ollama.base_url and ai.ollama.api_key — Ollama local or remote server

See llm/plan.md for full architecture and implementation plan.

Backfill

Run backfill_embeddings after switching embedding models or importing thoughts without vectors.

{
  "project": "optional-project-name",
  "limit": 100,
  "include_archived": false,
  "older_than_days": 0,
  "dry_run": false
}

dry_run: true — report counts without calling the embedding provider
limit — max thoughts per call (default 100)
Embeddings are generated in parallel (4 workers) and upserted; one failure does not abort the run

Metadata Reparse

Run reparse_thought_metadata to fix stale or inconsistent metadata by re-extracting it from thought content.

{
  "project": "optional-project-name",
  "limit": 100,
  "include_archived": false,
  "older_than_days": 0,
  "dry_run": false
}

dry_run: true scans only and does not call metadata extraction or write updates
If extraction fails for a thought, existing metadata is normalized and written only if it changes
Metadata reparse runs in parallel (4 workers); one failure does not abort the run

File Storage

Use save_file to persist binary files as base64. Files can optionally be linked to a memory by passing thought_id, which also adds an attachment reference to that thought's metadata.

{
  "name": "meeting-notes.pdf",
  "media_type": "application/pdf",
  "kind": "document",
  "thought_id": "optional-thought-uuid",
  "content_base64": "<base64-payload>"
}

Load a stored file again with:

{
  "id": "stored-file-uuid"
}

List files for a thought or project with:

{
  "thought_id": "optional-thought-uuid",
  "project": "optional-project-name",
  "kind": "optional-image-document-audio-file",
  "limit": 20
}

Automatic embedding backfill (optional, config-gated) is enabled in the YAML config:

backfill:
  enabled: true
  run_on_startup: true   # run once on server start
  interval: "15m"        # repeat every 15 minutes
  batch_size: 20
  max_per_run: 100
  include_archived: false

Search fallback: when no embeddings exist for the active model in scope, search_thoughts, recall_context, get_project_context, summarize_thoughts, and related_thoughts automatically fall back to Postgres full-text search so results are never silently empty.
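
One way to picture the search fallback is a strategy switch keyed on how many embeddings exist for the active model in scope. The sketch below is illustrative only: the table and column names in both queries are made up, and `chooseQuery` is a hypothetical function. The `<=>` operator is pgvector's cosine-distance operator, and `plainto_tsquery` and `ts_rank` are standard Postgres full-text functions.

```go
package main

import "fmt"

// Hypothetical queries; table and column names are illustrative only.
const (
	vectorSQL = `SELECT id, content FROM thoughts
WHERE embedding_model = $1
ORDER BY embedding <=> $2
LIMIT $3`

	fullTextSQL = `SELECT id, content FROM thoughts
WHERE to_tsvector('english', content) @@ plainto_tsquery('english', $1)
ORDER BY ts_rank(to_tsvector('english', content), plainto_tsquery('english', $1)) DESC
LIMIT $2`
)

// chooseQuery picks the search strategy from the number of embeddings that
// exist for the active model within the current scope.
func chooseQuery(embeddingsInScope int) (mode, sql string) {
	if embeddingsInScope == 0 {
		// Nothing to compare vectors against, so use full-text search.
		return "fulltext", fullTextSQL
	}
	return "vector", vectorSQL
}

func main() {
	for _, n := range []int{0, 42} {
		mode, _ := chooseQuery(n)
		fmt.Printf("embeddings=%d mode=%s\n", n, mode)
	}
}
```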

Client Setup

Claude Code

# API key auth
claude mcp add --transport http amcs http://localhost:8080/mcp --header "x-brain-key: <key>"

# Bearer token auth
claude mcp add --transport http amcs http://localhost:8080/mcp --header "Authorization: Bearer <token>"

OpenAI Codex

Add to ~/.codex/config.toml:

[[mcp_servers]]
name = "amcs"
url  = "http://localhost:8080/mcp"

[mcp_servers.headers]
x-brain-key = "<key>"

OpenCode

# API key auth
opencode mcp add --name amcs --type remote --url http://localhost:8080/mcp --header "x-brain-key=<key>"

# Bearer token auth
opencode mcp add --name amcs --type remote --url http://localhost:8080/mcp --header "Authorization=Bearer <token>"

Or add directly to opencode.json / ~/.config/opencode/config.json:

{
  "mcp": {
    "amcs": {
      "type": "remote",
      "url": "http://localhost:8080/mcp",
      "headers": {
        "x-brain-key": "<key>"
      }
    }
  }
}

Development

Run the SQL migrations against a local database with:

DATABASE_URL=postgres://... make migrate

LLM integration instructions are served at /llm.

Containers

The repo includes a Dockerfile and Compose files for running the app with Postgres + pgvector.

Set a real LiteLLM key in your shell: export AMCS_LITELLM_API_KEY=your-key
Start the stack with your runtime:
Docker: docker compose -f docker-compose.yml -f docker-compose.docker.yml up --build
Podman: podman compose -f docker-compose.yml up --build
Call the service on http://localhost:8080

Notes:

The app uses configs/docker.yaml inside the container.
The local ./configs directory is mounted into /app/configs, so config edits apply without rebuilding the image.
AMCS_LITELLM_BASE_URL overrides the LiteLLM endpoint, so you can retarget it without editing YAML.
AMCS_OLLAMA_BASE_URL overrides the Ollama endpoint for local or remote servers.
The Compose stack uses a default bridge network named amcs.
The base Compose file uses host.containers.internal, which is Podman-friendly.
The Docker override file adds host-gateway aliases so Docker can resolve the same host endpoint.
Database migrations 001 through 005 run automatically when the Postgres volume is created for the first time.
migrations/006_rls_and_grants.sql is intentionally skipped during container bootstrap because it contains deployment-specific grants for a role named amcs_user.

Ollama

Set ai.provider: "ollama" to use a local or self-hosted Ollama server through its OpenAI-compatible API.

Example:

ai:
  provider: "ollama"
  embeddings:
    model: "nomic-embed-text"
    dimensions: 768
  metadata:
    model: "llama3.2"
    temperature: 0.1
  ollama:
    base_url: "http://localhost:11434/v1"
    api_key: "ollama"
    request_headers: {}