Hein (Warky) 72b4f7ce3d feat: implement file upload handler and related functionality
- Added file upload handler to process both multipart and raw file uploads.
- Implemented parsing logic for upload requests, including handling file metadata.
- Introduced SaveFileDecodedInput structure for handling decoded file uploads.
- Created unit tests for file upload parsing and validation.

feat: add metadata retry configuration and functionality

- Introduced MetadataRetryConfig to the application configuration.
- Implemented MetadataRetryer to handle retrying metadata extraction for thoughts.
- Added new tool for retrying failed metadata extractions.
- Updated thought metadata structure to include status and timestamps for metadata processing.

fix: enhance metadata normalization and error handling

- Updated metadata normalization functions to track status and errors.
- Improved handling of metadata extraction failures during thought updates and captures.
- Ensured that metadata status is correctly set during various operations.

refactor: streamline file saving logic in FilesTool

- Refactored Save method in FilesTool to utilize new SaveDecoded method.
- Simplified project and thought ID resolution logic during file saving.
2026-03-30 22:57:21 +02:00

Avalon Memory Crystal Server (amcs)

A Go MCP server for capturing and retrieving thoughts, memory, and project context. Exposes tools over Streamable HTTP, backed by Postgres with pgvector for semantic search.

What it does

  • Capture thoughts with automatic embedding and metadata extraction
  • Search thoughts semantically via vector similarity
  • Organise thoughts into projects and retrieve full project context
  • Summarise and recall memory across topics and time windows
  • Link related thoughts and traverse relationships

Stack

  • Go — MCP server over Streamable HTTP
  • Postgres + pgvector — storage and vector search
  • LiteLLM — primary hosted AI provider (embeddings + metadata extraction)
  • OpenRouter — default upstream behind LiteLLM
  • Ollama — supported local or self-hosted OpenAI-compatible provider

Tools

Tool                      Purpose
capture_thought           Store a thought with embedding and metadata
search_thoughts           Semantic similarity search
list_thoughts             Filter thoughts by type, topic, person, date
thought_stats             Counts and top topics/people
get_thought               Retrieve a thought by ID
update_thought            Patch content or metadata
delete_thought            Hard delete
archive_thought           Soft delete
create_project            Register a named project
list_projects             List projects with thought counts
get_project_context       Recent + semantic context for a project
set_active_project        Set session project scope
get_active_project        Get current session project
summarize_thoughts        LLM prose summary over a filtered set
recall_context            Semantic + recency context block for injection
link_thoughts             Create a typed relationship between thoughts
related_thoughts          Explicit links + semantic neighbours
save_file                 Store a base64-encoded image, document, audio file, or other binary and optionally link it to a thought
load_file                 Retrieve a stored file by ID as base64 plus metadata
list_files                Browse stored files by thought, project, or kind
backfill_embeddings       Generate missing embeddings for stored thoughts
reparse_thought_metadata  Re-extract and normalize metadata for stored thoughts
retry_failed_metadata     Retry metadata extraction for thoughts still pending or failed

Configuration

Config is YAML-driven. Copy configs/config.example.yaml and set:

  • database.url — Postgres connection string
  • auth.mode — api_keys or oauth_client_credentials
  • auth.keys — API keys for MCP access via x-brain-key or Authorization: Bearer <key> when auth.mode=api_keys
  • auth.oauth.clients — client registry when auth.mode=oauth_client_credentials

OAuth Client Credentials flow (auth.mode=oauth_client_credentials):

  1. Obtain a token — POST /oauth/token (public, no auth required):

    POST /oauth/token
    Content-Type: application/x-www-form-urlencoded
    Authorization: Basic base64(client_id:client_secret)
    
    grant_type=client_credentials
    

    Returns: {"access_token": "...", "token_type": "bearer", "expires_in": 3600}

  2. Use the token on the MCP endpoint:

    Authorization: Bearer <access_token>
    

Alternatively, pass client_id and client_secret as body parameters instead of Authorization: Basic. Direct Authorization: Basic credential validation on the MCP endpoint is also supported as a fallback (no token required).
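The token request in step 1 can be sketched in Go roughly as follows. This is a minimal illustration, not code from the repo: newTokenRequest is a hypothetical helper, and the base URL and client credentials are placeholders.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// newTokenRequest builds the POST /oauth/token request for the client
// credentials grant. baseURL, clientID and clientSecret are placeholders
// supplied by the caller; the helper itself is hypothetical.
func newTokenRequest(baseURL, clientID, clientSecret string) (*http.Request, error) {
	form := url.Values{"grant_type": {"client_credentials"}}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/oauth/token", strings.NewReader(form.Encode()))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
	// Sets "Authorization: Basic base64(client_id:client_secret)".
	req.SetBasicAuth(clientID, clientSecret)
	return req, nil
}

func main() {
	req, err := newTokenRequest("http://localhost:8080", "my-client", "my-secret")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
	fmt.Println(req.Header.Get("Authorization"))
}
```

Sending the request with http.DefaultClient.Do(req) and decoding the JSON body yields the access_token to place in the Authorization: Bearer header on the MCP endpoint.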

  • ai.litellm.base_url and ai.litellm.api_key — LiteLLM proxy
  • ai.ollama.base_url and ai.ollama.api_key — Ollama local or remote server

See llm/plan.md for the full architecture and implementation plan.

Backfill

Run backfill_embeddings after switching embedding models or importing thoughts without vectors.

{
  "project": "optional-project-name",
  "limit": 100,
  "include_archived": false,
  "older_than_days": 0,
  "dry_run": false
}
  • dry_run: true — report counts without calling the embedding provider
  • limit — max thoughts per call (default 100)
  • Embeddings are generated in parallel (4 workers) and upserted; one failure does not abort the run

Metadata Reparse

Run reparse_thought_metadata to fix stale or inconsistent metadata by re-extracting it from thought content.

{
  "project": "optional-project-name",
  "limit": 100,
  "include_archived": false,
  "older_than_days": 0,
  "dry_run": false
}
  • dry_run: true scans only and does not call metadata extraction or write updates
  • If extraction fails for a thought, existing metadata is normalized and written only if it changes
  • Metadata reparse runs in parallel (4 workers); one failure does not abort the run
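The fallback path, where extraction fails and existing metadata is normalized and written only if it changes, might look something like the sketch below. The Metadata shape and the normalization rules (trim, lowercase, de-duplicate, sort) are assumptions made for illustration; the real structure and rules live in the server.

```go
package main

import (
	"fmt"
	"reflect"
	"sort"
	"strings"
)

// Metadata is a hypothetical, simplified metadata shape.
type Metadata struct {
	Type   string
	Topics []string
}

// normalize trims and lowercases values, drops empty and duplicate topics,
// and sorts the remaining topics.
func normalize(m Metadata) Metadata {
	out := Metadata{Type: strings.ToLower(strings.TrimSpace(m.Type))}
	seen := map[string]bool{}
	for _, t := range m.Topics {
		t = strings.ToLower(strings.TrimSpace(t))
		if t != "" && !seen[t] {
			seen[t] = true
			out.Topics = append(out.Topics, t)
		}
	}
	sort.Strings(out.Topics)
	return out
}

// fallbackOnFailure returns the normalized form of the existing metadata and
// reports whether it differs from what is stored, i.e. whether a write is needed.
func fallbackOnFailure(existing Metadata) (Metadata, bool) {
	next := normalize(existing)
	return next, !reflect.DeepEqual(next, existing)
}

func main() {
	clean := Metadata{Type: "note", Topics: []string{"go", "mcp"}}
	messy := Metadata{Type: " Note ", Topics: []string{"MCP", "go", "Go"}}

	_, changed := fallbackOnFailure(clean)
	fmt.Println("clean needs write:", changed)

	next, changed := fallbackOnFailure(messy)
	fmt.Println("messy needs write:", changed, next)
}
```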

Failed Metadata Retry

capture_thought now stores the thought even when metadata extraction times out or fails. Those thoughts are marked with metadata_status: "pending" and retried in the background. Use retry_failed_metadata to sweep any thoughts still marked pending or failed.

{
  "project": "optional-project-name",
  "limit": 100,
  "include_archived": false,
  "older_than_days": 1,
  "dry_run": false
}
  • dry_run: true scans only and does not call metadata extraction or write updates
  • Successful retries mark the thought metadata as complete and clear the last error
  • Failed retries update the retry markers so the daily sweep can pick them up again later

File Storage

Use save_file to persist binary files as base64. Files can optionally be linked to a memory by passing thought_id, which also adds an attachment reference to that thought's metadata.

{
  "name": "meeting-notes.pdf",
  "media_type": "application/pdf",
  "kind": "document",
  "thought_id": "optional-thought-uuid",
  "content_base64": "<base64-payload>"
}

Load a stored file again with:

{
  "id": "stored-file-uuid"
}

List files for a thought or project with:

{
  "thought_id": "optional-thought-uuid",
  "project": "optional-project-name",
  "kind": "optional-image-document-audio-file",
  "limit": 20
}

AMCS also supports direct authenticated HTTP uploads to /files for clients that want to stream file bodies instead of base64-encoding them into an MCP tool call.

Multipart upload:

curl -X POST http://localhost:8080/files \
  -H "x-brain-key: <key>" \
  -F "file=@./diagram.png" \
  -F "project=amcs" \
  -F "kind=image"

Raw body upload:

curl -X POST "http://localhost:8080/files?project=amcs&name=meeting-notes.pdf" \
  -H "x-brain-key: <key>" \
  -H "Content-Type: application/pdf" \
  --data-binary @./meeting-notes.pdf

Automatic backfill and metadata retry (optional, config-gated):

backfill:
  enabled: true
  run_on_startup: true   # run once on server start
  interval: "15m"        # repeat every 15 minutes
  batch_size: 20
  max_per_run: 100
  include_archived: false
metadata_retry:
  enabled: true
  run_on_startup: true   # retry failed metadata once on server start
  interval: "24h"        # retry pending/failed metadata daily
  max_per_run: 100
  include_archived: false

Search fallback: when no embeddings exist for the active model in scope, search_thoughts, recall_context, get_project_context, summarize_thoughts, and related_thoughts automatically fall back to Postgres full-text search so results are never silently empty.
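The fallback decision can be pictured as the following sketch. It assumes the caller already knows how many embeddings exist for the active model in scope; searchWithFallback and the two search callbacks are hypothetical stand-ins for the vector and full-text queries.

```go
package main

import "fmt"

// searchWithFallback uses vector search when embeddings exist for the active
// model in scope and falls back to full-text search otherwise. It also
// returns which path was taken.
func searchWithFallback(
	query string,
	embeddingCount int,
	vector, fullText func(string) ([]string, error),
) ([]string, string, error) {
	if embeddingCount == 0 {
		ids, err := fullText(query)
		return ids, "fulltext", err
	}
	ids, err := vector(query)
	return ids, "vector", err
}

func main() {
	vector := func(q string) ([]string, error) { return []string{"thought-1"}, nil }
	fullText := func(q string) ([]string, error) { return []string{"thought-2"}, nil }

	ids, mode, _ := searchWithFallback("pgvector", 0, vector, fullText)
	fmt.Println(mode, ids)

	ids, mode, _ = searchWithFallback("pgvector", 42, vector, fullText)
	fmt.Println(mode, ids)
}
```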

Client Setup

Claude Code

# API key auth
claude mcp add --transport http amcs http://localhost:8080/mcp --header "x-brain-key: <key>"

# Bearer token auth
claude mcp add --transport http amcs http://localhost:8080/mcp --header "Authorization: Bearer <token>"

OpenAI Codex

Add to ~/.codex/config.toml:

[[mcp_servers]]
name = "amcs"
url  = "http://localhost:8080/mcp"

[mcp_servers.headers]
x-brain-key = "<key>"

OpenCode

# API key auth
opencode mcp add --name amcs --type remote --url http://localhost:8080/mcp --header "x-brain-key=<key>"

# Bearer token auth
opencode mcp add --name amcs --type remote --url http://localhost:8080/mcp --header "Authorization=Bearer <token>"

Or add directly to opencode.json / ~/.config/opencode/config.json:

{
  "mcp": {
    "amcs": {
      "type": "remote",
      "url": "http://localhost:8080/mcp",
      "headers": {
        "x-brain-key": "<key>"
      }
    }
  }
}

Development

Run the SQL migrations against a local database with:

DATABASE_URL=postgres://... make migrate

LLM integration instructions are served at /llm.

Containers

The repo now includes a Dockerfile and Compose files for running the app with Postgres + pgvector.

  1. Set a real LiteLLM key in your shell: export AMCS_LITELLM_API_KEY=your-key
  2. Start the stack with your runtime:

    docker compose -f docker-compose.yml -f docker-compose.docker.yml up --build
    podman compose -f docker-compose.yml up --build
  3. Call the service on http://localhost:8080

Notes:

  • The app uses configs/docker.yaml inside the container.
  • The local ./configs directory is mounted into /app/configs, so config edits apply without rebuilding the image.
  • AMCS_LITELLM_BASE_URL overrides the LiteLLM endpoint, so you can retarget it without editing YAML.
  • AMCS_OLLAMA_BASE_URL overrides the Ollama endpoint for local or remote servers.
  • The Compose stack uses a default bridge network named amcs.
  • The base Compose file uses host.containers.internal, which is Podman-friendly.
  • The Docker override file adds host-gateway aliases so Docker can resolve the same host endpoint.
  • Database migrations 001 through 005 run automatically when the Postgres volume is created for the first time.
  • migrations/006_rls_and_grants.sql is intentionally skipped during container bootstrap because it contains deployment-specific grants for a role named amcs_user.

Ollama

Set ai.provider: "ollama" to use a local or self-hosted Ollama server through its OpenAI-compatible API.

Example:

ai:
  provider: "ollama"
  embeddings:
    model: "nomic-embed-text"
    dimensions: 768
  metadata:
    model: "llama3.2"
    temperature: 0.1
  ollama:
    base_url: "http://localhost:11434/v1"
    api_key: "ollama"
    request_headers: {}

Notes:

  • For remote Ollama servers, point ai.ollama.base_url at the remote /v1 endpoint.
  • The client always sends Bearer auth; Ollama ignores it locally, so api_key: "ollama" is a safe default.
  • ai.embeddings.dimensions must match the embedding model you actually use, or startup will fail the database vector-dimension check.
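A quick way to catch a dimension mismatch before starting the server is to compare the length of one embedding returned by the model against the configured value. The helper below is a hypothetical sketch of that comparison, not the server's startup check.

```go
package main

import "fmt"

// checkEmbeddingDims reports an error when a vector returned by the embedding
// model does not have the configured number of dimensions.
func checkEmbeddingDims(configured int, vector []float32) error {
	if len(vector) != configured {
		return fmt.Errorf("embedding has %d dimensions, config expects %d", len(vector), configured)
	}
	return nil
}

func main() {
	// Stand-in for a vector returned by a 768-dimension embedding model.
	vec := make([]float32, 768)
	fmt.Println(checkEmbeddingDims(768, vec))
	fmt.Println(checkEmbeddingDims(1536, vec))
}
```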