Avalon Memory Crystal Server (amcs)
A Go MCP server for capturing and retrieving thoughts, memory, and project context. Exposes tools over Streamable HTTP, backed by Postgres with pgvector for semantic search.
What it does
- Capture thoughts with automatic embedding and metadata extraction
- Search thoughts semantically via vector similarity
- Organise thoughts into projects and retrieve full project context
- Summarise and recall memory across topics and time windows
- Link related thoughts and traverse relationships
Stack
- Go — MCP server over Streamable HTTP
- Postgres + pgvector — storage and vector search
- LiteLLM — primary hosted AI provider (embeddings + metadata extraction)
- OpenRouter — default upstream behind LiteLLM
- Ollama — supported local or self-hosted OpenAI-compatible provider
Tools
| Tool | Purpose |
|---|---|
| capture_thought | Store a thought with embedding and metadata |
| search_thoughts | Semantic similarity search |
| list_thoughts | Filter thoughts by type, topic, person, date |
| thought_stats | Counts and top topics/people |
| get_thought | Retrieve a thought by ID |
| update_thought | Patch content or metadata |
| delete_thought | Hard delete |
| archive_thought | Soft delete |
| create_project | Register a named project |
| list_projects | List projects with thought counts |
| get_project_context | Recent + semantic context for a project |
| set_active_project | Set session project scope |
| get_active_project | Get current session project |
| summarize_thoughts | LLM prose summary over a filtered set |
| recall_context | Semantic + recency context block for injection |
| link_thoughts | Create a typed relationship between thoughts |
| related_thoughts | Explicit links + semantic neighbours |
| save_file | Store a base64-encoded image, document, audio file, or other binary and optionally link it to a thought |
| load_file | Retrieve a stored file by ID as base64 plus metadata |
| list_files | Browse stored files by thought, project, or kind |
| backfill_embeddings | Generate missing embeddings for stored thoughts |
| reparse_thought_metadata | Re-extract and normalize metadata for stored thoughts |
Configuration
Config is YAML-driven. Copy configs/config.example.yaml and set:
- database.url: Postgres connection string
- auth.mode: api_keys or oauth_client_credentials
- auth.keys: API keys for MCP access via x-brain-key or Authorization: Bearer <key> when auth.mode=api_keys
- auth.oauth.clients: client registry when auth.mode=oauth_client_credentials
OAuth Client Credentials flow (auth.mode=oauth_client_credentials):
- Obtain a token with POST /oauth/token (public, no auth required):

      POST /oauth/token
      Content-Type: application/x-www-form-urlencoded
      Authorization: Basic base64(client_id:client_secret)

      grant_type=client_credentials

  Returns:

      {"access_token": "...", "token_type": "bearer", "expires_in": 3600}

- Use the token on the MCP endpoint:

      Authorization: Bearer <access_token>

Alternatively, pass client_id and client_secret as body parameters instead of Authorization: Basic. Direct Authorization: Basic credential validation on the MCP endpoint is also supported as a fallback (no token required).
- ai.litellm.base_url and ai.litellm.api_key: LiteLLM proxy
- ai.ollama.base_url and ai.ollama.api_key: Ollama local or remote server
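The client-credentials flow described above can be sketched in Go as follows. This is a minimal illustration, not the project's client: the helper names newTokenRequest and parseTokenResponse, the base URL, and the sample credentials are assumptions, and main only builds the request and parses a sample reply rather than contacting a server.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// tokenResponse mirrors the JSON reply shown in the flow description.
type tokenResponse struct {
	AccessToken string `json:"access_token"`
	TokenType   string `json:"token_type"`
	ExpiresIn   int    `json:"expires_in"`
}

// newTokenRequest builds POST /oauth/token with HTTP Basic client
// authentication. baseURL is an assumption, e.g. http://localhost:8080.
func newTokenRequest(baseURL, clientID, clientSecret string) (*http.Request, error) {
	form := url.Values{"grant_type": {"client_credentials"}}
	req, err := http.NewRequest(http.MethodPost,
		strings.TrimRight(baseURL, "/")+"/oauth/token",
		strings.NewReader(form.Encode()))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
	req.SetBasicAuth(clientID, clientSecret)
	return req, nil
}

// parseTokenResponse decodes the token endpoint's JSON reply and
// rejects a reply that carries no access token.
func parseTokenResponse(body []byte) (tokenResponse, error) {
	var tok tokenResponse
	if err := json.Unmarshal(body, &tok); err != nil {
		return tokenResponse{}, err
	}
	if tok.AccessToken == "" {
		return tokenResponse{}, fmt.Errorf("token response has no access_token")
	}
	return tok, nil
}

func main() {
	// Hypothetical client credentials; real ones come from auth.oauth.clients.
	req, err := newTokenRequest("http://localhost:8080", "my-client", "my-secret")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())

	// Sample reply in the documented shape; no network call is made here.
	tok, err := parseTokenResponse([]byte(`{"access_token": "abc", "token_type": "bearer", "expires_in": 3600}`))
	if err != nil {
		panic(err)
	}
	fmt.Println("Authorization: Bearer " + tok.AccessToken)
}
```

To use it against a running server, send the request with http.DefaultClient.Do and pass the response body to parseTokenResponse.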
See llm/plan.md for full architecture and implementation plan.
Backfill
Run backfill_embeddings after switching embedding models or importing thoughts without vectors.
{
"project": "optional-project-name",
"limit": 100,
"include_archived": false,
"older_than_days": 0,
"dry_run": false
}
- dry_run: true reports counts without calling the embedding provider
- limit: max thoughts per call (default 100)
- Embeddings are generated in parallel (4 workers) and upserted; one failure does not abort the run
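The "parallel workers, one failure does not abort the run" behaviour can be illustrated with a small worker pool. This is a sketch of the general pattern under assumed names (processAll, errProvider, and the embed stand-in are hypothetical), not the server's actual implementation.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errProvider is a made-up failure used only for this demonstration.
var errProvider = errors.New("provider timeout")

// processAll runs fn for every id on a fixed number of worker goroutines.
// A failing id is recorded and the remaining ids are still processed.
func processAll(ids []string, workers int, fn func(id string) error) (int, map[string]error) {
	if workers < 1 {
		workers = 1
	}
	jobs := make(chan string)
	failed := make(map[string]error)
	succeeded := 0
	var mu sync.Mutex
	var wg sync.WaitGroup

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for id := range jobs {
				err := fn(id)
				mu.Lock()
				if err != nil {
					failed[id] = err
				} else {
					succeeded++
				}
				mu.Unlock()
			}
		}()
	}
	for _, id := range ids {
		jobs <- id
	}
	close(jobs)
	wg.Wait()
	return succeeded, failed
}

func main() {
	// embed stands in for "generate an embedding and upsert it".
	embed := func(id string) error {
		if id == "t3" {
			return errProvider
		}
		return nil
	}
	ok, failed := processAll([]string{"t1", "t2", "t3", "t4", "t5"}, 4, embed)
	fmt.Printf("embedded=%d failed=%d\n", ok, len(failed))
}
```

Collecting errors per item instead of returning on the first one is what lets a single provider failure leave the rest of the batch unaffected.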
Metadata Reparse
Run reparse_thought_metadata to fix stale or inconsistent metadata by re-extracting it from thought content.
{
"project": "optional-project-name",
"limit": 100,
"include_archived": false,
"older_than_days": 0,
"dry_run": false
}
- dry_run: true scans only and does not call metadata extraction or write updates
- If extraction fails for a thought, existing metadata is normalized and written only if it changes
- Metadata reparse runs in parallel (4 workers); one failure does not abort the run
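The extraction-failure fallback above might look roughly like the following. The Metadata shape and the normalization rules (trim, drop empties and duplicates, sort) are assumptions made for illustration; the README does not describe the real schema or rules.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"strings"
)

// errExtract is a made-up extraction failure for the demonstration.
var errExtract = errors.New("extraction failed")

// Metadata is a hypothetical shape; the real schema is not shown here.
type Metadata struct {
	Topics []string
	People []string
}

// normalizeList trims values, drops empties and duplicates, and sorts.
func normalizeList(in []string) []string {
	seen := make(map[string]bool)
	out := []string{}
	for _, v := range in {
		v = strings.TrimSpace(v)
		if v == "" || seen[v] {
			continue
		}
		seen[v] = true
		out = append(out, v)
	}
	sort.Strings(out)
	return out
}

// equalLists reports whether two lists hold the same values in order.
func equalLists(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// reparse returns the metadata to store and whether a write is needed.
// extract stands in for the metadata extraction call and may fail.
func reparse(existing Metadata, extract func() (Metadata, error)) (Metadata, bool) {
	next, err := extract()
	if err != nil {
		next = existing // fall back to what is already stored
	}
	next = Metadata{Topics: normalizeList(next.Topics), People: normalizeList(next.People)}
	changed := !equalLists(next.Topics, existing.Topics) || !equalLists(next.People, existing.People)
	return next, changed
}

func main() {
	stored := Metadata{Topics: []string{" go ", "go", "mcp"}}
	failing := func() (Metadata, error) { return Metadata{}, errExtract }
	next, changed := reparse(stored, failing)
	fmt.Println(next.Topics, changed)
}
```

Returning a changed flag keeps the write conditional, so a thought whose metadata is already normalized is left untouched.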
File Storage
Use save_file to persist binary files as base64. Files can optionally be linked to a memory by passing thought_id, which also adds an attachment reference to that thought's metadata.
{
"name": "meeting-notes.pdf",
"media_type": "application/pdf",
"kind": "document",
"thought_id": "optional-thought-uuid",
"content_base64": "<base64-payload>"
}
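A client could build these arguments from raw bytes as sketched below. The field names come from the example above; the helper newSaveFileArgs is hypothetical, and the use of standard padded base64 is an assumption, since the README does not say which base64 variant the server expects.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// saveFileArgs mirrors the save_file arguments shown above.
type saveFileArgs struct {
	Name          string `json:"name"`
	MediaType     string `json:"media_type"`
	Kind          string `json:"kind"`
	ThoughtID     string `json:"thought_id,omitempty"` // optional link to a thought
	ContentBase64 string `json:"content_base64"`
}

// newSaveFileArgs encodes data with standard base64 (an assumption).
func newSaveFileArgs(name, mediaType, kind string, data []byte) saveFileArgs {
	return saveFileArgs{
		Name:          name,
		MediaType:     mediaType,
		Kind:          kind,
		ContentBase64: base64.StdEncoding.EncodeToString(data),
	}
}

func main() {
	// Placeholder bytes; a real caller would read the file from disk.
	args := newSaveFileArgs("meeting-notes.pdf", "application/pdf", "document", []byte("%PDF-1.7 ..."))
	out, err := json.MarshalIndent(args, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```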
Load a stored file again with:
{
"id": "stored-file-uuid"
}
List files for a thought or project with:
{
"thought_id": "optional-thought-uuid",
"project": "optional-project-name",
"kind": "optional-image-document-audio-file",
"limit": 20
}
Automatic backfill (optional, config-gated):
backfill:
  enabled: true
  run_on_startup: true    # run once on server start
  interval: "15m"         # repeat every 15 minutes
  batch_size: 20
  max_per_run: 100
  include_archived: false
Search fallback: when no embeddings exist for the active model in scope, search_thoughts, recall_context, get_project_context, summarize_thoughts, and related_thoughts automatically fall back to Postgres full-text search so results are never silently empty.
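The fallback decision can be pictured as a small dispatch function. This is a sketch only: searchFunc, searchWithFallback, and the idea of passing an embedding count are assumptions for illustration, and the stand-in backends return fixed strings instead of querying Postgres.

```go
package main

import "fmt"

// searchFunc is a hypothetical signature shared by both search backends.
type searchFunc func(query string) ([]string, error)

// searchWithFallback uses semantic search when embeddings exist for the
// active model in scope, and full-text search otherwise. It also returns
// which mode was used so callers can report it.
func searchWithFallback(embeddingCount int, semantic, fullText searchFunc, query string) ([]string, string, error) {
	if embeddingCount == 0 {
		res, err := fullText(query)
		return res, "fulltext", err
	}
	res, err := semantic(query)
	return res, "semantic", err
}

func main() {
	// Stand-in backends that only label their results.
	semantic := func(q string) ([]string, error) { return []string{"vector hit for " + q}, nil }
	fullText := func(q string) ([]string, error) { return []string{"text hit for " + q}, nil }

	res, mode, err := searchWithFallback(0, semantic, fullText, "release plan")
	if err != nil {
		panic(err)
	}
	fmt.Println(mode, res)
}
```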
Client Setup
Claude Code
# API key auth
claude mcp add --transport http amcs http://localhost:8080/mcp --header "x-brain-key: <key>"
# Bearer token auth
claude mcp add --transport http amcs http://localhost:8080/mcp --header "Authorization: Bearer <token>"
OpenAI Codex
Add to ~/.codex/config.toml:
[[mcp_servers]]
name = "amcs"
url = "http://localhost:8080/mcp"
[mcp_servers.headers]
x-brain-key = "<key>"
OpenCode
# API key auth
opencode mcp add --name amcs --type remote --url http://localhost:8080/mcp --header "x-brain-key=<key>"
# Bearer token auth
opencode mcp add --name amcs --type remote --url http://localhost:8080/mcp --header "Authorization=Bearer <token>"
Or add directly to opencode.json / ~/.config/opencode/config.json:
{
"mcp": {
"amcs": {
"type": "remote",
"url": "http://localhost:8080/mcp",
"headers": {
"x-brain-key": "<key>"
}
}
}
}
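For a client without built-in MCP support, the same headers can be attached to a hand-built request. The sketch below assumes the standard MCP JSON-RPC tools/call envelope and the backfill_embeddings arguments documented earlier; newToolCall is a hypothetical helper, and a real session would normally begin with the MCP initialize handshake, which is omitted here.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// rpcRequest is a JSON-RPC 2.0 envelope as used by MCP.
type rpcRequest struct {
	JSONRPC string         `json:"jsonrpc"`
	ID      int            `json:"id"`
	Method  string         `json:"method"`
	Params  toolCallParams `json:"params"`
}

// toolCallParams names the tool and carries its arguments.
type toolCallParams struct {
	Name      string                 `json:"name"`
	Arguments map[string]interface{} `json:"arguments"`
}

// newToolCall builds a tools/call request. One of apiKey or bearer is
// expected; bearer takes precedence if both are set.
func newToolCall(endpoint, apiKey, bearer, tool string, args map[string]interface{}) (*http.Request, error) {
	body, err := json.Marshal(rpcRequest{
		JSONRPC: "2.0",
		ID:      1,
		Method:  "tools/call",
		Params:  toolCallParams{Name: tool, Arguments: args},
	})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, endpoint, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Accept", "application/json, text/event-stream")
	if bearer != "" {
		req.Header.Set("Authorization", "Bearer "+bearer)
	} else {
		req.Header.Set("x-brain-key", apiKey)
	}
	return req, nil
}

func main() {
	req, err := newToolCall("http://localhost:8080/mcp", "<key>", "", "backfill_embeddings",
		map[string]interface{}{"dry_run": true, "limit": 100})
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL, req.Header.Get("x-brain-key"))
}
```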
Development
Run the SQL migrations against a local database with:
DATABASE_URL=postgres://... make migrate
LLM integration instructions are served at /llm.
Containers
The repo includes a Dockerfile and Compose files for running the app with Postgres + pgvector.
- Set a real LiteLLM key in your shell:

      export AMCS_LITELLM_API_KEY=your-key

- Start the stack with your runtime:

      docker compose -f docker-compose.yml -f docker-compose.docker.yml up --build
      podman compose -f docker-compose.yml up --build

- Call the service on http://localhost:8080
Notes:
- The app uses configs/docker.yaml inside the container.
- The local ./configs directory is mounted into /app/configs, so config edits apply without rebuilding the image.
- AMCS_LITELLM_BASE_URL overrides the LiteLLM endpoint, so you can retarget it without editing YAML.
- AMCS_OLLAMA_BASE_URL overrides the Ollama endpoint for local or remote servers.
- The Compose stack uses a default bridge network named amcs.
- The base Compose file uses host.containers.internal, which is Podman-friendly.
- The Docker override file adds host-gateway aliases so Docker can resolve the same host endpoint.
- Database migrations 001 through 005 run automatically when the Postgres volume is created for the first time.
- migrations/006_rls_and_grants.sql is intentionally skipped during container bootstrap because it contains deployment-specific grants for a role named amcs_user.
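The environment-variable overrides in the notes above follow a common "env wins over file" pattern, sketched here. resolveBaseURL and the sample YAML value are hypothetical, and whether the server treats an empty variable as unset is an assumption.

```go
package main

import (
	"fmt"
	"os"
)

// resolveBaseURL returns the environment override when it is set and
// non-empty, and the YAML value otherwise. lookup is injectable so the
// function can be exercised without touching the process environment.
func resolveBaseURL(lookup func(string) (string, bool), envKey, yamlValue string) string {
	if v, ok := lookup(envKey); ok && v != "" {
		return v
	}
	return yamlValue
}

func main() {
	// "http://litellm:4000/v1" is a made-up YAML value for illustration.
	fmt.Println(resolveBaseURL(os.LookupEnv, "AMCS_LITELLM_BASE_URL", "http://litellm:4000/v1"))
}
```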
Ollama
Set ai.provider: "ollama" to use a local or self-hosted Ollama server through its OpenAI-compatible API.
Example:
ai:
  provider: "ollama"
  embeddings:
    model: "nomic-embed-text"
    dimensions: 768
  metadata:
    model: "llama3.2"
    temperature: 0.1
  ollama:
    base_url: "http://localhost:11434/v1"
    api_key: "ollama"
    request_headers: {}
Notes:
- For remote Ollama servers, point ai.ollama.base_url at the remote /v1 endpoint.
- The client always sends Bearer auth; Ollama ignores it locally, so api_key: "ollama" is a safe default.
- ai.embeddings.dimensions must match the embedding model you actually use, or startup will fail the database vector-dimension check.
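One way to catch a dimension mismatch before starting the server is to compare the configured value with the length of a vector the model actually returns. The sketch below is a pre-flight check under assumptions, not the server's own startup check (which, per the note above, runs against the database): checkDimensions is a hypothetical helper, and it parses only the data[].embedding field of an OpenAI-compatible embeddings reply.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// embeddingsResponse covers only the fields used here from an
// OpenAI-compatible embeddings reply.
type embeddingsResponse struct {
	Data []struct {
		Embedding []float32 `json:"embedding"`
	} `json:"data"`
}

// checkDimensions compares the configured ai.embeddings.dimensions
// value with the length of the first vector in the reply body.
func checkDimensions(configured int, body []byte) error {
	var resp embeddingsResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return err
	}
	if len(resp.Data) == 0 {
		return fmt.Errorf("embeddings response contained no vectors")
	}
	if got := len(resp.Data[0].Embedding); got != configured {
		return fmt.Errorf("ai.embeddings.dimensions is %d but the model returned %d", configured, got)
	}
	return nil
}

func main() {
	// A shortened sample reply; a real nomic-embed-text vector is much longer.
	sample := []byte(`{"data":[{"embedding":[0.1,0.2,0.3]}]}`)
	fmt.Println(checkDimensions(3, sample))
	fmt.Println(checkDimensions(768, sample))
}
```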
