
Avalon Memory Crystal Server (amcs)

OB1 in Go — LiteLLM-First Implementation Plan

Based on the Open Brain project; reference it for details: https://github.com/NateBJones-Projects/OB1

Objective

Build a Go implementation of the OB1 project with:

  • LiteLLM as the primary AI provider

  • OpenRouter as the default upstream behind LiteLLM

  • config-file-based keys and auth tokens

  • MCP over Streamable HTTP

  • Postgres with pgvector

  • parity with the current OB1 toolset:

    • search_thoughts
    • list_thoughts
    • thought_stats
    • capture_thought
  • extended toolset for memory and project management:

    • get_thought
    • update_thought
    • delete_thought
    • archive_thought
    • create_project
    • list_projects
    • get_project_context
    • set_active_project
    • get_active_project
    • summarize_thoughts
    • recall_context
    • link_thoughts
    • related_thoughts

The current OB1 reference implementation is a small MCP server backed by a thoughts table in Postgres, a match_thoughts(...) vector-search function, and OpenRouter calls for embeddings plus metadata extraction.


Why LiteLLM should be the primary provider

LiteLLM is the right primary abstraction because it gives one stable OpenAI-compatible API surface while allowing routing to multiple upstream providers, including OpenRouter. LiteLLM documents OpenAI-compatible proxy endpoints, including /v1/embeddings, and explicitly supports OpenRouter-backed models.

That gives us:

  • one provider contract in the Go app
  • centralized key management
  • easier model swaps
  • support for multiple upstreams later
  • simpler production operations

Provider strategy

Primary runtime mode

  • App -> LiteLLM Proxy -> OpenRouter / other providers

Fallback mode

  • App -> OpenRouter directly

We will support both, but the codebase will be designed LiteLLM-first.


Scope of the Go service

The Go service will provide:

  1. MCP server over Streamable HTTP
  2. API-key authentication
  3. thought capture
  4. semantic search
  5. thought listing
  6. thought statistics
  7. thought lifecycle management (get, update, delete, archive)
  8. project grouping and context (create, list, get context, active project)
  9. memory summarization and context recall
  10. thought linking and relationship traversal
  11. provider abstraction for embeddings + metadata extraction
  12. config-file-driven startup

Functional requirements

Required v1 features

  • start from YAML config

  • connect to Postgres

  • use pgvector for embeddings

  • call LiteLLM for:

    • embeddings
    • metadata extraction
  • expose MCP tools over HTTP

  • protect the MCP endpoint with configured API keys

  • preserve OB1-compatible tool semantics

  • store thoughts with metadata and embeddings

  • search via match_thoughts(...)

Deferred features

  • Slack ingestion
  • webhook ingestion
  • async metadata extraction
  • per-user tenancy
  • admin UI
  • background enrichment jobs
  • multi-provider routing policy inside the app

Reference behavior to preserve

The current OB1 server is small and direct:

  • it stores thoughts in a thoughts table
  • it uses vector similarity for semantic search
  • it exposes four MCP tools
  • it uses an access key for auth
  • it generates embeddings and extracts metadata via OpenRouter

The Go version should preserve those behaviors first, then improve structure and operability.


Architecture

                +----------------------+
                | MCP Client / AI App  |
                +----------+-----------+
                           |
                           | Streamable HTTP
                           v
                +----------------------+
                | Go OB1 Server        |
                | auth + MCP tools     |
                +----+-----------+-----+
                     |           |
                     |           |
                     v           v
          +----------------+   +----------------------+
          | LiteLLM Proxy  |   | Postgres + pgvector  |
          | embeddings     |   | thoughts + pgvector  |
          | metadata       |   | RPC/search SQL       |
          +--------+-------+   +----------------------+
                   |
                   v
            +-------------+
            | OpenRouter  |
            | or others   |
            +-------------+

High-level design

Core components

  1. Config subsystem

    • load YAML
    • apply env overrides
    • validate required fields
  2. Auth subsystem

    • API-key validation
    • header-based auth
    • optional query-param auth
  3. AI provider subsystem

    • provider interface
    • LiteLLM implementation
    • optional OpenRouter direct implementation
  4. Store subsystem

    • Postgres connection pool
    • insert/search/list/stats operations
    • pgvector support
  5. MCP subsystem

    • MCP server
    • tool registration
    • HTTP transport
  6. Observability subsystem

    • structured logs
    • metrics
    • health checks

Project layout

ob1-go/
  cmd/
    ob1-server/
      main.go

  internal/
    app/
      app.go

    config/
      config.go
      loader.go
      validate.go

    auth/
      middleware.go
      keyring.go

    ai/
      provider.go
      factory.go
      prompts.go
      types.go

      litellm/
        client.go
        embeddings.go
        metadata.go

      openrouter/
        client.go
        embeddings.go
        metadata.go

    mcpserver/
      server.go
      transport.go

    tools/
      search.go
      list.go
      stats.go
      capture.go
      get.go
      update.go
      delete.go
      archive.go
      projects.go
      context.go
      summarize.go
      recall.go
      links.go

    store/
      db.go
      thoughts.go
      stats.go
      projects.go
      links.go

    metadata/
      schema.go
      normalize.go
      validate.go

    types/
      thought.go
      filters.go

    observability/
      logger.go
      metrics.go
      tracing.go

  migrations/
    001_enable_vector.sql
    002_create_thoughts.sql
    003_add_projects.sql
    004_create_thought_links.sql
    005_create_match_thoughts.sql
    006_rls_and_grants.sql

  configs/
    config.example.yaml
    dev.yaml

  scripts/
    run-local.sh
    migrate.sh

  go.mod
  README.md

Dependencies

Required Go packages

  • github.com/modelcontextprotocol/go-sdk
  • github.com/jackc/pgx/v5
  • github.com/pgvector/pgvector-go
  • gopkg.in/yaml.v3
  • github.com/go-playground/validator/v10
  • github.com/google/uuid

Standard library usage

  • net/http
  • context
  • log/slog
  • time
  • encoding/json

The Go MCP SDK is the right fit for implementing an MCP server in Go, and pgvector-go is the expected library for Go integration with pgvector-backed Postgres columns.


Config model

Config files are the primary source of truth.

Rules

  • use YAML config files
  • allow environment overrides
  • do not commit real secrets
  • commit only config.example.yaml
  • keep local secrets in ignored files
  • in production, mount config files as secrets or use env overrides for sensitive values

Example config

server:
  host: "0.0.0.0"
  port: 8080
  read_timeout: "15s"
  write_timeout: "30s"
  idle_timeout: "60s"
  allowed_origins:
    - "*"

mcp:
  path: "/mcp"
  server_name: "open-brain"
  version: "1.0.0"
  transport: "streamable_http"

auth:
  mode: "api_keys"
  header_name: "x-brain-key"
  query_param: "key"
  allow_query_param: false
  keys:
    - id: "local-client"
      value: "replace-me"
      description: "main local client key"

database:
  url: "postgres://user:pass@localhost:5432/ob1?sslmode=disable"
  max_conns: 10
  min_conns: 2
  max_conn_lifetime: "30m"
  max_conn_idle_time: "10m"

ai:
  provider: "litellm"

  embeddings:
    model: "openai/text-embedding-3-small"
    dimensions: 1536

  metadata:
    model: "gpt-4o-mini"
    temperature: 0.1

  litellm:
    base_url: "http://localhost:4000/v1"
    api_key: "replace-me"
    use_responses_api: false
    request_headers: {}
    embedding_model: "openrouter/openai/text-embedding-3-small"
    metadata_model: "gpt-4o-mini"

  openrouter:
    base_url: "https://openrouter.ai/api/v1"
    api_key: ""
    app_name: "ob1-go"
    site_url: ""
    extra_headers: {}

capture:
  source: "mcp"
  metadata_defaults:
    type: "observation"
    topic_fallback: "uncategorized"

search:
  default_limit: 10
  default_threshold: 0.5
  max_limit: 50

logging:
  level: "info"
  format: "json"

observability:
  metrics_enabled: true
  pprof_enabled: false

Config structs

type Config struct {
    Server        ServerConfig        `yaml:"server"`
    MCP           MCPConfig           `yaml:"mcp"`
    Auth          AuthConfig          `yaml:"auth"`
    Database      DatabaseConfig      `yaml:"database"`
    AI            AIConfig            `yaml:"ai"`
    Capture       CaptureConfig       `yaml:"capture"`
    Search        SearchConfig        `yaml:"search"`
    Logging       LoggingConfig       `yaml:"logging"`
    Observability ObservabilityConfig `yaml:"observability"`
}

type ServerConfig struct {
    Host           string        `yaml:"host"`
    Port           int           `yaml:"port"`
    ReadTimeout    time.Duration `yaml:"read_timeout"`
    WriteTimeout   time.Duration `yaml:"write_timeout"`
    IdleTimeout    time.Duration `yaml:"idle_timeout"`
    AllowedOrigins []string      `yaml:"allowed_origins"`
}

type MCPConfig struct {
    Path       string `yaml:"path"`
    ServerName string `yaml:"server_name"`
    Version    string `yaml:"version"`
    Transport  string `yaml:"transport"`
}

type AuthConfig struct {
    Mode            string   `yaml:"mode"`
    HeaderName      string   `yaml:"header_name"`
    QueryParam      string   `yaml:"query_param"`
    AllowQueryParam bool     `yaml:"allow_query_param"`
    Keys            []APIKey `yaml:"keys"`
}

type APIKey struct {
    ID          string `yaml:"id"`
    Value       string `yaml:"value"`
    Description string `yaml:"description"`
}

type DatabaseConfig struct {
    URL             string        `yaml:"url"`
    MaxConns        int32         `yaml:"max_conns"`
    MinConns        int32         `yaml:"min_conns"`
    MaxConnLifetime time.Duration `yaml:"max_conn_lifetime"`
    MaxConnIdleTime time.Duration `yaml:"max_conn_idle_time"`
}

type AIConfig struct {
    Provider   string              `yaml:"provider"` // litellm | openrouter
    Embeddings AIEmbeddingConfig   `yaml:"embeddings"`
    Metadata   AIMetadataConfig    `yaml:"metadata"`
    LiteLLM    LiteLLMConfig       `yaml:"litellm"`
    OpenRouter OpenRouterAIConfig  `yaml:"openrouter"`
}

type AIEmbeddingConfig struct {
    Model      string `yaml:"model"`
    Dimensions int    `yaml:"dimensions"`
}

type AIMetadataConfig struct {
    Model       string  `yaml:"model"`
    Temperature float64 `yaml:"temperature"`
}

type LiteLLMConfig struct {
    BaseURL         string            `yaml:"base_url"`
    APIKey          string            `yaml:"api_key"`
    UseResponsesAPI bool              `yaml:"use_responses_api"`
    RequestHeaders  map[string]string `yaml:"request_headers"`
    EmbeddingModel  string            `yaml:"embedding_model"`
    MetadataModel   string            `yaml:"metadata_model"`
}

type OpenRouterAIConfig struct {
    BaseURL      string            `yaml:"base_url"`
    APIKey       string            `yaml:"api_key"`
    AppName      string            `yaml:"app_name"`
    SiteURL      string            `yaml:"site_url"`
    ExtraHeaders map[string]string `yaml:"extra_headers"`
}

Config precedence

Order

  1. --config /path/to/file.yaml
  2. OB1_CONFIG
  3. default ./configs/dev.yaml
  4. environment overrides for specific fields

Suggested env overrides

  • OB1_DATABASE_URL
  • OB1_LITELLM_API_KEY
  • OB1_OPENROUTER_API_KEY
  • OB1_SERVER_PORT

Validation rules

At startup, fail fast if:

  • database.url is empty
  • auth.keys is empty
  • mcp.path is empty
  • ai.provider is unsupported
  • ai.embeddings.dimensions <= 0
  • provider-specific base URL or API key is missing
  • the DB vector dimension does not match configured embedding dimensions
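The first five checks can be sketched as one fail-fast function (field names are illustrative; the DB-vs-config dimension check runs later, after the pool is created, since it needs a live connection):

```go
package main

import (
	"errors"
	"fmt"
)

// Trimmed view of the config fields the startup checks need.
type Config struct {
	DatabaseURL  string
	AuthKeys     []string
	MCPPath      string
	AIProvider   string
	EmbeddingDim int
}

// validate applies the fail-fast rules; the caller aborts startup
// when it returns a non-nil error.
func validate(cfg Config) error {
	var errs []error
	if cfg.DatabaseURL == "" {
		errs = append(errs, errors.New("database.url is empty"))
	}
	if len(cfg.AuthKeys) == 0 {
		errs = append(errs, errors.New("auth.keys is empty"))
	}
	if cfg.MCPPath == "" {
		errs = append(errs, errors.New("mcp.path is empty"))
	}
	switch cfg.AIProvider {
	case "litellm", "openrouter":
	default:
		errs = append(errs, fmt.Errorf("unsupported ai.provider: %q", cfg.AIProvider))
	}
	if cfg.EmbeddingDim <= 0 {
		errs = append(errs, errors.New("ai.embeddings.dimensions must be > 0"))
	}
	return errors.Join(errs...) // nil when every check passed
}

func main() {
	err := validate(Config{
		DatabaseURL: "postgres://localhost/ob1",
		AuthKeys:    []string{"k"},
		MCPPath:     "/mcp",
		AIProvider:  "litellm",
		EmbeddingDim: 1536,
	})
	fmt.Println(err) // <nil>
}
```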

AI provider design

Provider interface

type Provider interface {
    Embed(ctx context.Context, input string) ([]float32, error)
    ExtractMetadata(ctx context.Context, input string) (ThoughtMetadata, error)
    Name() string
}

Factory

func NewProvider(cfg AIConfig, httpClient *http.Client, log *slog.Logger) (Provider, error) {
    switch cfg.Provider {
    case "litellm":
        return litellm.New(cfg, httpClient, log)
    case "openrouter":
        return openrouter.New(cfg, httpClient, log)
    default:
        return nil, fmt.Errorf("unsupported ai.provider: %s", cfg.Provider)
    }
}

LiteLLM-first behavior

Embeddings

The app will call LiteLLM at:

  • POST /v1/embeddings

using an OpenAI-compatible request payload and Bearer auth. LiteLLM documents its proxy embeddings support through OpenAI-compatible endpoints.

Metadata extraction

The app will call LiteLLM at:

  • POST /v1/chat/completions

using:

  • configured metadata model
  • system prompt
  • user message
  • JSON-oriented response handling

LiteLLM's proxy accepts OpenAI-style chat completion requests.

Model routing

In config, use:

  • litellm.embedding_model
  • litellm.metadata_model

These may be:

  • direct model names
  • LiteLLM aliases
  • OpenRouter-backed model identifiers

Example:

litellm:
  embedding_model: "openrouter/openai/text-embedding-3-small"
  metadata_model: "gpt-4o-mini"

LiteLLM documents OpenRouter provider usage and OpenRouter-backed model naming.


OpenRouter fallback mode

If ai.provider: openrouter, the app will directly call:

  • POST /api/v1/embeddings
  • POST /api/v1/chat/completions

with Bearer auth.

OpenRouter documents the embeddings endpoint and its authentication model.

This mode is mainly for:

  • local simplicity
  • debugging provider issues
  • deployments without LiteLLM

Metadata schema

Use one stable metadata schema regardless of provider.

type ThoughtMetadata struct {
    People         []string `json:"people"`
    ActionItems    []string `json:"action_items"`
    DatesMentioned []string `json:"dates_mentioned"`
    Topics         []string `json:"topics"`
    Type           string   `json:"type"`
    Source         string   `json:"source"`
}

Accepted type values

  • observation
  • task
  • idea
  • reference
  • person_note

Normalization rules

  • trim all strings
  • deduplicate arrays
  • drop empty values
  • cap topics count if needed
  • default invalid type to observation
  • set source: "mcp" for MCP-captured thoughts
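The list and type rules above are small pure functions, which keeps them easy to unit test. A sketch (helper names are illustrative):

```go
package main

import (
	"fmt"
	"strings"
)

// Accepted metadata types from the schema section.
var validTypes = map[string]bool{
	"observation": true, "task": true, "idea": true,
	"reference": true, "person_note": true,
}

// normalizeList trims entries, drops empties, and deduplicates
// while preserving first-seen order.
func normalizeList(in []string) []string {
	seen := map[string]bool{}
	out := []string{}
	for _, s := range in {
		s = strings.TrimSpace(s)
		if s == "" || seen[s] {
			continue
		}
		seen[s] = true
		out = append(out, s)
	}
	return out
}

// normalizeType defaults anything outside the accepted set to
// "observation".
func normalizeType(t string) string {
	t = strings.TrimSpace(t)
	if !validTypes[t] {
		return "observation"
	}
	return t
}

func main() {
	fmt.Println(normalizeList([]string{" alice ", "bob", "alice", ""})) // [alice bob]
	fmt.Println(normalizeType("rant"))                                  // observation
}
```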

Fallback defaults

If metadata extraction fails:

{
  "people": [],
  "action_items": [],
  "dates_mentioned": [],
  "topics": ["uncategorized"],
  "type": "observation",
  "source": "mcp"
}

Database design

The DB contract should match the current OB1 structure as closely as possible:

  • thoughts table
  • embedding vector(1536)
  • HNSW index
  • metadata JSONB
  • match_thoughts(...) function

Migrations

001_enable_vector.sql

create extension if not exists vector;

002_create_thoughts.sql

create table if not exists thoughts (
  id uuid default gen_random_uuid() primary key,
  content text not null,
  embedding vector(1536),
  metadata jsonb default '{}'::jsonb,
  created_at timestamptz default now(),
  updated_at timestamptz default now()
);

create index if not exists thoughts_embedding_hnsw_idx
  on thoughts using hnsw (embedding vector_cosine_ops);

create index if not exists thoughts_metadata_gin_idx
  on thoughts using gin (metadata);

create index if not exists thoughts_created_at_idx
  on thoughts (created_at desc);

003_add_projects.sql

create table if not exists projects (
  id          uuid default gen_random_uuid() primary key,
  name        text not null unique,
  description text,
  created_at  timestamptz default now(),
  last_active_at timestamptz default now()
);

alter table thoughts add column if not exists project_id uuid references projects(id);
alter table thoughts add column if not exists archived_at timestamptz;

create index if not exists thoughts_project_id_idx on thoughts (project_id);
create index if not exists thoughts_archived_at_idx on thoughts (archived_at);

004_create_thought_links.sql

create table if not exists thought_links (
  from_id    uuid references thoughts(id) on delete cascade,
  to_id      uuid references thoughts(id) on delete cascade,
  relation   text not null,
  created_at timestamptz default now(),
  primary key (from_id, to_id, relation)
);

create index if not exists thought_links_from_idx on thought_links (from_id);
create index if not exists thought_links_to_idx   on thought_links (to_id);

005_create_match_thoughts.sql

create or replace function match_thoughts(
  query_embedding vector(1536),
  match_threshold float default 0.7,
  match_count int default 10,
  filter jsonb default '{}'::jsonb
)
returns table (
  id uuid,
  content text,
  metadata jsonb,
  similarity float,
  created_at timestamptz
)
language plpgsql
as $$
begin
  return query
  select
    t.id,
    t.content,
    t.metadata,
    1 - (t.embedding <=> query_embedding) as similarity,
    t.created_at
  from thoughts t
  where 1 - (t.embedding <=> query_embedding) > match_threshold
    and (filter = '{}'::jsonb or t.metadata @> filter)
  order by t.embedding <=> query_embedding
  limit match_count;
end;
$$;

006_rls_and_grants.sql

-- Grant full access to the application database user configured in database.url.
-- Replace 'ob1_user' with the actual role name used in your database.url.
grant select, insert, update, delete on table public.thoughts to ob1_user;
grant select, insert, update, delete on table public.projects to ob1_user;
grant select, insert, update, delete on table public.thought_links to ob1_user;

Store layer

Interfaces

type ThoughtStore interface {
    InsertThought(ctx context.Context, thought Thought) error
    GetThought(ctx context.Context, id uuid.UUID) (Thought, error)
    UpdateThought(ctx context.Context, id uuid.UUID, patch ThoughtPatch) (Thought, error)
    DeleteThought(ctx context.Context, id uuid.UUID) error
    ArchiveThought(ctx context.Context, id uuid.UUID) error
    SearchThoughts(ctx context.Context, embedding []float32, threshold float64, limit int, filter map[string]any) ([]SearchResult, error)
    ListThoughts(ctx context.Context, filter ListFilter) ([]Thought, error)
    Stats(ctx context.Context) (ThoughtStats, error)
}

type ProjectStore interface {
    InsertProject(ctx context.Context, project Project) error
    GetProject(ctx context.Context, nameOrID string) (Project, error)
    ListProjects(ctx context.Context) ([]ProjectSummary, error)
    TouchProject(ctx context.Context, id uuid.UUID) error
}

type LinkStore interface {
    InsertLink(ctx context.Context, link ThoughtLink) error
    GetLinks(ctx context.Context, thoughtID uuid.UUID) ([]ThoughtLink, error)
}

DB implementation notes

Use pgxpool.Pool.

On startup:

  • parse DB config
  • create pool
  • register pgvector support
  • ping DB
  • verify required function exists
  • verify vector extension exists

Domain types

type Thought struct {
    ID        uuid.UUID
    Content   string
    Embedding []float32
    Metadata  ThoughtMetadata
    ProjectID *uuid.UUID
    ArchivedAt *time.Time
    CreatedAt time.Time
    UpdatedAt time.Time
}

type SearchResult struct {
    ID         uuid.UUID
    Content    string
    Metadata   ThoughtMetadata
    Similarity float64
    CreatedAt  time.Time
}

type ListFilter struct {
    Limit           int
    Type            string
    Topic           string
    Person          string
    Days            int
    ProjectID       *uuid.UUID
    IncludeArchived bool
}

type ThoughtStats struct {
    TotalCount int
    TypeCounts map[string]int
    TopTopics  []KeyCount
    TopPeople  []KeyCount
}

type ThoughtPatch struct {
    Content  *string
    Metadata *ThoughtMetadata
}

type Project struct {
    ID           uuid.UUID
    Name         string
    Description  string
    CreatedAt    time.Time
    LastActiveAt time.Time
}

type ProjectSummary struct {
    Project
    ThoughtCount int
}

type ThoughtLink struct {
    FromID    uuid.UUID
    ToID      uuid.UUID
    Relation  string
    CreatedAt time.Time
}

Auth design

The reference OB1 implementation uses a configured access key and accepts it via header or query param.

We will keep compatibility but make it cleaner.

Auth behavior

  • primary auth via header, default: x-brain-key
  • optional query param fallback
  • support multiple keys in config
  • attach key ID to request context for auditing

Middleware flow

  1. read configured header
  2. if missing and allowed, read query param
  3. compare against in-memory keyring
  4. if matched, attach key ID to request context
  5. else return 401 Unauthorized

Recommendation

Set:

auth:
  allow_query_param: false

for production.


MCP server design

Expose MCP over Streamable HTTP. The MCP spec defines Streamable HTTP as the remote transport that replaces the older HTTP+SSE approach.

HTTP routes

  • POST /mcp
  • GET /healthz
  • GET /readyz

Middleware stack

  • request ID
  • panic recovery
  • structured logging
  • auth
  • timeout
  • optional CORS

MCP tools

1. capture_thought

Input

  • content string

Flow

  1. validate content

  2. concurrently:

    • call provider Embed
    • call provider ExtractMetadata
  3. normalize metadata

  4. set source = "mcp"

  5. insert into thoughts

  6. return success payload

2. search_thoughts

Input

  • query string
  • limit int
  • threshold float

Flow

  1. embed query
  2. call match_thoughts(...)
  3. format ranked results
  4. return results

3. list_thoughts

Input

  • limit
  • type
  • topic
  • person
  • days

Flow

  1. build SQL filters
  2. query thoughts
  3. order by created_at desc
  4. return summaries

4. thought_stats

Input

  • none

Flow

  1. count rows
  2. aggregate metadata usage
  3. return totals and top buckets

5. get_thought

Input

  • id string

Flow

  1. validate UUID
  2. query thoughts by ID
  3. return full record or not-found error

6. update_thought

Input

  • id string
  • content string (optional)
  • metadata map (optional, merged not replaced)

Flow

  1. validate inputs
  2. if content provided: re-embed and re-extract metadata
  3. merge metadata patch
  4. update row, set updated_at
  5. return updated record

7. delete_thought

Input

  • id string

Flow

  1. validate UUID
  2. hard-delete row (cascades to thought_links)
  3. return confirmation

8. archive_thought

Input

  • id string

Flow

  1. validate UUID
  2. set archived_at = now()
  3. return confirmation

Note: archived thoughts are excluded from search and list results by default unless include_archived: true is passed.


9. create_project

Input

  • name string
  • description string (optional)

Flow

  1. validate name uniqueness
  2. insert into projects
  3. return project record

10. list_projects

Input

  • none

Flow

  1. query projects ordered by last_active_at desc
  2. join thought counts per project
  3. return summaries

11. get_project_context

Input

  • project string (name or ID)
  • query string (optional, semantic focus)
  • limit int

Flow

  1. resolve project
  2. fetch recent thoughts in project (last N)
  3. if query provided: semantic search scoped to project
  4. merge and deduplicate results ranked by recency + similarity
  5. update projects.last_active_at
  6. return context block ready for injection

12. set_active_project

Input

  • project string (name or ID)

Flow

  1. resolve project
  2. store project ID in server session context (in-memory, per connection)
  3. return confirmation

13. get_active_project

Input

  • none

Flow

  1. return current session active project or null

14. summarize_thoughts

Input

  • query string (optional topic focus)
  • project string (optional)
  • days int (optional time window)
  • limit int

Flow

  1. fetch matching thoughts via search or filter
  2. format as context
  3. call AI provider to produce prose summary
  4. return summary text

15. recall_context

Input

  • query string
  • project string (optional)
  • limit int

Flow

  1. semantic search with optional project filter
  2. recency boost: merge with most recent N thoughts from project
  3. deduplicate and rank
  4. return formatted context block suitable for pasting into a new conversation

16. link_thoughts

Input

  • from_id string
  • to_id string
  • relation string (e.g. follows_up, contradicts, references, blocks)

Flow

  1. validate both IDs exist
  2. insert into thought_links
  3. return confirmation

17. related_thoughts

Input

  • id string
  • include_semantic bool (default true)

Flow

  1. fetch explicit links from thought_links for this ID
  2. if include_semantic: also fetch nearest semantic neighbours
  3. merge, deduplicate, return with relation type or similarity score

Tool package plan

internal/tools/capture.go

Responsibilities:

  • input validation
  • parallel embed + metadata extraction
  • normalization
  • write to store

internal/tools/search.go

Responsibilities:

  • input validation
  • embed query
  • vector search
  • output formatting

internal/tools/list.go

Responsibilities:

  • filter normalization
  • DB read
  • output formatting

internal/tools/stats.go

Responsibilities:

  • fetch/aggregate stats
  • output shaping

internal/tools/get.go

Responsibilities:

  • UUID validation
  • single thought retrieval

internal/tools/update.go

Responsibilities:

  • partial content/metadata update
  • conditional re-embed if content changed
  • metadata merge

internal/tools/delete.go

Responsibilities:

  • UUID validation
  • hard delete

internal/tools/archive.go

Responsibilities:

  • UUID validation
  • set archived_at

internal/tools/projects.go

Responsibilities:

  • create_project, list_projects
  • set_active_project, get_active_project (session context)

internal/tools/context.go

Responsibilities:

  • get_project_context: resolve project, combine recency + semantic search, return context block
  • update last_active_at on access

internal/tools/summarize.go

Responsibilities:

  • filter/search thoughts
  • format as prompt context
  • call AI provider for prose summary

internal/tools/recall.go

Responsibilities:

  • recall_context: semantic search + recency boost + project filter
  • output formatted context block

internal/tools/links.go

Responsibilities:

  • link_thoughts: validate both IDs, insert link
  • related_thoughts: fetch explicit links + optional semantic neighbours, merge and return

Startup sequence

  1. parse CLI args
  2. load config file
  3. apply env overrides
  4. validate config
  5. initialize logger
  6. create DB pool
  7. verify DB requirements
  8. create AI provider
  9. create store
  10. create tool handlers
  11. register MCP tools
  12. start HTTP server

Error handling policy

Fail fast on startup errors

  • invalid config
  • DB unavailable
  • missing required API keys
  • invalid MCP config
  • provider initialization failure

Retry policy for provider calls

Retry on:

  • 429
  • 500
  • 502
  • 503
  • timeout
  • connection reset

Do not retry on:

  • malformed request
  • auth failure
  • invalid model name
  • invalid response shape after repeated attempts

Use:

  • exponential backoff
  • capped retries
  • context-aware cancellation

Observability

Logging

Use log/slog in JSON mode.

Include:

  • request ID
  • route
  • tool name
  • key ID
  • provider name
  • DB latency
  • upstream latency
  • error class

Metrics

Track:

  • request count by tool
  • request duration
  • provider call duration
  • DB query duration
  • auth failures
  • provider failures
  • insert/search counts

Health checks

/healthz

Returns OK if process is running.

/readyz

Returns OK only if:

  • DB is reachable
  • provider config is valid
  • optional provider probe passes

Security plan

Secrets handling

  • keep secrets in config files only for local/dev use
  • never commit real secrets
  • use mounted secret files or env overrides in production

API key policy

  • support multiple keys
  • identify keys by ID
  • allow key rotation by config update + restart
  • log only key ID, never raw value

Transport

  • run behind TLS terminator in production
  • disable query-param auth in production
  • avoid logging full URLs when query-param auth is enabled

Testing plan

Unit tests

Config

  • valid config loads
  • invalid config fails
  • env overrides apply correctly

Auth

  • header auth success
  • query auth success
  • invalid key rejected
  • disabled query auth rejected

Metadata

  • normalization works
  • invalid types default correctly
  • empty metadata falls back safely

AI provider parsing

  • LiteLLM embeddings parse correctly
  • LiteLLM chat completions parse correctly
  • provider errors classified correctly

Store

  • filter builders generate expected SQL fragments
  • JSONB metadata handling is stable

Integration tests

Run against local Postgres with pgvector.

Test:

  • migrations apply cleanly
  • insert thought
  • search thought
  • list thoughts with filters
  • stats aggregation
  • auth-protected MCP route
  • LiteLLM mock/proxy compatibility

Manual acceptance tests

  1. start local Postgres + pgvector
  2. start LiteLLM
  3. configure LiteLLM to route embeddings to OpenRouter
  4. start Go server
  5. connect MCP client
  6. call capture_thought
  7. call search_thoughts
  8. call list_thoughts
  9. call thought_stats
  10. rotate API key and verify restart behavior

Milestones

Milestone 1 — foundation

Deliver:

  • repo skeleton
  • config loader
  • config validation
  • logger
  • DB connection
  • migrations

Exit criteria:

  • app starts
  • DB connection verified
  • config-driven startup works

Milestone 2 — AI provider layer

Deliver:

  • provider interface
  • LiteLLM implementation
  • OpenRouter fallback implementation
  • metadata prompt
  • normalization

Exit criteria:

  • successful embedding call through LiteLLM
  • successful metadata extraction through LiteLLM
  • vector length validation works

Milestone 3 — core capture and search

Deliver:

  • capture_thought
  • search_thoughts
  • store methods for insert and vector search

Exit criteria:

  • thoughts can be captured end-to-end
  • semantic search returns results

Milestone 4 — remaining tools

Deliver:

  • list_thoughts
  • thought_stats

Exit criteria:

  • all four tools function through MCP

Milestone 5 — extended memory and project tools

Deliver:

  • get_thought, update_thought, delete_thought, archive_thought
  • create_project, list_projects, set_active_project, get_active_project
  • get_project_context, recall_context
  • summarize_thoughts
  • link_thoughts, related_thoughts
  • migrations 003 and 004 (projects + links tables)

Exit criteria:

  • thoughts can be retrieved, patched, deleted, archived
  • projects can be created and listed
  • get_project_context returns a usable context block
  • summarize_thoughts produces a prose summary via the AI provider
  • thought links can be created and retrieved with semantic neighbours

Milestone 6 — HTTP and auth hardening

Deliver:

  • auth middleware
  • health endpoints
  • structured logs
  • retries
  • timeouts

Exit criteria:

  • endpoint protected
  • logs useful
  • service stable under expected failures

Milestone 7 — production readiness

Deliver:

  • metrics
  • readiness checks
  • key rotation workflow
  • deployment docs

Exit criteria:

  • production deployment is straightforward
  • operational playbook exists

Implementation order

Build in this order:

  1. config
  2. DB + migrations (001, 002, 003, 004)
  3. LiteLLM client
  4. metadata normalization
  5. capture_thought
  6. search_thoughts
  7. MCP HTTP server
  8. auth middleware
  9. list_thoughts
  10. thought_stats
  11. get_thought, update_thought, delete_thought, archive_thought
  12. create_project, list_projects, set_active_project, get_active_project
  13. get_project_context, recall_context
  14. summarize_thoughts
  15. link_thoughts, related_thoughts
  16. logs/metrics/health

This gives usable value early and builds the project/memory layer on a solid foundation.


Deployment

Services

  • Postgres with pgvector
  • LiteLLM proxy
  • optional OpenRouter upstream
  • Go service

Example shape

docker-compose:
  postgres
  litellm
  ob1-go

Preferred architecture

  • Go service on Fly.io / Cloud Run / Render
  • LiteLLM as separate service
  • Postgres managed externally
  • TLS terminator in front

Why not Edge Functions

The original repo uses Deno Edge Functions because of its chosen deployment environment, but this app's behavior is better served by a normal long-running Go service, which is easier to maintain and observe.


Risks and decisions

Risk: embedding dimension mismatch

Mitigation:

  • validate config vs DB on startup

Risk: LiteLLM model alias drift

Mitigation:

  • add readiness probe for configured models

Risk: metadata extraction instability

Mitigation:

  • strong normalization + safe defaults

Risk: single global auth model

Mitigation:

  • acceptable for v1
  • redesign for multi-tenant later

Risk: stats scaling poorly

Mitigation:

  • start with in-memory aggregation
  • move to SQL aggregation if needed

Definition of done for v1

The project is done when:

  • service starts from YAML config
  • LiteLLM is the primary AI provider
  • OpenRouter can be used behind LiteLLM
  • direct OpenRouter mode still works
  • MCP endpoint is authenticated
  • capture_thought stores content, embedding, metadata
  • search_thoughts performs semantic search
  • list_thoughts supports filtering
  • thought_stats returns useful summaries
  • thoughts can be retrieved, updated, deleted, and archived
  • projects can be created, listed, and used to scope captures and searches
  • get_project_context returns a ready-to-use context block
  • recall_context returns a semantically relevant + recent context block
  • summarize_thoughts produces prose summaries via the AI provider
  • thought links can be created and traversed with semantic fallback
  • logs and health checks exist
  • key rotation works via config + restart

Recommendation

Build this as a boring Go service:

  • stdlib HTTP
  • thin MCP server layer
  • thin provider layer
  • thin store layer
  • YAML config
  • explicit interfaces

Do not over-abstract it. The product shape is simple. The goal is a reliable, understandable service with LiteLLM as the stable provider boundary.


Next implementation artifact

The next concrete deliverable should be a starter Go repo skeleton containing:

  • go.mod
  • folder structure
  • main.go
  • config loader
  • example config
  • migration files (001-006)
  • provider interface
  • LiteLLM client skeleton
  • store interfaces (ThoughtStore, ProjectStore, LinkStore)
  • domain types including Project, ThoughtLink, ThoughtPatch
  • MCP tool registration stubs for all 17 tools