diff --git a/README.md b/README.md index 290370e..e8f488c 100644 --- a/README.md +++ b/README.md @@ -1,3 +1,52 @@ -# amcs +# Avalon Memory Crystal Server (amcs) -Avalon Memory Crystal Server \ No newline at end of file +![Avalon Memory Crystal](assets/avelonmemorycrystal.jpg) + +A Go MCP server for capturing and retrieving thoughts, memory, and project context. Exposes tools over Streamable HTTP, backed by Postgres with pgvector for semantic search. + +## What it does + +- **Capture** thoughts with automatic embedding and metadata extraction +- **Search** thoughts semantically via vector similarity +- **Organise** thoughts into projects and retrieve full project context +- **Summarise** and recall memory across topics and time windows +- **Link** related thoughts and traverse relationships + +## Stack + +- Go — MCP server over Streamable HTTP +- Postgres + pgvector — storage and vector search +- LiteLLM — primary AI provider (embeddings + metadata extraction) +- OpenRouter — default upstream behind LiteLLM + +## Tools + +| Tool | Purpose | +|---|---| +| `capture_thought` | Store a thought with embedding and metadata | +| `search_thoughts` | Semantic similarity search | +| `list_thoughts` | Filter thoughts by type, topic, person, date | +| `thought_stats` | Counts and top topics/people | +| `get_thought` | Retrieve a thought by ID | +| `update_thought` | Patch content or metadata | +| `delete_thought` | Hard delete | +| `archive_thought` | Soft delete | +| `create_project` | Register a named project | +| `list_projects` | List projects with thought counts | +| `get_project_context` | Recent + semantic context for a project | +| `set_active_project` | Set session project scope | +| `get_active_project` | Get current session project | +| `summarize_thoughts` | LLM prose summary over a filtered set | +| `recall_context` | Semantic + recency context block for injection | +| `link_thoughts` | Create a typed relationship between thoughts | +| `related_thoughts` | Explicit 
links + semantic neighbours |
+
+## Configuration
+
+Config is YAML-driven. Copy `configs/config.example.yaml` and set:
+
+- `database.url` — Postgres connection string
+- `auth.keys` — API keys for MCP endpoint access
+- `ai.litellm.base_url` and `ai.litellm.api_key` — LiteLLM proxy
+
+See `llm/plan.md` for full architecture and implementation plan.
diff --git a/assets/avelonmemorycrystal.jpg b/assets/avelonmemorycrystal.jpg
new file mode 100644
index 0000000..f178f6a
Binary files /dev/null and b/assets/avelonmemorycrystal.jpg differ
diff --git a/llm/plan.md b/llm/plan.md
new file mode 100644
index 0000000..8f8c942
--- /dev/null
+++ b/llm/plan.md
@@ -0,0 +1,1809 @@
+# Avalon Memory Crystal Server (amcs)
+## OB1 in Go — LiteLLM-First Implementation Plan
+Based on the Open Brain project. Reference it for details: https://github.com/NateBJones-Projects/OB1
+
+## Objective
+
+Build a Go implementation of the OB1 project with:
+
+* **LiteLLM as the primary AI provider**
+* **OpenRouter as the default upstream behind LiteLLM**
+* **config-file-based keys and auth tokens**
+* **MCP over Streamable HTTP**
+* **Postgres with pgvector**
+* parity with the current OB1 toolset:
+
+  * `search_thoughts`
+  * `list_thoughts`
+  * `thought_stats`
+  * `capture_thought`
+
+* extended toolset for memory and project management:
+
+  * `get_thought`
+  * `update_thought`
+  * `delete_thought`
+  * `archive_thought`
+  * `create_project`
+  * `list_projects`
+  * `get_project_context`
+  * `set_active_project`
+  * `get_active_project`
+  * `summarize_thoughts`
+  * `recall_context`
+  * `link_thoughts`
+  * `related_thoughts`
+
+The current OB1 reference implementation is a small MCP server backed by a `thoughts` table in Postgres, a `match_thoughts(...)` vector-search function, and OpenRouter calls for embeddings plus metadata extraction.
+ +--- + +## Why LiteLLM should be the primary provider + +LiteLLM is the right primary abstraction because it gives one stable OpenAI-compatible API surface while allowing routing to multiple upstream providers, including OpenRouter. LiteLLM documents OpenAI-compatible proxy endpoints, including `/v1/embeddings`, and explicitly supports OpenRouter-backed models. + +That gives us: + +* one provider contract in the Go app +* centralized key management +* easier model swaps +* support for multiple upstreams later +* simpler production operations + +### Provider strategy + +**Primary runtime mode** + +* App -> LiteLLM Proxy -> OpenRouter / other providers + +**Fallback mode** + +* App -> OpenRouter directly + +We will support both, but the codebase will be designed **LiteLLM-first**. + +--- + +## Scope of the Go service + +The Go service will provide: + +1. **MCP server over Streamable HTTP** +2. **API-key authentication** +3. **thought capture** +4. **semantic search** +5. **thought listing** +6. **thought statistics** +7. **thought lifecycle management** (get, update, delete, archive) +8. **project grouping and context** (create, list, get context, active project) +9. **memory summarization and context recall** +10. **thought linking and relationship traversal** +11. **provider abstraction for embeddings + metadata extraction** +12. 
**config-file-driven startup** + +--- + +## Functional requirements + +### Required v1 features + +* start from YAML config +* connect to Postgres +* use pgvector for embeddings +* call LiteLLM for: + + * embeddings + * metadata extraction +* expose MCP tools over HTTP +* protect the MCP endpoint with configured API keys +* preserve OB1-compatible tool semantics +* store thoughts with metadata and embeddings +* search via `match_thoughts(...)` + +### Deferred features + +* Slack ingestion +* webhook ingestion +* async metadata extraction +* per-user tenancy +* admin UI +* background enrichment jobs +* multi-provider routing policy inside the app + +--- + +## Reference behavior to preserve + +The current OB1 server is small and direct: + +* it stores thoughts in a `thoughts` table +* it uses vector similarity for semantic search +* it exposes four MCP tools +* it uses an access key for auth +* it generates embeddings and extracts metadata via OpenRouter. + +The Go version should preserve those behaviors first, then improve structure and operability. + +--- + +## Architecture + +```text + +----------------------+ + | MCP Client / AI App | + +----------+-----------+ + | + | Streamable HTTP + v + +----------------------+ + | Go OB1 Server | + | auth + MCP tools | + +----+-----------+-----+ + | | + | | + v v + +----------------+ +----------------------+ + | LiteLLM Proxy | | Postgres + pgvector | + | embeddings | | thoughts + pgvector | + | metadata | | RPC/search SQL | + +--------+-------+ +----------------------+ + | + v + +-------------+ + | OpenRouter | + | or others | + +-------------+ +``` + +--- + +## High-level design + +### Core components + +1. **Config subsystem** + + * load YAML + * apply env overrides + * validate required fields + +2. **Auth subsystem** + + * API-key validation + * header-based auth + * optional query-param auth + +3. 
**AI provider subsystem** + + * provider interface + * LiteLLM implementation + * optional OpenRouter direct implementation + +4. **Store subsystem** + + * Postgres connection pool + * insert/search/list/stats operations + * pgvector support + +5. **MCP subsystem** + + * MCP server + * tool registration + * HTTP transport + +6. **Observability subsystem** + + * structured logs + * metrics + * health checks + +--- + +## Project layout + +```text +ob1-go/ + cmd/ + ob1-server/ + main.go + + internal/ + app/ + app.go + + config/ + config.go + loader.go + validate.go + + auth/ + middleware.go + keyring.go + + ai/ + provider.go + factory.go + prompts.go + types.go + + litellm/ + client.go + embeddings.go + metadata.go + + openrouter/ + client.go + embeddings.go + metadata.go + + mcpserver/ + server.go + transport.go + + tools/ + search.go + list.go + stats.go + capture.go + get.go + update.go + delete.go + archive.go + projects.go + context.go + summarize.go + recall.go + links.go + + store/ + db.go + thoughts.go + stats.go + projects.go + links.go + + metadata/ + schema.go + normalize.go + validate.go + + types/ + thought.go + filters.go + + observability/ + logger.go + metrics.go + tracing.go + + migrations/ + 001_enable_vector.sql + 002_create_thoughts.sql + 003_add_projects.sql + 004_create_thought_links.sql + 005_create_match_thoughts.sql + 006_rls_and_grants.sql + + configs/ + config.example.yaml + dev.yaml + + scripts/ + run-local.sh + migrate.sh + + go.mod + README.md +``` + +--- + +## Dependencies + +### Required Go packages + +* `github.com/modelcontextprotocol/go-sdk` +* `github.com/jackc/pgx/v5` +* `github.com/pgvector/pgvector-go` +* `gopkg.in/yaml.v3` +* `github.com/go-playground/validator/v10` +* `github.com/google/uuid` + +### Standard library usage + +* `net/http` +* `context` +* `log/slog` +* `time` +* `encoding/json` + +The Go MCP SDK is the right fit for implementing an MCP server in Go, and `pgvector-go` is the expected library for Go integration 
with pgvector-backed Postgres columns. + +--- + +## Config model + +Config files are the primary source of truth. + +### Rules + +* use **YAML config files** +* allow **environment overrides** +* do **not** commit real secrets +* commit only `config.example.yaml` +* keep local secrets in ignored files +* in production, mount config files as secrets or use env overrides for sensitive values + +--- + +## Example config + +```yaml +server: + host: "0.0.0.0" + port: 8080 + read_timeout: "15s" + write_timeout: "30s" + idle_timeout: "60s" + allowed_origins: + - "*" + +mcp: + path: "/mcp" + server_name: "open-brain" + version: "1.0.0" + transport: "streamable_http" + +auth: + mode: "api_keys" + header_name: "x-brain-key" + query_param: "key" + allow_query_param: false + keys: + - id: "local-client" + value: "replace-me" + description: "main local client key" + +database: + url: "postgres://user:pass@localhost:5432/ob1?sslmode=disable" + max_conns: 10 + min_conns: 2 + max_conn_lifetime: "30m" + max_conn_idle_time: "10m" + +ai: + provider: "litellm" + + embeddings: + model: "openai/text-embedding-3-small" + dimensions: 1536 + + metadata: + model: "gpt-4o-mini" + temperature: 0.1 + + litellm: + base_url: "http://localhost:4000/v1" + api_key: "replace-me" + use_responses_api: false + request_headers: {} + embedding_model: "openrouter/openai/text-embedding-3-small" + metadata_model: "gpt-4o-mini" + + openrouter: + base_url: "https://openrouter.ai/api/v1" + api_key: "" + app_name: "ob1-go" + site_url: "" + extra_headers: {} + +capture: + source: "mcp" + metadata_defaults: + type: "observation" + topic_fallback: "uncategorized" + +search: + default_limit: 10 + default_threshold: 0.5 + max_limit: 50 + +logging: + level: "info" + format: "json" + +observability: + metrics_enabled: true + pprof_enabled: false +``` + +--- + +## Config structs + +```go +type Config struct { + Server ServerConfig `yaml:"server"` + MCP MCPConfig `yaml:"mcp"` + Auth AuthConfig `yaml:"auth"` + Database 
DatabaseConfig `yaml:"database"` + AI AIConfig `yaml:"ai"` + Capture CaptureConfig `yaml:"capture"` + Search SearchConfig `yaml:"search"` + Logging LoggingConfig `yaml:"logging"` + Observability ObservabilityConfig `yaml:"observability"` +} + +type ServerConfig struct { + Host string `yaml:"host"` + Port int `yaml:"port"` + ReadTimeout time.Duration `yaml:"read_timeout"` + WriteTimeout time.Duration `yaml:"write_timeout"` + IdleTimeout time.Duration `yaml:"idle_timeout"` + AllowedOrigins []string `yaml:"allowed_origins"` +} + +type MCPConfig struct { + Path string `yaml:"path"` + ServerName string `yaml:"server_name"` + Version string `yaml:"version"` + Transport string `yaml:"transport"` +} + +type AuthConfig struct { + Mode string `yaml:"mode"` + HeaderName string `yaml:"header_name"` + QueryParam string `yaml:"query_param"` + AllowQueryParam bool `yaml:"allow_query_param"` + Keys []APIKey `yaml:"keys"` +} + +type APIKey struct { + ID string `yaml:"id"` + Value string `yaml:"value"` + Description string `yaml:"description"` +} + +type DatabaseConfig struct { + URL string `yaml:"url"` + MaxConns int32 `yaml:"max_conns"` + MinConns int32 `yaml:"min_conns"` + MaxConnLifetime time.Duration `yaml:"max_conn_lifetime"` + MaxConnIdleTime time.Duration `yaml:"max_conn_idle_time"` +} + +type AIConfig struct { + Provider string `yaml:"provider"` // litellm | openrouter + Embeddings AIEmbeddingConfig `yaml:"embeddings"` + Metadata AIMetadataConfig `yaml:"metadata"` + LiteLLM LiteLLMConfig `yaml:"litellm"` + OpenRouter OpenRouterAIConfig `yaml:"openrouter"` +} + +type AIEmbeddingConfig struct { + Model string `yaml:"model"` + Dimensions int `yaml:"dimensions"` +} + +type AIMetadataConfig struct { + Model string `yaml:"model"` + Temperature float64 `yaml:"temperature"` +} + +type LiteLLMConfig struct { + BaseURL string `yaml:"base_url"` + APIKey string `yaml:"api_key"` + UseResponsesAPI bool `yaml:"use_responses_api"` + RequestHeaders map[string]string `yaml:"request_headers"` 
+ EmbeddingModel string `yaml:"embedding_model"` + MetadataModel string `yaml:"metadata_model"` +} + +type OpenRouterAIConfig struct { + BaseURL string `yaml:"base_url"` + APIKey string `yaml:"api_key"` + AppName string `yaml:"app_name"` + SiteURL string `yaml:"site_url"` + ExtraHeaders map[string]string `yaml:"extra_headers"` +} +``` + +--- + +## Config precedence + +### Order + +1. `--config /path/to/file.yaml` +2. `OB1_CONFIG` +3. default `./configs/dev.yaml` +4. environment overrides for specific fields + +### Suggested env overrides + +* `OB1_DATABASE_URL` +* `OB1_LITELLM_API_KEY` +* `OB1_OPENROUTER_API_KEY` +* `OB1_SERVER_PORT` + +--- + +## Validation rules + +At startup, fail fast if: + +* `database.url` is empty +* `auth.keys` is empty +* `mcp.path` is empty +* `ai.provider` is unsupported +* `ai.embeddings.dimensions <= 0` +* provider-specific base URL or API key is missing +* the DB vector dimension does not match configured embedding dimensions + +--- + +## AI provider design + +### Provider interface + +```go +type Provider interface { + Embed(ctx context.Context, input string) ([]float32, error) + ExtractMetadata(ctx context.Context, input string) (ThoughtMetadata, error) + Name() string +} +``` + +### Factory + +```go +func NewProvider(cfg AIConfig, httpClient *http.Client, log *slog.Logger) (Provider, error) { + switch cfg.Provider { + case "litellm": + return litellm.New(cfg, httpClient, log) + case "openrouter": + return openrouter.New(cfg, httpClient, log) + default: + return nil, fmt.Errorf("unsupported ai.provider: %s", cfg.Provider) + } +} +``` + +--- + +## LiteLLM-first behavior + +### Embeddings + +The app will call LiteLLM at: + +* `POST /v1/embeddings` + +using an OpenAI-compatible request payload and Bearer auth. LiteLLM documents its proxy embeddings support through OpenAI-compatible endpoints. 
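+A stdlib-only sketch of that call, to pin down the request/response shape (field names follow the OpenAI embeddings contract; the `httptest` server stands in for a running LiteLLM proxy, and error handling is trimmed):
+
+```go
+package main
+
+import (
+	"bytes"
+	"context"
+	"encoding/json"
+	"fmt"
+	"net/http"
+	"net/http/httptest"
+)
+
+// embedRequest/embedResponse mirror the OpenAI-compatible
+// /v1/embeddings payload that LiteLLM's proxy accepts.
+type embedRequest struct {
+	Model string `json:"model"`
+	Input string `json:"input"`
+}
+
+type embedResponse struct {
+	Data []struct {
+		Embedding []float32 `json:"embedding"`
+	} `json:"data"`
+}
+
+// Embed posts the input to {baseURL}/embeddings with Bearer auth and
+// returns the first embedding vector.
+func Embed(ctx context.Context, baseURL, apiKey, model, input string) ([]float32, error) {
+	body, err := json.Marshal(embedRequest{Model: model, Input: input})
+	if err != nil {
+		return nil, err
+	}
+	req, err := http.NewRequestWithContext(ctx, http.MethodPost, baseURL+"/embeddings", bytes.NewReader(body))
+	if err != nil {
+		return nil, err
+	}
+	req.Header.Set("Authorization", "Bearer "+apiKey)
+	req.Header.Set("Content-Type", "application/json")
+	resp, err := http.DefaultClient.Do(req)
+	if err != nil {
+		return nil, err
+	}
+	defer resp.Body.Close()
+	if resp.StatusCode != http.StatusOK {
+		return nil, fmt.Errorf("embeddings: unexpected status %d", resp.StatusCode)
+	}
+	var out embedResponse
+	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
+		return nil, err
+	}
+	if len(out.Data) == 0 {
+		return nil, fmt.Errorf("embeddings: empty data")
+	}
+	return out.Data[0].Embedding, nil
+}
+
+func main() {
+	// Fake LiteLLM proxy so the sketch runs standalone.
+	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
+		json.NewEncoder(w).Encode(map[string]any{
+			"data": []map[string]any{{"embedding": []float32{0.1, 0.2, 0.3}}},
+		})
+	}))
+	defer srv.Close()
+
+	vec, err := Embed(context.Background(), srv.URL, "replace-me", "openrouter/openai/text-embedding-3-small", "hello")
+	if err != nil {
+		panic(err)
+	}
+	fmt.Println(len(vec))
+}
+```
+
+The real client would additionally check the returned vector length against `ai.embeddings.dimensions` before writing to the `vector(1536)` column.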
+ +### Metadata extraction + +The app will call LiteLLM at: + +* `POST /v1/chat/completions` + +using: + +* configured metadata model +* system prompt +* user message +* JSON-oriented response handling + +LiteLLM’s proxy is intended to accept OpenAI-style chat completion requests. + +### Model routing + +In config, use: + +* `litellm.embedding_model` +* `litellm.metadata_model` + +These may be: + +* direct model names +* LiteLLM aliases +* OpenRouter-backed model identifiers + +Example: + +```yaml +litellm: + embedding_model: "openrouter/openai/text-embedding-3-small" + metadata_model: "gpt-4o-mini" +``` + +LiteLLM documents OpenRouter provider usage and OpenRouter-backed model naming. + +--- + +## OpenRouter fallback mode + +If `ai.provider: openrouter`, the app will directly call: + +* `POST /api/v1/embeddings` +* `POST /api/v1/chat/completions` + +with Bearer auth. + +OpenRouter documents the embeddings endpoint and its authentication model. + +This mode is mainly for: + +* local simplicity +* debugging provider issues +* deployments without LiteLLM + +--- + +## Metadata schema + +Use one stable metadata schema regardless of provider. 
+ +```go +type ThoughtMetadata struct { + People []string `json:"people"` + ActionItems []string `json:"action_items"` + DatesMentioned []string `json:"dates_mentioned"` + Topics []string `json:"topics"` + Type string `json:"type"` + Source string `json:"source"` +} +``` + +### Accepted type values + +* `observation` +* `task` +* `idea` +* `reference` +* `person_note` + +### Normalization rules + +* trim all strings +* deduplicate arrays +* drop empty values +* cap topics count if needed +* default invalid `type` to `observation` +* set `source: "mcp"` for MCP-captured thoughts + +### Fallback defaults + +If metadata extraction fails: + +```json +{ + "people": [], + "action_items": [], + "dates_mentioned": [], + "topics": ["uncategorized"], + "type": "observation", + "source": "mcp" +} +``` + +--- + +## Database design + +The DB contract should match the current OB1 structure as closely as possible: + +* `thoughts` table +* `embedding vector(1536)` +* HNSW index +* metadata JSONB +* `match_thoughts(...)` function + +--- + +## Migrations + +### `001_enable_vector.sql` + +```sql +create extension if not exists vector; +``` + +### `002_create_thoughts.sql` + +```sql +create table if not exists thoughts ( + id uuid default gen_random_uuid() primary key, + content text not null, + embedding vector(1536), + metadata jsonb default '{}'::jsonb, + created_at timestamptz default now(), + updated_at timestamptz default now() +); + +create index if not exists thoughts_embedding_hnsw_idx + on thoughts using hnsw (embedding vector_cosine_ops); + +create index if not exists thoughts_metadata_gin_idx + on thoughts using gin (metadata); + +create index if not exists thoughts_created_at_idx + on thoughts (created_at desc); +``` + +### `003_add_projects.sql` + +```sql +create table if not exists projects ( + id uuid default gen_random_uuid() primary key, + name text not null unique, + description text, + created_at timestamptz default now(), + last_active_at timestamptz default now() 
+);
+
+alter table thoughts add column if not exists project_id uuid references projects(id);
+alter table thoughts add column if not exists archived_at timestamptz;
+
+create index if not exists thoughts_project_id_idx on thoughts (project_id);
+create index if not exists thoughts_archived_at_idx on thoughts (archived_at);
+```
+
+### `004_create_thought_links.sql`
+
+```sql
+create table if not exists thought_links (
+  from_id uuid references thoughts(id) on delete cascade,
+  to_id uuid references thoughts(id) on delete cascade,
+  relation text not null,
+  created_at timestamptz default now(),
+  primary key (from_id, to_id, relation)
+);
+
+create index if not exists thought_links_from_idx on thought_links (from_id);
+create index if not exists thought_links_to_idx on thought_links (to_id);
+```
+
+### `005_create_match_thoughts.sql`
+
+```sql
+create or replace function match_thoughts(
+  query_embedding vector(1536),
+  match_threshold float default 0.7,
+  match_count int default 10,
+  filter jsonb default '{}'::jsonb
+)
+returns table (
+  id uuid,
+  content text,
+  metadata jsonb,
+  similarity float,
+  created_at timestamptz
+)
+language plpgsql
+as $$
+begin
+  return query
+  select
+    t.id,
+    t.content,
+    t.metadata,
+    1 - (t.embedding <=> query_embedding) as similarity,
+    t.created_at
+  from thoughts t
+  where t.archived_at is null
+    and 1 - (t.embedding <=> query_embedding) > match_threshold
+    and (filter = '{}'::jsonb or t.metadata @> filter)
+  order by t.embedding <=> query_embedding
+  limit match_count;
+end;
+$$;
+```
+
+### `006_rls_and_grants.sql`
+
+```sql
+-- Grant full access to the application database user configured in database.url.
+-- Replace 'ob1_user' with the actual role name used in your database.url.
+grant select, insert, update, delete on table public.thoughts to ob1_user; +grant select, insert, update, delete on table public.projects to ob1_user; +grant select, insert, update, delete on table public.thought_links to ob1_user; +``` + +--- + +## Store layer + +### Interfaces + +```go +type ThoughtStore interface { + InsertThought(ctx context.Context, thought Thought) error + GetThought(ctx context.Context, id uuid.UUID) (Thought, error) + UpdateThought(ctx context.Context, id uuid.UUID, patch ThoughtPatch) (Thought, error) + DeleteThought(ctx context.Context, id uuid.UUID) error + ArchiveThought(ctx context.Context, id uuid.UUID) error + SearchThoughts(ctx context.Context, embedding []float32, threshold float64, limit int, filter map[string]any) ([]SearchResult, error) + ListThoughts(ctx context.Context, filter ListFilter) ([]Thought, error) + Stats(ctx context.Context) (ThoughtStats, error) +} + +type ProjectStore interface { + InsertProject(ctx context.Context, project Project) error + GetProject(ctx context.Context, nameOrID string) (Project, error) + ListProjects(ctx context.Context) ([]ProjectSummary, error) + TouchProject(ctx context.Context, id uuid.UUID) error +} + +type LinkStore interface { + InsertLink(ctx context.Context, link ThoughtLink) error + GetLinks(ctx context.Context, thoughtID uuid.UUID) ([]ThoughtLink, error) +} +``` + +### DB implementation notes + +Use `pgxpool.Pool`. 
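+
+For the vector search path, `SearchThoughts` can reduce to a single call to the `match_thoughts` function from migration 005. A sketch of the query the store might issue, where `$1` would be bound with `pgvector.NewVector(embedding)` from `pgvector-go`:
+
+```sql
+select id, content, metadata, similarity, created_at
+from match_thoughts($1, $2, $3, $4::jsonb);
+```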
+ +On startup: + +* parse DB config +* create pool +* register pgvector support +* ping DB +* verify required function exists +* verify vector extension exists + +--- + +## Domain types + +```go +type Thought struct { + ID uuid.UUID + Content string + Embedding []float32 + Metadata ThoughtMetadata + ProjectID *uuid.UUID + ArchivedAt *time.Time + CreatedAt time.Time + UpdatedAt time.Time +} + +type SearchResult struct { + ID uuid.UUID + Content string + Metadata ThoughtMetadata + Similarity float64 + CreatedAt time.Time +} + +type ListFilter struct { + Limit int + Type string + Topic string + Person string + Days int + ProjectID *uuid.UUID + IncludeArchived bool +} + +type ThoughtStats struct { + TotalCount int + TypeCounts map[string]int + TopTopics []KeyCount + TopPeople []KeyCount +} + +type ThoughtPatch struct { + Content *string + Metadata *ThoughtMetadata +} + +type Project struct { + ID uuid.UUID + Name string + Description string + CreatedAt time.Time + LastActiveAt time.Time +} + +type ProjectSummary struct { + Project + ThoughtCount int +} + +type ThoughtLink struct { + FromID uuid.UUID + ToID uuid.UUID + Relation string + CreatedAt time.Time +} +``` + +--- + +## Auth design + +The reference OB1 implementation uses a configured access key and accepts it via header or query param. + +We will keep compatibility but make it cleaner. + +### Auth behavior + +* primary auth via header, default: `x-brain-key` +* optional query param fallback +* support multiple keys in config +* attach key ID to request context for auditing + +### Middleware flow + +1. read configured header +2. if missing and allowed, read query param +3. compare against in-memory keyring +4. if matched, attach key ID to request context +5. else return `401 Unauthorized` + +### Recommendation + +Set: + +```yaml +auth: + allow_query_param: false +``` + +for production. + +--- + +## MCP server design + +Expose MCP over Streamable HTTP. 
MCP’s spec defines Streamable HTTP as the remote transport replacing the older HTTP+SSE approach. + +### HTTP routes + +* `POST /mcp` +* `GET /healthz` +* `GET /readyz` + +### Middleware stack + +* request ID +* panic recovery +* structured logging +* auth +* timeout +* optional CORS + +--- + +## MCP tools + +### 1. `capture_thought` + +**Input** + +* `content string` + +**Flow** + +1. validate content +2. concurrently: + + * call provider `Embed` + * call provider `ExtractMetadata` +3. normalize metadata +4. set `source = "mcp"` +5. insert into `thoughts` +6. return success payload + +### 2. `search_thoughts` + +**Input** + +* `query string` +* `limit int` +* `threshold float` + +**Flow** + +1. embed query +2. call `match_thoughts(...)` +3. format ranked results +4. return results + +### 3. `list_thoughts` + +**Input** + +* `limit` +* `type` +* `topic` +* `person` +* `days` + +**Flow** + +1. build SQL filters +2. query `thoughts` +3. order by `created_at desc` +4. return summaries + +### 4. `thought_stats` + +**Input** + +* none + +**Flow** + +1. count rows +2. aggregate metadata usage +3. return totals and top buckets + +--- + +### 5. `get_thought` + +**Input** + +* `id string` + +**Flow** + +1. validate UUID +2. query `thoughts` by ID +3. return full record or not-found error + +--- + +### 6. `update_thought` + +**Input** + +* `id string` +* `content string` (optional) +* `metadata map` (optional, merged not replaced) + +**Flow** + +1. validate inputs +2. if content provided: re-embed and re-extract metadata +3. merge metadata patch +4. update row, set `updated_at` +5. return updated record + +--- + +### 7. `delete_thought` + +**Input** + +* `id string` + +**Flow** + +1. validate UUID +2. hard-delete row (cascades to `thought_links`) +3. return confirmation + +--- + +### 8. `archive_thought` + +**Input** + +* `id string` + +**Flow** + +1. validate UUID +2. set `archived_at = now()` +3. 
return confirmation + +Note: archived thoughts are excluded from search and list results by default unless `include_archived: true` is passed. + +--- + +### 9. `create_project` + +**Input** + +* `name string` +* `description string` (optional) + +**Flow** + +1. validate name uniqueness +2. insert into `projects` +3. return project record + +--- + +### 10. `list_projects` + +**Input** + +* none + +**Flow** + +1. query `projects` ordered by `last_active_at desc` +2. join thought counts per project +3. return summaries + +--- + +### 11. `get_project_context` + +**Input** + +* `project string` (name or ID) +* `query string` (optional, semantic focus) +* `limit int` + +**Flow** + +1. resolve project +2. fetch recent thoughts in project (last N) +3. if query provided: semantic search scoped to project +4. merge and deduplicate results ranked by recency + similarity +5. update `projects.last_active_at` +6. return context block ready for injection + +--- + +### 12. `set_active_project` + +**Input** + +* `project string` (name or ID) + +**Flow** + +1. resolve project +2. store project ID in server session context (in-memory, per connection) +3. return confirmation + +--- + +### 13. `get_active_project` + +**Input** + +* none + +**Flow** + +1. return current session active project or null + +--- + +### 14. `summarize_thoughts` + +**Input** + +* `query string` (optional topic focus) +* `project string` (optional) +* `days int` (optional time window) +* `limit int` + +**Flow** + +1. fetch matching thoughts via search or filter +2. format as context +3. call AI provider to produce prose summary +4. return summary text + +--- + +### 15. `recall_context` + +**Input** + +* `query string` +* `project string` (optional) +* `limit int` + +**Flow** + +1. semantic search with optional project filter +2. recency boost: merge with most recent N thoughts from project +3. deduplicate and rank +4. return formatted context block suitable for pasting into a new conversation + +--- + +### 16. 
`link_thoughts` + +**Input** + +* `from_id string` +* `to_id string` +* `relation string` (e.g. `follows_up`, `contradicts`, `references`, `blocks`) + +**Flow** + +1. validate both IDs exist +2. insert into `thought_links` +3. return confirmation + +--- + +### 17. `related_thoughts` + +**Input** + +* `id string` +* `include_semantic bool` (default true) + +**Flow** + +1. fetch explicit links from `thought_links` for this ID +2. if `include_semantic`: also fetch nearest semantic neighbours +3. merge, deduplicate, return with relation type or similarity score + +--- + +## Tool package plan + +### `internal/tools/capture.go` + +Responsibilities: + +* input validation +* parallel embed + metadata extraction +* normalization +* write to store + +### `internal/tools/search.go` + +Responsibilities: + +* input validation +* embed query +* vector search +* output formatting + +### `internal/tools/list.go` + +Responsibilities: + +* filter normalization +* DB read +* output formatting + +### `internal/tools/stats.go` + +Responsibilities: + +* fetch/aggregate stats +* output shaping + +### `internal/tools/get.go` + +Responsibilities: + +* UUID validation +* single thought retrieval + +### `internal/tools/update.go` + +Responsibilities: + +* partial content/metadata update +* conditional re-embed if content changed +* metadata merge + +### `internal/tools/delete.go` + +Responsibilities: + +* UUID validation +* hard delete + +### `internal/tools/archive.go` + +Responsibilities: + +* UUID validation +* set `archived_at` + +### `internal/tools/projects.go` + +Responsibilities: + +* `create_project`, `list_projects` +* `set_active_project`, `get_active_project` (session context) + +### `internal/tools/context.go` + +Responsibilities: + +* `get_project_context`: resolve project, combine recency + semantic search, return context block +* update `last_active_at` on access + +### `internal/tools/summarize.go` + +Responsibilities: + +* filter/search thoughts +* format as prompt context 
+* call AI provider for prose summary + +### `internal/tools/recall.go` + +Responsibilities: + +* `recall_context`: semantic search + recency boost + project filter +* output formatted context block + +### `internal/tools/links.go` + +Responsibilities: + +* `link_thoughts`: validate both IDs, insert link +* `related_thoughts`: fetch explicit links + optional semantic neighbours, merge and return + +--- + +## Startup sequence + +1. parse CLI args +2. load config file +3. apply env overrides +4. validate config +5. initialize logger +6. create DB pool +7. verify DB requirements +8. create AI provider +9. create store +10. create tool handlers +11. register MCP tools +12. start HTTP server + +--- + +## Error handling policy + +### Fail fast on startup errors + +* invalid config +* DB unavailable +* missing required API keys +* invalid MCP config +* provider initialization failure + +### Retry policy for provider calls + +Retry on: + +* `429` +* `500` +* `502` +* `503` +* timeout +* connection reset + +Do not retry on: + +* malformed request +* auth failure +* invalid model name +* invalid response shape after repeated attempts + +Use: + +* exponential backoff +* capped retries +* context-aware cancellation + +--- + +## Observability + +### Logging + +Use `log/slog` in JSON mode. + +Include: + +* request ID +* route +* tool name +* key ID +* provider name +* DB latency +* upstream latency +* error class + +### Metrics + +Track: + +* request count by tool +* request duration +* provider call duration +* DB query duration +* auth failures +* provider failures +* insert/search counts + +### Health checks + +#### `/healthz` + +Returns OK if process is running. 
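+A stdlib sketch of both handlers (the `pingDB` function is a stand-in for the real `pgxpool` ping; the `httptest` calls just demonstrate the resulting status codes):
+
+```go
+package main
+
+import (
+	"context"
+	"fmt"
+	"net/http"
+	"net/http/httptest"
+	"time"
+)
+
+// healthz: process is up, always 200.
+func healthz(w http.ResponseWriter, r *http.Request) {
+	fmt.Fprintln(w, "ok")
+}
+
+// readyz: 200 only if dependencies answer; pingDB stands in for the
+// real pool's Ping in the production server.
+func readyz(pingDB func(context.Context) error) http.HandlerFunc {
+	return func(w http.ResponseWriter, r *http.Request) {
+		ctx, cancel := context.WithTimeout(r.Context(), 2*time.Second)
+		defer cancel()
+		if err := pingDB(ctx); err != nil {
+			http.Error(w, "db unavailable", http.StatusServiceUnavailable)
+			return
+		}
+		fmt.Fprintln(w, "ready")
+	}
+}
+
+func main() {
+	mux := http.NewServeMux()
+	mux.HandleFunc("/healthz", healthz)
+	mux.HandleFunc("/readyz", readyz(func(ctx context.Context) error {
+		return fmt.Errorf("pool not started") // simulate an unreachable DB
+	}))
+
+	srv := httptest.NewServer(mux)
+	defer srv.Close()
+
+	for _, path := range []string{"/healthz", "/readyz"} {
+		resp, err := http.Get(srv.URL + path)
+		if err != nil {
+			panic(err)
+		}
+		resp.Body.Close()
+		fmt.Println(path, resp.StatusCode)
+	}
+}
+```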
+ +#### `/readyz` + +Returns OK only if: + +* DB is reachable +* provider config is valid +* optional provider probe passes + +--- + +## Security plan + +### Secrets handling + +* keep secrets in config files only for local/dev use +* never commit real secrets +* use mounted secret files or env overrides in production + +### API key policy + +* support multiple keys +* identify keys by ID +* allow key rotation by config update + restart +* log only key ID, never raw value + +### Transport + +* run behind TLS terminator in production +* disable query-param auth in production +* avoid logging full URLs when query-param auth is enabled + +--- + +## Testing plan + +## Unit tests + +### Config + +* valid config loads +* invalid config fails +* env overrides apply correctly + +### Auth + +* header auth success +* query auth success +* invalid key rejected +* disabled query auth rejected + +### Metadata + +* normalization works +* invalid types default correctly +* empty metadata falls back safely + +### AI provider parsing + +* LiteLLM embeddings parse correctly +* LiteLLM chat completions parse correctly +* provider errors classified correctly + +### Store + +* filter builders generate expected SQL fragments +* JSONB metadata handling is stable + +--- + +## Integration tests + +Run against local Postgres with pgvector. + +Test: + +* migrations apply cleanly +* insert thought +* search thought +* list thoughts with filters +* stats aggregation +* auth-protected MCP route +* LiteLLM mock/proxy compatibility + +--- + +## Manual acceptance tests + +1. start local Postgres + pgvector +2. start LiteLLM +3. configure LiteLLM to route embeddings to OpenRouter +4. start Go server +5. connect MCP client +6. call `capture_thought` +7. call `search_thoughts` +8. call `list_thoughts` +9. call `thought_stats` +10. 
rotate API key and verify restart behavior
+
+---
+
+## Milestones
+
+## Milestone 1 — foundation
+
+Deliver:
+
+* repo skeleton
+* config loader
+* config validation
+* logger
+* DB connection
+* migrations
+
+Exit criteria:
+
+* app starts
+* DB connection verified
+* config-driven startup works
+
+---
+
+## Milestone 2 — AI provider layer
+
+Deliver:
+
+* provider interface
+* LiteLLM implementation
+* OpenRouter fallback implementation
+* metadata prompt
+* normalization
+
+Exit criteria:
+
+* successful embedding call through LiteLLM
+* successful metadata extraction through LiteLLM
+* vector length validation works
+
+---
+
+## Milestone 3 — capture and search
+
+Deliver:
+
+* `capture_thought`
+* `search_thoughts`
+* store methods for insert and vector search
+
+Exit criteria:
+
+* thoughts can be captured end-to-end
+* semantic search returns results
+
+---
+
+## Milestone 4 — remaining tools
+
+Deliver:
+
+* `list_thoughts`
+* `thought_stats`
+
+Exit criteria:
+
+* all four core tools (`capture_thought`, `search_thoughts`, `list_thoughts`, `thought_stats`) function through MCP
+
+---
+
+## Milestone 5 — extended memory and project tools
+
+Deliver:
+
+* `get_thought`, `update_thought`, `delete_thought`, `archive_thought`
+* `create_project`, `list_projects`, `set_active_project`, `get_active_project`
+* `get_project_context`, `recall_context`
+* `summarize_thoughts`
+* `link_thoughts`, `related_thoughts`
+* migrations 003 and 004 (projects + links tables)
+
+Exit criteria:
+
+* thoughts can be retrieved, patched, deleted, archived
+* projects can be created and listed
+* `get_project_context` returns a usable context block
+* `summarize_thoughts` produces a prose summary via the AI provider
+* thought links can be created and retrieved with semantic neighbours
+
+---
+
+## Milestone 6 — HTTP and auth hardening
+
+Deliver:
+
+* auth middleware
+* health endpoints
+* structured logs
+* retries
+* timeouts
+
+Exit criteria:
+
+* endpoint protected
+* logs useful
+* service stable under expected failures
+
+---
+
+## Milestone 7 — production readiness
+
+Deliver:
+
+* metrics
+* readiness checks
+* key rotation workflow
+* deployment docs
+
+Exit criteria:
+
+* production deployment is straightforward
+* operational playbook exists
+
+---
+
+## Implementation order
+
+Build in this order:
+
+1. config
+2. DB + migrations (001, 002, 003, 004)
+3. LiteLLM client
+4. metadata normalization
+5. `capture_thought`
+6. `search_thoughts`
+7. MCP HTTP server
+8. auth middleware
+9. `list_thoughts`
+10. `thought_stats`
+11. `get_thought`, `update_thought`, `delete_thought`, `archive_thought`
+12. `create_project`, `list_projects`, `set_active_project`, `get_active_project`
+13. `get_project_context`, `recall_context`
+14. `summarize_thoughts`
+15. `link_thoughts`, `related_thoughts`
+16. logs/metrics/health
+
+This gives usable value early and builds the project/memory layer on a solid foundation.
+
+---
+
+## Recommended local development stack
+
+### Services
+
+* Postgres with pgvector
+* LiteLLM proxy
+* optional OpenRouter upstream
+* Go service
+
+### Example shape
+
+```text
+docker-compose:
+  postgres
+  litellm
+  amcs
+```
+
+---
+
+## Recommended production deployment
+
+### Preferred architecture
+
+* Go service on Fly.io / Cloud Run / Render
+* LiteLLM as separate service
+* Postgres managed externally
+* TLS terminator in front
+
+### Why not Edge Functions
+
+The original repo uses Deno Edge Functions to fit its chosen deployment environment. This app keeps a DB pool, retries upstream calls, and serves health checks, so it is better suited to a conventional long-running Go service, which is also easier to maintain and observe.
+ +--- + +## Risks and decisions + +### Risk: embedding dimension mismatch + +Mitigation: + +* validate config vs DB on startup + +### Risk: LiteLLM model alias drift + +Mitigation: + +* add readiness probe for configured models + +### Risk: metadata extraction instability + +Mitigation: + +* strong normalization + safe defaults + +### Risk: single global auth model + +Mitigation: + +* acceptable for v1 +* redesign for multi-tenant later + +### Risk: stats scaling poorly + +Mitigation: + +* start with in-memory aggregation +* move to SQL aggregation if needed + +--- + +## Definition of done for v1 + +The project is done when: + +* service starts from YAML config +* LiteLLM is the primary AI provider +* OpenRouter can be used behind LiteLLM +* direct OpenRouter mode still works +* MCP endpoint is authenticated +* `capture_thought` stores content, embedding, metadata +* `search_thoughts` performs semantic search +* `list_thoughts` supports filtering +* `thought_stats` returns useful summaries +* thoughts can be retrieved, updated, deleted, and archived +* projects can be created, listed, and used to scope captures and searches +* `get_project_context` returns a ready-to-use context block +* `recall_context` returns a semantically relevant + recent context block +* `summarize_thoughts` produces prose summaries via the AI provider +* thought links can be created and traversed with semantic fallback +* logs and health checks exist +* key rotation works via config + restart + +--- + +## Recommendation + +Build this as a **boring Go service**: + +* stdlib HTTP +* thin MCP server layer +* thin provider layer +* thin store layer +* YAML config +* explicit interfaces + +Do not over-abstract it. The product shape is simple. The goal is a reliable, understandable service with LiteLLM as the stable provider boundary. 
+
+---
+
+## Next implementation artifact
+
+The next concrete deliverable should be a **starter Go repo skeleton** containing:
+
+* `go.mod`
+* folder structure
+* `main.go`
+* config loader
+* example config
+* migration files (001–004)
+* provider interface
+* LiteLLM client skeleton
+* store interfaces (`ThoughtStore`, `ProjectStore`, `LinkStore`)
+* domain types including `Project`, `ThoughtLink`, `ThoughtPatch`
+* MCP tool registration stubs for all 17 tools