Avalon Memory Crystal Server (amcs)
OB1 in Go — LiteLLM-First Implementation Plan
Based on the Open Brain project; reference it for detail: https://github.com/NateBJones-Projects/OB1
Objective
Build a Go implementation of the OB1 project with:
- LiteLLM as the primary AI provider
- OpenRouter as the default upstream behind LiteLLM
- config-file-based keys and auth tokens
- MCP over Streamable HTTP
- Postgres with pgvector
- parity with the current OB1 toolset: search_thoughts, list_thoughts, thought_stats, capture_thought
- extended toolset for memory and project management: get_thought, update_thought, delete_thought, archive_thought, create_project, list_projects, get_project_context, set_active_project, get_active_project, summarize_thoughts, recall_context, link_thoughts, related_thoughts
The current OB1 reference implementation is a small MCP server backed by a thoughts table in Postgres, a match_thoughts(...) vector-search function, and OpenRouter calls for embeddings plus metadata extraction.
Why LiteLLM should be the primary provider
LiteLLM is the right primary abstraction because it gives one stable OpenAI-compatible API surface while allowing routing to multiple upstream providers, including OpenRouter. LiteLLM documents OpenAI-compatible proxy endpoints, including /v1/embeddings, and explicitly supports OpenRouter-backed models.
That gives us:
- one provider contract in the Go app
- centralized key management
- easier model swaps
- support for multiple upstreams later
- simpler production operations
Provider strategy
Primary runtime mode
- App -> LiteLLM Proxy -> OpenRouter / other providers
Fallback mode
- App -> OpenRouter directly
We will support both, but the codebase will be designed LiteLLM-first.
Scope of the Go service
The Go service will provide:
- MCP server over Streamable HTTP
- API-key authentication
- thought capture
- semantic search
- thought listing
- thought statistics
- thought lifecycle management (get, update, delete, archive)
- project grouping and context (create, list, get context, active project)
- memory summarization and context recall
- thought linking and relationship traversal
- provider abstraction for embeddings + metadata extraction
- config-file-driven startup
Functional requirements
Required v1 features
- start from YAML config
- connect to Postgres
- use pgvector for embeddings
- call LiteLLM for:
  - embeddings
  - metadata extraction
- expose MCP tools over HTTP
- protect the MCP endpoint with configured API keys
- preserve OB1-compatible tool semantics
- store thoughts with metadata and embeddings
- search via match_thoughts(...)
Deferred features
- Slack ingestion
- webhook ingestion
- async metadata extraction
- per-user tenancy
- admin UI
- background enrichment jobs
- multi-provider routing policy inside the app
Reference behavior to preserve
The current OB1 server is small and direct:
- it stores thoughts in a thoughts table
- it uses vector similarity for semantic search
- it exposes four MCP tools
- it uses an access key for auth
- it generates embeddings and extracts metadata via OpenRouter
The Go version should preserve those behaviors first, then improve structure and operability.
Architecture
+----------------------+
| MCP Client / AI App |
+----------+-----------+
|
| Streamable HTTP
v
+----------------------+
| Go OB1 Server |
| auth + MCP tools |
+----+-----------+-----+
| |
| |
v v
+----------------+ +----------------------+
| LiteLLM Proxy | | Postgres + pgvector |
| embeddings | | thoughts + pgvector |
| metadata | | RPC/search SQL |
+--------+-------+ +----------------------+
|
v
+-------------+
| OpenRouter |
| or others |
+-------------+
High-level design
Core components
- Config subsystem
  - load YAML
  - apply env overrides
  - validate required fields
- Auth subsystem
  - API-key validation
  - header-based auth
  - optional query-param auth
- AI provider subsystem
  - provider interface
  - LiteLLM implementation
  - optional OpenRouter direct implementation
- Store subsystem
  - Postgres connection pool
  - insert/search/list/stats operations
  - pgvector support
- MCP subsystem
  - MCP server
  - tool registration
  - HTTP transport
- Observability subsystem
  - structured logs
  - metrics
  - health checks
Project layout
ob1-go/
cmd/
ob1-server/
main.go
internal/
app/
app.go
config/
config.go
loader.go
validate.go
auth/
middleware.go
keyring.go
ai/
provider.go
factory.go
prompts.go
types.go
litellm/
client.go
embeddings.go
metadata.go
openrouter/
client.go
embeddings.go
metadata.go
mcpserver/
server.go
transport.go
tools/
search.go
list.go
stats.go
capture.go
get.go
update.go
delete.go
archive.go
projects.go
context.go
summarize.go
recall.go
links.go
store/
db.go
thoughts.go
stats.go
projects.go
links.go
metadata/
schema.go
normalize.go
validate.go
types/
thought.go
filters.go
observability/
logger.go
metrics.go
tracing.go
migrations/
001_enable_vector.sql
002_create_thoughts.sql
003_add_projects.sql
004_create_thought_links.sql
005_create_match_thoughts.sql
006_rls_and_grants.sql
configs/
config.example.yaml
dev.yaml
scripts/
run-local.sh
migrate.sh
go.mod
README.md
Dependencies
Required Go packages
- github.com/modelcontextprotocol/go-sdk
- github.com/jackc/pgx/v5
- github.com/pgvector/pgvector-go
- gopkg.in/yaml.v3
- github.com/go-playground/validator/v10
- github.com/google/uuid
Standard library usage
net/http, context, log/slog, time, encoding/json
The Go MCP SDK is the right fit for implementing an MCP server in Go, and pgvector-go is the expected library for Go integration with pgvector-backed Postgres columns.
Config model
Config files are the primary source of truth.
Rules
- use YAML config files
- allow environment overrides
- do not commit real secrets
- commit only config.example.yaml
- keep local secrets in ignored files
- in production, mount config files as secrets or use env overrides for sensitive values
Example config
server:
host: "0.0.0.0"
port: 8080
read_timeout: "15s"
write_timeout: "30s"
idle_timeout: "60s"
allowed_origins:
- "*"
mcp:
path: "/mcp"
server_name: "open-brain"
version: "1.0.0"
transport: "streamable_http"
auth:
mode: "api_keys"
header_name: "x-brain-key"
query_param: "key"
allow_query_param: false
keys:
- id: "local-client"
value: "replace-me"
description: "main local client key"
database:
url: "postgres://user:pass@localhost:5432/ob1?sslmode=disable"
max_conns: 10
min_conns: 2
max_conn_lifetime: "30m"
max_conn_idle_time: "10m"
ai:
provider: "litellm"
embeddings:
model: "openai/text-embedding-3-small"
dimensions: 1536
metadata:
model: "gpt-4o-mini"
temperature: 0.1
litellm:
base_url: "http://localhost:4000/v1"
api_key: "replace-me"
use_responses_api: false
request_headers: {}
embedding_model: "openrouter/openai/text-embedding-3-small"
metadata_model: "gpt-4o-mini"
openrouter:
base_url: "https://openrouter.ai/api/v1"
api_key: ""
app_name: "ob1-go"
site_url: ""
extra_headers: {}
capture:
source: "mcp"
metadata_defaults:
type: "observation"
topic_fallback: "uncategorized"
search:
default_limit: 10
default_threshold: 0.5
max_limit: 50
logging:
level: "info"
format: "json"
observability:
metrics_enabled: true
pprof_enabled: false
Config structs
type Config struct {
Server ServerConfig `yaml:"server"`
MCP MCPConfig `yaml:"mcp"`
Auth AuthConfig `yaml:"auth"`
Database DatabaseConfig `yaml:"database"`
AI AIConfig `yaml:"ai"`
Capture CaptureConfig `yaml:"capture"`
Search SearchConfig `yaml:"search"`
Logging LoggingConfig `yaml:"logging"`
Observability ObservabilityConfig `yaml:"observability"`
}
type ServerConfig struct {
Host string `yaml:"host"`
Port int `yaml:"port"`
ReadTimeout time.Duration `yaml:"read_timeout"`
WriteTimeout time.Duration `yaml:"write_timeout"`
IdleTimeout time.Duration `yaml:"idle_timeout"`
AllowedOrigins []string `yaml:"allowed_origins"`
}
type MCPConfig struct {
Path string `yaml:"path"`
ServerName string `yaml:"server_name"`
Version string `yaml:"version"`
Transport string `yaml:"transport"`
}
type AuthConfig struct {
Mode string `yaml:"mode"`
HeaderName string `yaml:"header_name"`
QueryParam string `yaml:"query_param"`
AllowQueryParam bool `yaml:"allow_query_param"`
Keys []APIKey `yaml:"keys"`
}
type APIKey struct {
ID string `yaml:"id"`
Value string `yaml:"value"`
Description string `yaml:"description"`
}
type DatabaseConfig struct {
URL string `yaml:"url"`
MaxConns int32 `yaml:"max_conns"`
MinConns int32 `yaml:"min_conns"`
MaxConnLifetime time.Duration `yaml:"max_conn_lifetime"`
MaxConnIdleTime time.Duration `yaml:"max_conn_idle_time"`
}
type AIConfig struct {
Provider string `yaml:"provider"` // litellm | openrouter
Embeddings AIEmbeddingConfig `yaml:"embeddings"`
Metadata AIMetadataConfig `yaml:"metadata"`
LiteLLM LiteLLMConfig `yaml:"litellm"`
OpenRouter OpenRouterAIConfig `yaml:"openrouter"`
}
type AIEmbeddingConfig struct {
Model string `yaml:"model"`
Dimensions int `yaml:"dimensions"`
}
type AIMetadataConfig struct {
Model string `yaml:"model"`
Temperature float64 `yaml:"temperature"`
}
type LiteLLMConfig struct {
BaseURL string `yaml:"base_url"`
APIKey string `yaml:"api_key"`
UseResponsesAPI bool `yaml:"use_responses_api"`
RequestHeaders map[string]string `yaml:"request_headers"`
EmbeddingModel string `yaml:"embedding_model"`
MetadataModel string `yaml:"metadata_model"`
}
type OpenRouterAIConfig struct {
BaseURL string `yaml:"base_url"`
APIKey string `yaml:"api_key"`
AppName string `yaml:"app_name"`
SiteURL string `yaml:"site_url"`
ExtraHeaders map[string]string `yaml:"extra_headers"`
}
Config precedence
Order
- --config /path/to/file.yaml
- OB1_CONFIG
- default ./configs/dev.yaml
- environment overrides for specific fields
Suggested env overrides
- OB1_DATABASE_URL
- OB1_LITELLM_API_KEY
- OB1_OPENROUTER_API_KEY
- OB1_SERVER_PORT
Validation rules
At startup, fail fast if:
- database.url is empty
- auth.keys is empty
- mcp.path is empty
- ai.provider is unsupported
- ai.embeddings.dimensions <= 0
- provider-specific base URL or API key is missing
- the DB vector dimension does not match configured embedding dimensions
AI provider design
Provider interface
type Provider interface {
Embed(ctx context.Context, input string) ([]float32, error)
ExtractMetadata(ctx context.Context, input string) (ThoughtMetadata, error)
Name() string
}
Factory
func NewProvider(cfg AIConfig, httpClient *http.Client, log *slog.Logger) (Provider, error) {
switch cfg.Provider {
case "litellm":
return litellm.New(cfg, httpClient, log)
case "openrouter":
return openrouter.New(cfg, httpClient, log)
default:
return nil, fmt.Errorf("unsupported ai.provider: %s", cfg.Provider)
}
}
LiteLLM-first behavior
Embeddings
The app will call LiteLLM at:
POST /v1/embeddings
using an OpenAI-compatible request payload and Bearer auth. LiteLLM documents its proxy embeddings support through OpenAI-compatible endpoints.
Metadata extraction
The app will call LiteLLM at:
POST /v1/chat/completions
using:
- configured metadata model
- system prompt
- user message
- JSON-oriented response handling
LiteLLM’s proxy is intended to accept OpenAI-style chat completion requests.
Model routing
In config, use:
- litellm.embedding_model
- litellm.metadata_model
These may be:
- direct model names
- LiteLLM aliases
- OpenRouter-backed model identifiers
Example:
litellm:
embedding_model: "openrouter/openai/text-embedding-3-small"
metadata_model: "gpt-4o-mini"
LiteLLM documents OpenRouter provider usage and OpenRouter-backed model naming.
OpenRouter fallback mode
If ai.provider: openrouter, the app will directly call:
- POST /api/v1/embeddings
- POST /api/v1/chat/completions
with Bearer auth.
OpenRouter documents the embeddings endpoint and its authentication model.
This mode is mainly for:
- local simplicity
- debugging provider issues
- deployments without LiteLLM
Metadata schema
Use one stable metadata schema regardless of provider.
type ThoughtMetadata struct {
People []string `json:"people"`
ActionItems []string `json:"action_items"`
DatesMentioned []string `json:"dates_mentioned"`
Topics []string `json:"topics"`
Type string `json:"type"`
Source string `json:"source"`
}
Accepted type values
observation, task, idea, reference, person_note
Normalization rules
- trim all strings
- deduplicate arrays
- drop empty values
- cap topics count if needed
- default invalid type to observation
- set source: "mcp" for MCP-captured thoughts
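The trim/dedupe/default rules above could look like this in Go; the helper names are illustrative, and the topic cap is left to the caller:

```go
package main

import (
	"fmt"
	"strings"
)

// allowedTypes is the accepted set of thought types from the metadata schema.
var allowedTypes = map[string]bool{
	"observation": true, "task": true, "idea": true,
	"reference": true, "person_note": true,
}

// normalizeList trims entries, drops empties, and deduplicates while
// preserving first-seen order.
func normalizeList(in []string) []string {
	seen := map[string]bool{}
	out := make([]string, 0, len(in))
	for _, s := range in {
		s = strings.TrimSpace(s)
		if s == "" || seen[s] {
			continue
		}
		seen[s] = true
		out = append(out, s)
	}
	return out
}

// normalizeType defaults anything outside the accepted set to "observation".
func normalizeType(t string) string {
	t = strings.TrimSpace(t)
	if !allowedTypes[t] {
		return "observation"
	}
	return t
}

func main() {
	fmt.Println(normalizeList([]string{" alice ", "bob", "alice", ""}))
	fmt.Println(normalizeType("rant"))
}
```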
Fallback defaults
If metadata extraction fails:
{
"people": [],
"action_items": [],
"dates_mentioned": [],
"topics": ["uncategorized"],
"type": "observation",
"source": "mcp"
}
Database design
The DB contract should match the current OB1 structure as closely as possible:
- thoughts table
- embedding vector(1536)
- HNSW index
- metadata JSONB
- match_thoughts(...) function
Migrations
001_enable_vector.sql
create extension if not exists vector;
002_create_thoughts.sql
create table if not exists thoughts (
id uuid default gen_random_uuid() primary key,
content text not null,
embedding vector(1536),
metadata jsonb default '{}'::jsonb,
created_at timestamptz default now(),
updated_at timestamptz default now()
);
create index if not exists thoughts_embedding_hnsw_idx
on thoughts using hnsw (embedding vector_cosine_ops);
create index if not exists thoughts_metadata_gin_idx
on thoughts using gin (metadata);
create index if not exists thoughts_created_at_idx
on thoughts (created_at desc);
003_add_projects.sql
create table if not exists projects (
id uuid default gen_random_uuid() primary key,
name text not null unique,
description text,
created_at timestamptz default now(),
last_active_at timestamptz default now()
);
alter table thoughts add column if not exists project_id uuid references projects(id);
alter table thoughts add column if not exists archived_at timestamptz;
create index if not exists thoughts_project_id_idx on thoughts (project_id);
create index if not exists thoughts_archived_at_idx on thoughts (archived_at);
004_create_thought_links.sql
create table if not exists thought_links (
from_id uuid references thoughts(id) on delete cascade,
to_id uuid references thoughts(id) on delete cascade,
relation text not null,
created_at timestamptz default now(),
primary key (from_id, to_id, relation)
);
create index if not exists thought_links_from_idx on thought_links (from_id);
create index if not exists thought_links_to_idx on thought_links (to_id);
005_create_match_thoughts.sql
create or replace function match_thoughts(
query_embedding vector(1536),
match_threshold float default 0.7,
match_count int default 10,
filter jsonb default '{}'::jsonb
)
returns table (
id uuid,
content text,
metadata jsonb,
similarity float,
created_at timestamptz
)
language plpgsql
as $$
begin
return query
select
t.id,
t.content,
t.metadata,
1 - (t.embedding <=> query_embedding) as similarity,
t.created_at
from thoughts t
where 1 - (t.embedding <=> query_embedding) > match_threshold
and (filter = '{}'::jsonb or t.metadata @> filter)
order by t.embedding <=> query_embedding
limit match_count;
end;
$$;
006_rls_and_grants.sql
-- Grant full access to the application database user configured in database.url.
-- Replace 'ob1_user' with the actual role name used in your database.url.
grant select, insert, update, delete on table public.thoughts to ob1_user;
grant select, insert, update, delete on table public.projects to ob1_user;
grant select, insert, update, delete on table public.thought_links to ob1_user;
Store layer
Interfaces
type ThoughtStore interface {
InsertThought(ctx context.Context, thought Thought) error
GetThought(ctx context.Context, id uuid.UUID) (Thought, error)
UpdateThought(ctx context.Context, id uuid.UUID, patch ThoughtPatch) (Thought, error)
DeleteThought(ctx context.Context, id uuid.UUID) error
ArchiveThought(ctx context.Context, id uuid.UUID) error
SearchThoughts(ctx context.Context, embedding []float32, threshold float64, limit int, filter map[string]any) ([]SearchResult, error)
ListThoughts(ctx context.Context, filter ListFilter) ([]Thought, error)
Stats(ctx context.Context) (ThoughtStats, error)
}
type ProjectStore interface {
InsertProject(ctx context.Context, project Project) error
GetProject(ctx context.Context, nameOrID string) (Project, error)
ListProjects(ctx context.Context) ([]ProjectSummary, error)
TouchProject(ctx context.Context, id uuid.UUID) error
}
type LinkStore interface {
InsertLink(ctx context.Context, link ThoughtLink) error
GetLinks(ctx context.Context, thoughtID uuid.UUID) ([]ThoughtLink, error)
}
DB implementation notes
Use pgxpool.Pool.
On startup:
- parse DB config
- create pool
- register pgvector support
- ping DB
- verify required function exists
- verify vector extension exists
Domain types
type Thought struct {
ID uuid.UUID
Content string
Embedding []float32
Metadata ThoughtMetadata
ProjectID *uuid.UUID
ArchivedAt *time.Time
CreatedAt time.Time
UpdatedAt time.Time
}
type SearchResult struct {
ID uuid.UUID
Content string
Metadata ThoughtMetadata
Similarity float64
CreatedAt time.Time
}
type ListFilter struct {
Limit int
Type string
Topic string
Person string
Days int
ProjectID *uuid.UUID
IncludeArchived bool
}
type ThoughtStats struct {
TotalCount int
TypeCounts map[string]int
TopTopics []KeyCount
TopPeople []KeyCount
}
type ThoughtPatch struct {
Content *string
Metadata *ThoughtMetadata
}
type Project struct {
ID uuid.UUID
Name string
Description string
CreatedAt time.Time
LastActiveAt time.Time
}
type ProjectSummary struct {
Project
ThoughtCount int
}
type ThoughtLink struct {
FromID uuid.UUID
ToID uuid.UUID
Relation string
CreatedAt time.Time
}
Auth design
The reference OB1 implementation uses a configured access key and accepts it via header or query param.
We will keep compatibility but make it cleaner.
Auth behavior
- primary auth via header, default x-brain-key
- optional query param fallback
- support multiple keys in config
- attach key ID to request context for auditing
Middleware flow
- read configured header
- if missing and allowed, read query param
- compare against in-memory keyring
- if matched, attach key ID to request context
- else return 401 Unauthorized
Recommendation
Set:
auth:
allow_query_param: false
for production.
MCP server design
Expose MCP over Streamable HTTP. MCP’s spec defines Streamable HTTP as the remote transport replacing the older HTTP+SSE approach.
HTTP routes
- POST /mcp
- GET /healthz
- GET /readyz
Middleware stack
- request ID
- panic recovery
- structured logging
- auth
- timeout
- optional CORS
MCP tools
1. capture_thought
Input
content string
Flow
- validate content
- concurrently:
  - call provider Embed
  - call provider ExtractMetadata
- normalize metadata
- set source = "mcp"
- insert into thoughts
- return success payload
2. search_thoughts
Input
query string, limit int, threshold float
Flow
- embed query
- call match_thoughts(...)
- format ranked results
- return results
3. list_thoughts
Input
limit, type, topic, person, days
Flow
- build SQL filters
- query thoughts
- order by created_at desc
- return summaries
4. thought_stats
Input
- none
Flow
- count rows
- aggregate metadata usage
- return totals and top buckets
5. get_thought
Input
id string
Flow
- validate UUID
- query thoughts by ID
- return full record or not-found error
6. update_thought
Input
- id string
- content string (optional)
- metadata map (optional; merged, not replaced)
Flow
- validate inputs
- if content provided: re-embed and re-extract metadata
- merge metadata patch
- update row, set updated_at
- return updated record
7. delete_thought
Input
id string
Flow
- validate UUID
- hard-delete row (cascades to thought_links)
- return confirmation
8. archive_thought
Input
id string
Flow
- validate UUID
- set archived_at = now()
- return confirmation
Note: archived thoughts are excluded from search and list results by default unless include_archived: true is passed.
9. create_project
Input
- name string
- description string (optional)
Flow
- validate name uniqueness
- insert into projects
- return project record
10. list_projects
Input
- none
Flow
- query projects ordered by last_active_at desc
- join thought counts per project
- return summaries
11. get_project_context
Input
- project string (name or ID)
- query string (optional, semantic focus)
- limit int
Flow
- resolve project
- fetch recent thoughts in project (last N)
- if query provided: semantic search scoped to project
- merge and deduplicate results ranked by recency + similarity
- update projects.last_active_at
- return context block ready for injection
12. set_active_project
Input
project string (name or ID)
Flow
- resolve project
- store project ID in server session context (in-memory, per connection)
- return confirmation
13. get_active_project
Input
- none
Flow
- return current session active project or null
14. summarize_thoughts
Input
- query string (optional topic focus)
- project string (optional)
- days int (optional time window)
- limit int
Flow
- fetch matching thoughts via search or filter
- format as context
- call AI provider to produce prose summary
- return summary text
15. recall_context
Input
- query string
- project string (optional)
- limit int
Flow
- semantic search with optional project filter
- recency boost: merge with most recent N thoughts from project
- deduplicate and rank
- return formatted context block suitable for pasting into a new conversation
16. link_thoughts
Input
- from_id string
- to_id string
- relation string (e.g. follows_up, contradicts, references, blocks)
Flow
- validate both IDs exist
- insert into thought_links
- return confirmation
17. related_thoughts
Input
- id string
- include_semantic bool (default true)
Flow
- fetch explicit links from thought_links for this ID
- if include_semantic: also fetch nearest semantic neighbours
- merge, deduplicate, return with relation type or similarity score
Tool package plan
internal/tools/capture.go
Responsibilities:
- input validation
- parallel embed + metadata extraction
- normalization
- write to store
internal/tools/search.go
Responsibilities:
- input validation
- embed query
- vector search
- output formatting
internal/tools/list.go
Responsibilities:
- filter normalization
- DB read
- output formatting
internal/tools/stats.go
Responsibilities:
- fetch/aggregate stats
- output shaping
internal/tools/get.go
Responsibilities:
- UUID validation
- single thought retrieval
internal/tools/update.go
Responsibilities:
- partial content/metadata update
- conditional re-embed if content changed
- metadata merge
internal/tools/delete.go
Responsibilities:
- UUID validation
- hard delete
internal/tools/archive.go
Responsibilities:
- UUID validation
- set
archived_at
internal/tools/projects.go
Responsibilities:
- create_project, list_projects
- set_active_project, get_active_project (session context)
internal/tools/context.go
Responsibilities:
- get_project_context: resolve project, combine recency + semantic search, return context block
- update last_active_at on access
internal/tools/summarize.go
Responsibilities:
- filter/search thoughts
- format as prompt context
- call AI provider for prose summary
internal/tools/recall.go
Responsibilities:
- recall_context: semantic search + recency boost + project filter
- output formatted context block
internal/tools/links.go
Responsibilities:
- link_thoughts: validate both IDs, insert link
- related_thoughts: fetch explicit links + optional semantic neighbours, merge and return
Startup sequence
- parse CLI args
- load config file
- apply env overrides
- validate config
- initialize logger
- create DB pool
- verify DB requirements
- create AI provider
- create store
- create tool handlers
- register MCP tools
- start HTTP server
Error handling policy
Fail fast on startup errors
- invalid config
- DB unavailable
- missing required API keys
- invalid MCP config
- provider initialization failure
Retry policy for provider calls
Retry on:
- 429
- 500
- 502
- 503
- connection reset
Do not retry on:
- malformed request
- auth failure
- invalid model name
- invalid response shape after repeated attempts
Use:
- exponential backoff
- capped retries
- context-aware cancellation
Observability
Logging
Use log/slog in JSON mode.
Include:
- request ID
- route
- tool name
- key ID
- provider name
- DB latency
- upstream latency
- error class
Metrics
Track:
- request count by tool
- request duration
- provider call duration
- DB query duration
- auth failures
- provider failures
- insert/search counts
Health checks
/healthz
Returns OK if process is running.
/readyz
Returns OK only if:
- DB is reachable
- provider config is valid
- optional provider probe passes
Security plan
Secrets handling
- keep secrets in config files only for local/dev use
- never commit real secrets
- use mounted secret files or env overrides in production
API key policy
- support multiple keys
- identify keys by ID
- allow key rotation by config update + restart
- log only key ID, never raw value
Transport
- run behind TLS terminator in production
- disable query-param auth in production
- avoid logging full URLs when query-param auth is enabled
Testing plan
Unit tests
Config
- valid config loads
- invalid config fails
- env overrides apply correctly
Auth
- header auth success
- query auth success
- invalid key rejected
- disabled query auth rejected
Metadata
- normalization works
- invalid types default correctly
- empty metadata falls back safely
AI provider parsing
- LiteLLM embeddings parse correctly
- LiteLLM chat completions parse correctly
- provider errors classified correctly
Store
- filter builders generate expected SQL fragments
- JSONB metadata handling is stable
Integration tests
Run against local Postgres with pgvector.
Test:
- migrations apply cleanly
- insert thought
- search thought
- list thoughts with filters
- stats aggregation
- auth-protected MCP route
- LiteLLM mock/proxy compatibility
Manual acceptance tests
- start local Postgres + pgvector
- start LiteLLM
- configure LiteLLM to route embeddings to OpenRouter
- start Go server
- connect MCP client
- call capture_thought
- call search_thoughts
- call list_thoughts
- call thought_stats
- rotate API key and verify restart behavior
Milestones
Milestone 1 — foundation
Deliver:
- repo skeleton
- config loader
- config validation
- logger
- DB connection
- migrations
Exit criteria:
- app starts
- DB connection verified
- config-driven startup works
Milestone 2 — AI provider layer
Deliver:
- provider interface
- LiteLLM implementation
- OpenRouter fallback implementation
- metadata prompt
- normalization
Exit criteria:
- successful embedding call through LiteLLM
- successful metadata extraction through LiteLLM
- vector length validation works
Milestone 3 — capture and search
Deliver:
- capture_thought
- search_thoughts
- store methods for insert and vector search
Exit criteria:
- thoughts can be captured end-to-end
- semantic search returns results
Milestone 4 — remaining tools
Deliver:
- list_thoughts
- thought_stats
Exit criteria:
- all four tools function through MCP
Milestone 5 — extended memory and project tools
Deliver:
- get_thought, update_thought, delete_thought, archive_thought
- create_project, list_projects, set_active_project, get_active_project
- get_project_context, recall_context
- summarize_thoughts
- link_thoughts, related_thoughts
- migrations 003 and 004 (projects + links tables)
Exit criteria:
- thoughts can be retrieved, patched, deleted, archived
- projects can be created and listed
- get_project_context returns a usable context block
- summarize_thoughts produces a prose summary via the AI provider
- thought links can be created and retrieved with semantic neighbours
Milestone 6 — HTTP and auth hardening
Deliver:
- auth middleware
- health endpoints
- structured logs
- retries
- timeouts
Exit criteria:
- endpoint protected
- logs useful
- service stable under expected failures
Milestone 7 — production readiness
Deliver:
- metrics
- readiness checks
- key rotation workflow
- deployment docs
Exit criteria:
- production deployment is straightforward
- operational playbook exists
Implementation order
Build in this order:
- config
- DB + migrations (001, 002, 003, 004)
- LiteLLM client
- metadata normalization
- capture_thought
- search_thoughts
- MCP HTTP server
- auth middleware
- list_thoughts
- thought_stats
- get_thought, update_thought, delete_thought, archive_thought
- create_project, list_projects, set_active_project, get_active_project
- get_project_context, recall_context
- summarize_thoughts
- link_thoughts, related_thoughts
- logs/metrics/health
This gives usable value early and builds the project/memory layer on a solid foundation.
Recommended local development stack
Services
- Postgres with pgvector
- LiteLLM proxy
- optional OpenRouter upstream
- Go service
Example shape
docker-compose:
postgres
litellm
ob1-go
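A hedged example of that compose shape; the image tags, ports, and environment variable names here are assumptions and would need adjusting to the actual setup:

```yaml
services:
  postgres:
    image: pgvector/pgvector:pg16   # assumed image; any Postgres with pgvector works
    environment:
      POSTGRES_DB: ob1
      POSTGRES_USER: ob1_user
      POSTGRES_PASSWORD: ob1_pass
    ports:
      - "5432:5432"
  litellm:
    image: ghcr.io/berriai/litellm:main-latest   # assumed tag
    volumes:
      - ./configs/litellm.yaml:/app/config.yaml:ro
    ports:
      - "4000:4000"
  ob1-go:
    build: .
    environment:
      OB1_DATABASE_URL: postgres://ob1_user:ob1_pass@postgres:5432/ob1?sslmode=disable
      OB1_LITELLM_API_KEY: replace-me
    depends_on: [postgres, litellm]
    ports:
      - "8080:8080"
```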
Recommended production deployment
Preferred architecture
- Go service on Fly.io / Cloud Run / Render
- LiteLLM as separate service
- Postgres managed externally
- TLS terminator in front
Why not Edge Functions
The original repo uses Deno Edge Functions because of its deployment environment, but this app's behavior is better suited to a conventional long-running Go service for maintainability and observability.
Risks and decisions
Risk: embedding dimension mismatch
Mitigation:
- validate config vs DB on startup
Risk: LiteLLM model alias drift
Mitigation:
- add readiness probe for configured models
Risk: metadata extraction instability
Mitigation:
- strong normalization + safe defaults
Risk: single global auth model
Mitigation:
- acceptable for v1
- redesign for multi-tenant later
Risk: stats scaling poorly
Mitigation:
- start with in-memory aggregation
- move to SQL aggregation if needed
Definition of done for v1
The project is done when:
- service starts from YAML config
- LiteLLM is the primary AI provider
- OpenRouter can be used behind LiteLLM
- direct OpenRouter mode still works
- MCP endpoint is authenticated
- capture_thought stores content, embedding, metadata
- search_thoughts performs semantic search
- list_thoughts supports filtering
- thought_stats returns useful summaries
- thoughts can be retrieved, updated, deleted, and archived
- projects can be created, listed, and used to scope captures and searches
- get_project_context returns a ready-to-use context block
- recall_context returns a semantically relevant + recent context block
- summarize_thoughts produces prose summaries via the AI provider
- thought links can be created and traversed with semantic fallback
- logs and health checks exist
- key rotation works via config + restart
Recommendation
Build this as a boring Go service:
- stdlib HTTP
- thin MCP server layer
- thin provider layer
- thin store layer
- YAML config
- explicit interfaces
Do not over-abstract it. The product shape is simple. The goal is a reliable, understandable service with LiteLLM as the stable provider boundary.
Next implementation artifact
The next concrete deliverable should be a starter Go repo skeleton containing:
- go.mod
- folder structure
- main.go
- config loader
- example config
- migration files (001–006)
- provider interface
- LiteLLM client skeleton
- store interfaces (ThoughtStore, ProjectStore, LinkStore)
- domain types including Project, ThoughtLink, ThoughtPatch
- MCP tool registration stubs for all 17 tools