Files
amcs/llm/core_tools.md
Hein 1f671dad54
Some checks failed
CI / build-and-test (push) Failing after -31m39s
feat(personas): add implementation plan for agent personas
2026-05-03 22:18:40 +02:00

264 lines
7.4 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# AMCS Core Tools — Implementation Plan
Tools that expose runtime context to agents and provide general-purpose capabilities: time, location, connectivity, health, data quality, credentials, background jobs, change detection, HTTP, computation, secrets, webhooks, and context management.
---
## 1. Date / Time / Timezone
**Tool:** `get_datetime`
- Return current UTC time, local server time, and timezone identifier (IANA)
- Include Unix timestamp, ISO 8601 string, day-of-week, week number
- Relative helpers: is_business_hours, start_of_day, start_of_week (relative to server TZ)
---
## 2. Location
**Tool:** `get_location`
Three scopes returned in one call:
| Scope | Data |
|-------|------|
| Server | hostname, OS, datacenter/cloud region if detectable |
| Project | active project name, working directory, git repo root, current branch, last commit hash+message |
| User | IP-derived city/country (GeoIP), configurable home location override in config |
---
## 3. Connectivity & Health
**Tool:** `get_health`
- AMCS server version and uptime
- Database connection status and round-trip latency (ms)
- pgvector extension present and version
- LiteLLM / OpenRouter reachability (HEAD ping, latency, last-ok timestamp)
- Configured integration endpoints (n8n, Telegram, etc.) — status and last-ok timestamp
- Overall status: `ok` / `degraded` / `down`
---
## 4. Data Quality & Maintenance
**Tool:** `get_data_quality`
- Total thoughts, plans, learnings, projects
- Thoughts missing embeddings (count + oldest)
- Failed metadata reparse queue depth
- Orphaned records: thoughts with no project, broken thought links, plans with missing dependencies
- Last embedding backfill run timestamp and result
---
## 5. Security & Credentials
**Tool:** `get_credential_status`
- Per-configured integration: name, valid/expired boolean, days until expiry (no values exposed)
- Last successful authentication timestamp per integration
- AMCS token expiry if applicable
---
## 6. Background Job Status
**Tool:** `get_job_status`
- List of known async job types (embedding backfill, metadata reparse, file indexing)
- Per job: status (idle/running/failed), last run timestamp, last run duration, error if failed
- Pending queue depth per job type
---
## 7. Change Detection
**Tool:** `get_changes`
Parameters:
- `since` — ISO 8601 timestamp or relative shorthand (`1h`, `24h`, `7d`)
- `scope``all` | `thoughts` | `plans` | `learnings` | `projects`
Returns:
- Counts of created, updated, deleted per entity type since the given time
- Top 5 most recently modified items per type (id, title, updated_at)
- Active project changes highlighted separately
---
---
## 8. HTTP Client
**Tool:** `http_request`
Parameters:
- `method` — GET | POST | PUT | PATCH | DELETE
- `url` — target URL
- `headers` — optional map of header key/value pairs
- `body` — optional string body (JSON, form, etc.)
- `timeout_seconds` — default 30
Returns: status code, response headers, body (truncated at configurable limit), latency (ms)
Restrictions:
- Blocklist for private/internal IP ranges (configurable allow-override)
- Max response body size configurable (`http_client.max_body_bytes`)
- Requires `tools.http_client: true` in config to enable
---
## 9. Token Counter
**Tool:** `count_tokens`
Parameters:
- `text` — string to count
- `model` — optional model name to select tokenizer (default: configured primary model)
Returns: token count, estimated cost (input only), tokenizer used
- Uses tiktoken-compatible tokenizer; falls back to char/4 estimate if model unknown
---
## 10. Text Diff
**Tool:** `diff_text`
Parameters:
- `a` — original text
- `b` — revised text
- `format``unified` | `side_by_side` | `summary` (default: `summary`)
Returns: diff output and change stats (lines added, removed, unchanged)
---
## 11. Data Parser
**Tool:** `parse_data`
Parameters:
- `input` — raw string
- `format``json` | `csv` | `xml` | `toml` | `yaml`
- `query` — optional JSONPath / XPath / key path to extract a subset
Returns: parsed structure as JSON, or extracted value if query provided
---
## 12. Template Renderer
**Tool:** `render_template`
Parameters:
- `template` — Go `text/template` or Mustache string
- `vars` — key/value map of variables to inject
Returns: rendered string
Use case: build dynamic prompts or messages from stored templates and runtime context.
---
## 13. Secret Retrieval
**Tool:** `get_secret`
Parameters:
- `name` — secret name as configured in AMCS secrets store
Returns: secret value (transmitted but never logged by server)
- Secrets defined in config or environment under `secrets.*`
- Access requires authenticated token with `secrets:read` scope
- Audit log entry written on every retrieval (name, caller, timestamp — not value)
---
## 14. Webhook Trigger
**Tool:** `trigger_webhook`
Parameters:
- `name` — named webhook as configured in `webhooks.*`
- `payload` — optional JSON body to merge with default payload
Returns: HTTP status from target, latency (ms)
- Named webhooks defined in config (URL + default headers/auth) — agents never see the raw URL or credentials
- Supports n8n, Zapier, generic HTTP targets
---
## 15. Summarize / Compress
**Tool:** `summarize`
Parameters:
- `text` — content to summarize
- `max_tokens` — target output length in tokens (default: 200)
- `style``bullets` | `paragraph` | `headline` (default: `paragraph`)
- `focus` — optional hint string ("focus on action items", "focus on errors")
Returns: summarized text, input token count, output token count
- Calls the configured LLM; billed against the AMCS service account
- Useful for compressing large file loads or long thought chains before embedding
---
## 16. Chunk Text
**Tool:** `chunk_text`
Parameters:
- `text` — content to split
- `chunk_size` — max tokens per chunk (default: 512)
- `overlap` — token overlap between chunks (default: 50)
- `model` — tokenizer model (default: configured primary)
Returns: array of chunk strings with index and token count per chunk
---
## 17. Schedule Action
**Tool:** `schedule_action`
Parameters:
- `run_at` — ISO 8601 datetime or cron expression
- `tool` — tool name to invoke
- `args` — arguments for the tool
- `label` — optional human-readable label
Returns: scheduled action ID
**Tool:** `list_scheduled_actions`
- Returns pending actions: id, label, tool, run_at, status
**Tool:** `cancel_scheduled_action`
- Parameter: `id`
---
## Implementation Notes
**Context & read-only tools (17):**
- All read-only, no write permissions required
- GeoIP opt-in via config (`core_tools.geoip: true`), offline MaxMind GeoLite2 DB
- Credential expiry checks never log or return secret values — metadata only
- `get_health` callable unauthenticated (mirrors existing `/health` endpoint)
- `get_changes` is query-heavy — 30s cache TTL to avoid repeated aggregation
**Agent utility tools (817):**
- `http_request` disabled by default; requires `tools.http_client: true` in config; private IP ranges blocked
- `get_secret` requires `secrets:read` token scope; every retrieval audit-logged (name + caller only)
- `trigger_webhook` never exposes raw URLs or credentials to agents — named config only
- `summarize` and `chunk_text` call the configured LLM; usage billed to AMCS service account
- `schedule_action` backed by the existing cron/trigger infrastructure
- All tools registered under the `core` category in `describe_tools`