# agntz — full documentation corpus

Generated from /docs. Every section below corresponds to a page on the site;
the heading hierarchy is preserved verbatim.

For per-page raw markdown, append `.md` to any docs URL
(e.g. `/docs/quickstart.md`).


<!-- ============================================================== -->
<!-- Get started -->
<!-- ============================================================== -->

<!-- source: /docs -->

# Introduction

**agntz** is an open-source agent framework where agents are declared as YAML — not code — and run unchanged in three places: embedded in your app (`@agntz/sdk` for TypeScript, `agntz` for Python), on the hosted cloud (`agntz.co`), or on infrastructure you control (self-host). Every run is traced. Every save is a version. Bring your own model keys.

These docs are optimized for both humans and LLMs. Every page is also available as raw markdown — see the **Copy** button at the top of each page, or fetch [/llms.txt](/llms.txt) for the full corpus.

## What you can build

- **Single-call agents** — an LLM with an instruction, optional tools, optional structured output.
- **Pipelines** — sequential and parallel agents that compose other agents into multi-step workflows with loops and conditionals.
- **Tool agents** — deterministic function calls with no LLM in the loop.
- **Long-running conversations** — sessions persist message history across calls.
- **Streaming UIs** — full event stream (tokens, tool calls, replies) over Server-Sent Events.
- **Multi-tenant products** — every record is user-scoped on the hosted edition.

Three things stay the same as you scale from your laptop to production:

1. **The YAML schema.** One `manifest.yaml` runs in embedded mode, hosted mode, and self-hosted mode.
2. **The client API.** `client.agents.run(...)` — the same resource shape in TypeScript and Python, with language-native argument names.
3. **The observability model.** Runs, spans, and traces work identically in every edition.

## Choose your starting point

| If you want to… | Use | Read |
|---|---|---|
| Run an agent on your laptop in 60 seconds | `@agntz/sdk` or `agntz` | [Quickstart](/docs/quickstart) |
| Build agents from the terminal | `agntz` CLI | [CLI getting started](/docs/cli-quickstart) |
| Author and run agents in a hosted UI | agntz.co | [Hosted cloud](/docs/deploy/hosted-cloud) |
| Call hosted agents from your backend | `@agntz/client` or `AgntzClient` | [Hosted client](/docs/sdk-cli/client) |
| Deploy your own hosted stack | Docker / Vercel + Railway | [Self-host](/docs/deploy/self-host-production) |

## Install

```bash {group=intro-install select=ts}
# Embedded: run agents in-process from YAML files
pnpm add @agntz/sdk

# Hosted client: call agents on agntz.co or your own worker
pnpm add @agntz/client

# Optional persistence for embedded mode
pnpm add @agntz/store-sqlite

# CLI (run via npx or install globally)
npm i -g @agntz/sdk
```

```bash {group=intro-install select=python}
# Embedded local SDK + hosted client
pip install agntz

# Local model execution through LiteLLM
pip install "agntz[litellm]"
```

Node 20+ for TypeScript. Python 3.11+ for Python. `@agntz/client` is universal across browser, Node, and edge runtimes; embedded SDKs read YAML from disk and run in your process.

Set the provider API key your agents will use:

```bash
export OPENAI_API_KEY=sk-...
# or ANTHROPIC_API_KEY=sk-ant-...
# or GOOGLE_GENERATIVE_AI_API_KEY=...
# or OPENROUTER_API_KEY=sk-or-...   # 300+ models incl. open-source via one key
```

agntz calls providers directly with your key — no proxy, no data routing. **OpenRouter** is available as a meta-provider when you want access to many models (Anthropic, Google, Meta, DeepSeek, open-source) with a single API key — use `provider: openrouter` and a slug like `anthropic/claude-sonnet-4` or `meta-llama/llama-3.3-70b-instruct`.

## Where to go next

- **New here?** Start with the [Quickstart](/docs/quickstart).
- **Prefer the terminal?** Jump to [CLI getting started](/docs/cli-quickstart).
- **Want the big picture?** Read [Defining agents](/docs/concepts/agents) and [The four agent kinds](/docs/concepts/agent-kinds).
- **Looking for a specific field?** The [Schema](/docs/schema/common-fields) section is the complete reference.


<!-- source: /docs/quickstart -->

# Quickstart

The fastest path: write a YAML file, point an agntz SDK at the directory, and call it. No server, no signup, no infrastructure. The YAML is shared between TypeScript and Python; the client code follows each language's conventions.

## Install

```bash {group=quickstart-install select=ts}
pnpm add @agntz/sdk
export ANTHROPIC_API_KEY=sk-ant-...     # or OPENAI_API_KEY, OPENROUTER_API_KEY, etc.
```

```bash {group=quickstart-install select=python}
pip install "agntz[litellm]"
export ANTHROPIC_API_KEY=sk-ant-...     # or OPENAI_API_KEY, OPENROUTER_API_KEY, etc.
```

See [Models & providers](/docs/models) for the full list of supported providers.

## 1. Create an agent

```yaml [agents/support.yaml]
id: support
kind: llm
model:
  provider: anthropic
  name: claude-sonnet-4-6
instruction: |
  You are a friendly customer support agent. Answer concisely.

  {{userQuery}}
```

The agent's `id` is how you'll address it from code. `kind: llm` means a single model call. With no `inputSchema`, the agent takes a plain string, accessible in templates as `{{userQuery}}`.

## 2. Run it

```ts [index.ts] {group=quickstart-run}
import { agntz } from "@agntz/sdk";

const client = await agntz({ agents: "./agents" });

const result = await client.agents.run({
  agentId: "support",
  input: "How do I reset my password?",
});

console.log(result.output);
```

```python [main.py] {group=quickstart-run}
from agntz import LiteLLMModelProvider, agntz

client = agntz(
    agents="./agents",
    model_provider=LiteLLMModelProvider(),
)

result = client.agents.run(
    agent_id="support",
    input="How do I reset my password?",
)

print(result.output)
```

```bash {group=quickstart-command select=ts}
# Type stripping requires Node 22.6 or newer
node --experimental-strip-types index.ts
```

```bash {group=quickstart-command select=python}
python main.py
```

That's it. The SDK parses every `.yaml` file under `./agents`, validates it against the schema, registers it with the runtime, and exposes the same `client.agents.run`, `client.runs.list`, and `client.traces.get` surface as the hosted client.

## 3. Stream or inspect

```ts {group=quickstart-stream}
for await (const event of client.agents.stream({
  agentId: "support",
  input: "Walk me through password reset",
})) {
  if (event.type === "text-delta") process.stdout.write(event.text);
  if (event.type === "complete") console.log("\n— done");
}
```

```python {group=quickstart-stream}
for event in client.agents.stream(
    agent_id="support",
    input="Walk me through password reset",
):
    if event.type == "complete":
        print(event.output)
```

TypeScript local runs expose token deltas for LLM streaming today. Python local runs currently expose only start and complete events; the hosted Python client streams the worker's SSE events.

## 4. Use the same call against the hosted cloud later

When you need what embedded mode does not provide (durable run history, multi-user isolation, an agent management UI), switch constructors and keep the same resource shape:

```diff {group=quickstart-hosted}
- import { agntz } from "@agntz/sdk";
+ import { AgntzClient } from "@agntz/client";

- const client = await agntz({ agents: "./agents" });
+ const client = new AgntzClient({
+   apiKey: process.env.AGNTZ_API_KEY!,
+   baseUrl: "https://api.agntz.co",
+ });
```

```python {group=quickstart-hosted}
import os
from agntz import AgntzClient

client = AgntzClient(
    api_key=os.environ["AGNTZ_API_KEY"],
    base_url="https://api.agntz.co",
)
```

The `agents.run`, `runs.list`, and `traces.get` calls work across local and hosted clients. YAML manifests move to the hosted registry; in-process local tools become MCP servers or HTTP endpoints.

## Next steps

- **Add structured I/O.** Declare an [`inputSchema` and `outputSchema`](/docs/schema/input-state-output) to type-check the agent's contract.
- **Add tools.** Wire up [HTTP](/docs/tools/http), [MCP](/docs/tools/mcp), or [local](/docs/tools/local) tools.
- **Chain agents.** Compose multi-step workflows with [sequential and parallel pipelines](/docs/concepts/agent-kinds).
- **Persist sessions.** Use SQLite for [durable conversation history](/docs/concepts/sessions).


<!-- source: /docs/cli-quickstart -->

# CLI getting started

Use the `agntz` CLI when you want to create a YAML agent, edit it in your repo, and run it locally from the terminal. This is the fastest path for a human or coding agent to add an agent to an existing codebase.

The first workflow is local. Hosted cloud comes later.

## Install

```bash
# Run on demand
npx @agntz/sdk --help

# Or install the agntz executable globally
npm i -g @agntz/sdk
agntz --help
```

The CLI is published by the `@agntz/sdk` package. The executable name is `agntz`.

## 1. Create an agent YAML

```bash
mkdir -p agents
agntz create "Answer customer support questions in a concise, practical tone." -o ./agents/support.yaml
```

`create` calls the hosted agent-builder and writes a portable YAML manifest. It does not require login.

After generation, inspect the file:

```bash
sed -n '1,220p' ./agents/support.yaml
```

The important fields are:

| Field | Why it matters |
|---|---|
| `id` | The name used by the CLI, SDK, and hosted client. |
| `kind` | The agent shape, such as `llm`, `tool`, `sequential`, or `parallel`. |
| `model` | The provider and model used for local LLM calls. |
| `instruction` / `prompt` | The behavior and input template. |
| `tools` / `resources` | Runtime capabilities the agent expects. |

## 2. Edit or iterate

You can edit YAML directly, or ask the builder to revise the existing manifest:

```bash
agntz create "Revise this support agent so it asks one clarifying question when the request is ambiguous." \
  --current-manifest ./agents/support.yaml \
  -o ./agents/support.yaml
```

Use direct YAML edits for exact IDs, model changes, prompts, schemas, and tool wiring. Use `--current-manifest` when you want a generated structural change.

## 3. Run locally

Set the provider key required by the manifest's `model.provider`, then run the YAML file:

```bash
export OPENAI_API_KEY=sk-...
agntz run ./agents/support.yaml --input "How do I reset my password?"
```

The CLI treats a target as local when it is a file path, starts with `./`, contains a slash, or ends in `.yaml` / `.yml`.
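
As a rough illustration of that rule (not the CLI's actual source), the check can be sketched in Python. The function name `is_local_target` is hypothetical, and reading "is a file path" as "exists on disk" is an assumption.

```python
import os


def is_local_target(target: str) -> bool:
    """Guess whether a CLI target names a local manifest (hypothetical helper).

    Mirrors the documented rule: an existing path, a "./" prefix, any
    slash, or a .yaml/.yml suffix means local. Anything else would be
    treated as a hosted agent id.
    """
    if os.path.exists(target):  # assumption: "is a file path" means it exists on disk
        return True
    if target.startswith("./") or "/" in target:
        return True
    return target.endswith((".yaml", ".yml"))
```

Under this sketch, `./agents/support.yaml` and `agents/support` resolve as local, while a bare id such as `support` falls through to hosted unless a file with that name exists in the working directory.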

Useful local run variants:

```bash
# Stream events
agntz run ./agents/support.yaml --input "Walk me through password reset" --stream

# Pipe stdin
printf "Summarize this support ticket" | agntz run ./agents/support.yaml

# Keep a conversation session
agntz run ./agents/support.yaml --session local-user-42 --input "My email is wrong"
agntz run ./agents/support.yaml --session local-user-42 --input "What did I just tell you?"

# Run a directory only when it contains one manifest
agntz run ./agents --input "Hello"
```

Input precedence is `--input`, then trailing positional text, then piped stdin, then an empty string.
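
The precedence can be pictured as a first-match scan. This is a minimal sketch, not the CLI's code: `resolve_input` is a hypothetical name, and treating `None` as "not provided" is an assumption.

```python
from typing import Optional


def resolve_input(
    flag_input: Optional[str],
    positional: Optional[str],
    stdin_text: Optional[str],
) -> str:
    """Pick the run input using the documented precedence (hypothetical helper).

    Order: --input flag, trailing positional text, piped stdin, then "".
    Assumption: None means the source was not provided at all.
    """
    for candidate in (flag_input, positional, stdin_text):
        if candidate is not None:
            return candidate
    return ""
```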

## 4. Call the agent from your service

Use the CLI to create and smoke-test the YAML. Use `@agntz/sdk` from service code when the agent needs local tools, resource providers, durable stores, or app-specific runtime context.

```bash
pnpm add @agntz/sdk
```

```ts [index.ts]
import { agntz, tool, z } from "@agntz/sdk";

const client = await agntz({
  agents: "./agents",
  tools: [
    tool({
      name: "lookup_order",
      description: "Look up an order by ID",
      input: z.object({ orderId: z.string() }),
      execute: async ({ orderId }) => {
        return { orderId, status: "shipped" };
      },
    }),
  ],
});

const result = await client.agents.run({
  agentId: "support",
  input: { userQuery: "Where is order 123?" },
  sessionId: "user-42",
});

console.log(result.output);
```

The terminal CLI can load local YAML and run HTTP/MCP/LLM-only agents. It cannot register arbitrary in-repo local tool handlers by itself; those handlers live in `agntz({ tools: [...] })` in your application code.

## 5. Optional hosted invocation

When you have an agent saved in hosted agntz, log in and run by id:

```bash
agntz login --key ar_live_...
agntz run support --input "Hello from the hosted runtime"
```

A bare target like `support` is treated as hosted. Force hosted mode with `--remote`; force local mode with `--local`.

Hosted service code uses `@agntz/client`:

```ts
import { AgntzClient } from "@agntz/client";

const client = new AgntzClient({
  apiKey: process.env.AGNTZ_API_KEY!,
  baseUrl: "https://api.agntz.co",
});

const result = await client.agents.run({
  agentId: "support",
  input: "Hello",
});
```

## LLM operator recipe

If you are asking Claude Code, Codex, or another coding agent to use agntz in a repo, give it this sequence:

```text
Use agntz locally first.
1. Check whether this repo already has an agents/ directory.
2. Install or invoke the CLI from @agntz/sdk.
3. Create or update ./agents/<agent-id>.yaml with agntz create.
4. Inspect the YAML and make direct edits for ids, prompts, schemas, models, tools, and resources.
5. Run the YAML with agntz run ./agents/<agent-id>.yaml --input "...".
6. If the agent needs local code tools or resource providers, add @agntz/sdk service code and pass tools/resources to agntz(...).
7. Treat hosted login and hosted run management as optional follow-up work.
```

## Current CLI boundary

The current CLI supports `create`, `run`, `login`, `logout`, `whoami`, `runs`, and `traces`.

It does not currently provide project scaffolding, eval execution, validation-only execution, an interactive playground, or a Studio launcher. If an older README mentions commands such as `init`, `invoke`, `validate`, `eval`, or `playground`, prefer this page and the [CLI reference](/docs/sdk-cli/cli).

## Next steps

- **[CLI reference](/docs/sdk-cli/cli)** — every command and flag.
- **[Embedded SDK](/docs/sdk-cli/sdk)** — run agents from TypeScript or Python service code.
- **[Defining agents](/docs/concepts/agents)** — understand and edit the generated YAML.
- **[Local tools](/docs/tools/local)** — wire in-process tool handlers from your service.


<!-- ============================================================== -->
<!-- Concepts -->
<!-- ============================================================== -->

<!-- source: /docs/concepts/agents -->

# Defining agents

Agents are declared in YAML manifests. The file's `id` is the agent's identifier; `kind` selects one of four agent types. The same manifest runs unchanged in embedded mode, hosted mode, and self-hosted mode.

## Anatomy of a manifest

```yaml [agents/sentiment-analyzer.yaml]
id: sentiment-analyzer            # required, unique within the registry
name: Sentiment Analyzer          # optional, display label
description: Tags text positive/negative/neutral
kind: llm                         # llm | tool | sequential | parallel

inputSchema:                      # optional — what the agent expects
  text: string

model:                            # required for kind: llm
  provider: openai
  name: gpt-5.4-nano
  temperature: 0

instruction: |                    # required for kind: llm — the system prompt
  Analyze the sentiment of the following text and respond with a JSON object.

  Text: {{text}}

outputSchema:                     # optional — what the model must return
  sentiment:
    type: string
    enum: [positive, negative, neutral]
  confidence: number
```

A manifest is just data — there's no code to maintain alongside it. The runner validates it on load, registers it with the runtime, and exposes it through the same `client.agents.run` API regardless of where it runs.

## Read next

- **[The four agent kinds](/docs/concepts/agent-kinds)** — `llm`, `tool`, `sequential`, `parallel`, with examples of each.
- **[Common fields](/docs/schema/common-fields)** — `id`, `name`, `kind`, and the fields every agent shares.
- **[Input, state, and output](/docs/schema/input-state-output)** — how data flows into and out of an agent.
- **[Templates and conditions](/docs/schema/templates-conditions)** — the `{{}}` mini-language used in instructions, params, and `when`/`until`.


<!-- source: /docs/concepts/agent-kinds -->

# The four agent kinds

Every manifest sets `kind` to one of four values. **Primitives** (`llm`, `tool`) do the actual work; **pipelines** (`sequential`, `parallel`) compose primitives and other pipelines into multi-step workflows.

## llm — call a model

The basic case. An LLM agent calls one language model with an instruction, optional tools, and optional structured output.

```yaml [agents/chatbot.yaml]
id: chatbot
name: Chatbot
description: A simple conversational assistant
kind: llm

model:
  provider: openai
  name: gpt-5.4-mini
  temperature: 0.7

instruction: |
  You are a friendly, helpful assistant. Answer the user's question clearly and concisely.

  {{userQuery}}
```

With no `inputSchema`, the agent takes a plain string accessible as `{{userQuery}}`. Add `inputSchema` to type the input; add `outputSchema` to constrain the response shape. See [Input, state, and output](/docs/schema/input-state-output).

## tool — deterministic, no model

A tool agent maps state values to a single tool call. No LLM, no reasoning — just a direct function or API call wrapped in the same observability model.

```yaml [agents/send-email.yaml]
id: send-email
kind: tool

inputSchema:
  recipientEmail: string
  emailSubject: string
  emailBody: string

tool:
  kind: mcp
  server: https://email-api.example.com/mcp
  name: send_email
  params:
    to: "{{recipientEmail}}"
    subject: "{{emailSubject}}"
    body: "{{emailBody}}"
```

Use this when you want the predictability of a hard-coded call inside a larger pipeline — for example, a "notify Slack" step at the end of an article workflow.

## sequential — run steps in order

Sequential agents run `steps` one after another. Each step's output is added to state under its `id` (or its `stateKey`) and becomes available to downstream steps as `{{stepId.property}}`.

```yaml [agents/research-and-summarize.yaml]
id: research-and-summarize
kind: sequential

inputSchema:
  userQuery: string

steps:
  - ref: researcher
    input:
      query: "{{userQuery}}"

  - agent:
      id: summarizer
      kind: llm
      model: { provider: openai, name: gpt-5.4 }
      instruction: |
        Summarize this research: {{researcher}}
      outputSchema:
        summary: string

output:
  summary: "{{summarizer.summary}}"
  sourceResearch: "{{researcher}}"
```

Use `ref:` to point at an existing agent id; use `agent:` to inline one. Both have full access to the same state object.
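
To make the state lookup concrete, here is a minimal Python sketch of how a `{{stepId.property}}` reference could resolve against state. It is not the agntz template engine: `lookup`, `render`, the regex, and the choice to render missing values as an empty string are all assumptions for illustration.

```python
import re
from typing import Any, Dict

# Matches {{name}} or {{name.property}} with optional inner whitespace.
_PLACEHOLDER = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def lookup(state: Dict[str, Any], path: str) -> Any:
    """Walk a dotted path such as "summarizer.summary" through nested dicts."""
    value: Any = state
    for key in path.split("."):
        value = value[key] if isinstance(value, dict) and key in value else None
        if value is None:
            break
    return value


def render(template: str, state: Dict[str, Any]) -> str:
    """Replace each {{path}} with its value from state.

    Assumption: missing values render as an empty string.
    """
    def substitute(match: "re.Match[str]") -> str:
        value = lookup(state, match.group(1))
        return "" if value is None else str(value)

    return _PLACEHOLDER.sub(substitute, template)


# State as it might look after both steps of research-and-summarize ran.
state = {
    "userQuery": "MCP servers",
    "researcher": "raw notes",
    "summarizer": {"summary": "MCP in brief"},
}
rendered = render("{{summarizer.summary}} (from: {{researcher}})", state)
```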

### Looping

Add `until` to make a sequential agent loop. `maxIterations` is the safety stop.

```yaml
id: write-review-loop
kind: sequential

until: "{{reviewer.approved}} == true"
maxIterations: 5

steps:
  - ref: writer
    input:
      topic: "{{topic}}"
      feedback: "{{reviewer.feedback}}"   # null on first iteration
  - ref: reviewer
    input:
      draft: "{{writer.draft}}"
```

See [Pipeline steps and looping](/docs/schema/pipeline-steps) for the full reference on `when`, `until`, and `stateKey`.
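
The loop semantics can be sketched in plain Python. This is an illustration under stated assumptions, not the runtime's implementation: `run_loop`, `writer`, and `reviewer` are hypothetical stand-ins, and checking the `until` condition once after each full pass is an assumption.

```python
from typing import Any, Callable, Dict, List, Tuple

State = Dict[str, Any]
Step = Tuple[str, Callable[[State], Any]]  # (step id, function of current state)


def run_loop(
    steps: List[Step],
    until: Callable[[State], bool],
    max_iterations: int,
    initial: State,
) -> Tuple[State, int]:
    """Run every step in order, repeating until `until(state)` holds.

    Assumption: the condition is checked after each full pass, and
    max_iterations caps the number of passes.
    """
    state = dict(initial)
    iterations = 0
    while iterations < max_iterations:
        for step_id, fn in steps:
            state[step_id] = fn(state)  # output stored under the step id
        iterations += 1
        if until(state):
            break
    return state, iterations


# Toy stand-ins for the writer and reviewer agents.
def writer(state: State) -> Dict[str, Any]:
    feedback = (state.get("reviewer") or {}).get("feedback")  # None on the first pass
    version = 1 if feedback is None else state["writer"]["version"] + 1
    return {"draft": f"draft v{version}", "version": version}


def reviewer(state: State) -> Dict[str, Any]:
    approved = state["writer"]["version"] >= 3
    return {"approved": approved, "feedback": None if approved else "tighten the intro"}


final_state, passes = run_loop(
    steps=[("writer", writer), ("reviewer", reviewer)],
    until=lambda s: s["reviewer"]["approved"] is True,
    max_iterations=5,
    initial={"topic": "YAML agents"},
)
```

Here the toy reviewer approves the third draft, so the loop stops after three passes, well under the cap of five.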

## parallel — run branches simultaneously

Parallel agents run `branches` at the same time and merge their outputs into state.

```yaml
id: text-analysis
kind: parallel

inputSchema:
  text: string

branches:
  - ref: sentimentAnalyzer
    input: { text: "{{text}}" }
  - ref: entityExtractor
    input: { text: "{{text}}" }
```

If you don't declare an `output:`, the result is the merged state — `{ sentimentAnalyzer, entityExtractor }`.
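
The merge behaviour can be pictured with a small Python sketch. It uses threads as a stand-in for concurrent branches; `run_parallel` and the two toy branch functions are hypothetical and only illustrate the "one key per branch id" shape of the merged state.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict

Branch = Callable[[Dict[str, Any]], Any]


def run_parallel(branches: Dict[str, Branch], state: Dict[str, Any]) -> Dict[str, Any]:
    """Run every branch concurrently and merge outputs under each branch id."""
    with ThreadPoolExecutor(max_workers=len(branches)) as pool:
        # Each branch gets its own copy of the input state.
        futures = {bid: pool.submit(fn, dict(state)) for bid, fn in branches.items()}
        return {bid: future.result() for bid, future in futures.items()}


merged = run_parallel(
    {
        # Toy branches standing in for real agents.
        "sentimentAnalyzer": lambda s: {
            "sentiment": "positive" if "love" in s["text"] else "neutral"
        },
        "entityExtractor": lambda s: {
            "entities": [w for w in s["text"].split() if w[0].isupper()]
        },
    },
    {"text": "we love the new Berlin office"},
)
```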

## Putting it together

Pipelines nest. A single manifest can mix parallel research, a write/review loop, and a tool-call notification:

```yaml [agents/article-pipeline.yaml]
id: article-pipeline
kind: sequential

inputSchema:
  topic: string
  tone:
    type: string
    default: professional

steps:
  # Step 1: research in parallel
  - agent:
      id: research-phase
      kind: parallel
      stateKey: research
      branches:
        - ref: web-researcher
          input: { query: "{{topic}}" }
        - ref: academic-researcher
          input: { query: "{{topic}}" }

  # Step 2: write + review until approved
  - agent:
      id: write-review
      kind: sequential
      stateKey: writing
      until: "{{editor.approved}} == true"
      maxIterations: 3
      steps:
        - ref: writer
          input:
            topic: "{{topic}}"
            tone: "{{tone}}"
            webResearch: "{{research.webResearcher}}"
            academicResearch: "{{research.academicResearcher}}"
            feedback: "{{editor.feedback}}"
        - ref: editor
          input: { draft: "{{writer.draft}}" }

  # Step 3: notify
  - agent:
      id: notify
      kind: tool
      tool:
        kind: local
        name: send_slack
        params:
          channel: "#content"
          message: "New article ready: {{topic}}"

output:
  article: "{{writing.writer.draft}}"
  review: "{{writing.editor}}"
```

The trace for this run shows nested spans for the parallel research phase, every loop iteration of the write/review step, and the final tool call. See [Runs and traces](/docs/concepts/runs-and-traces).


<!-- source: /docs/concepts/sessions -->

# Sessions

A **session** persists conversation history between an agent and its caller. Pass the same session id across runs and the runtime auto-loads prior messages, appends the new exchange, and forwards the transcript to the model.

## Calling with a session id

```ts {group=sessions-run}
await client.agents.run({ agentId: "support", input: "Hi", sessionId: "user-42" });
await client.agents.run({ agentId: "support", input: "follow-up", sessionId: "user-42" });
```

```python {group=sessions-run}
client.agents.run(agent_id="support", input="Hi", session_id="user-42")
client.agents.run(agent_id="support", input="follow-up", session_id="user-42")
```

The two calls share history. The second call sees the first turn in the model's context window automatically — no manual history management required.

Sessions are agent-scoped: `(agentId, sessionId)` in TypeScript and `(agent_id, session_id)` in Python. Two agents with the same session id keep independent histories.
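
A toy model of that scoping, assuming nothing about the real store beyond the composite key: `InMemorySessions` below is a hypothetical class, not part of the agntz package.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Message = Dict[str, str]


class InMemorySessions:
    """Toy message store keyed by (agent_id, session_id); not the agntz store."""

    def __init__(self) -> None:
        self._messages: Dict[Tuple[str, str], List[Message]] = defaultdict(list)

    def append(self, agent_id: str, session_id: str, role: str, content: str) -> None:
        self._messages[(agent_id, session_id)].append({"role": role, "content": content})

    def history(self, agent_id: str, session_id: str) -> List[Message]:
        # .get avoids creating an empty entry for unknown keys.
        return list(self._messages.get((agent_id, session_id), []))


sessions = InMemorySessions()
sessions.append("support", "user-42", "user", "Hi")
sessions.append("billing", "user-42", "user", "Where is my invoice?")
```

Both agents were called with `user-42`, yet each history holds only its own message.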

## Storage

### Embedded

Sessions live **in memory** by default. They survive within the process but are lost on restart. To keep them across restarts, pass a SQLite store:

```ts {group=sessions-store}
import { agntz } from "@agntz/sdk";
import { sqliteStore } from "@agntz/sdk/sqlite";

const client = await agntz({
  agents: "./agents",
  store: sqliteStore("./agntz.db"),
});
```

```python {group=sessions-store}
from agntz import LiteLLMModelProvider, SQLiteStore, agntz

client = agntz(
    agents="./agents",
    store=SQLiteStore("./agntz.db"),
    model_provider=LiteLLMModelProvider(),
)
```

The same store also backs runs and traces, so durability extends across the local SDK surface.

### Hosted

Sessions are stored in Postgres and scoped to the authenticated user. They survive restarts, redeploys, and SDK reconnects. No configuration needed — pass any session id string you want.

## Reading local session messages

```ts {group=sessions-read}
import { agntz } from "@agntz/sdk";
import { sqliteStore } from "@agntz/sdk/sqlite";

const store = sqliteStore("./agntz.db");
const client = await agntz({ agents: "./agents", store });

const messages = await store.getMessages("user-42");
```

```python {group=sessions-read}
messages = client.sessions.get_messages("user-42")
for message in messages:
    print(message.role, message.content)
```

## What's in a session

A session record holds:

- **Messages** — every user input and assistant output for this session.
- **Reply events** — intermediate messages emitted via the `reply` tool where supported.
- **Last run reference** — hosted storage can use the most recent run id for fast trace lookup from a session.

Sessions do not persist agent state such as `{{stepId.property}}`. State is per-run and discarded once the run ends; messages are the durable surface.

## Patterns

- **One session per user.** Pass the user's stable id and let the runtime track every interaction.
- **One session per topic.** Mint ids such as `user-42:billing` so the same user can have multiple parallel conversations.
- **Anonymous trials.** Generate a session id client-side and pass it through until the user signs up; then re-key sessions to the new user id at signup time.


<!-- source: /docs/concepts/context-and-resources -->

# Context and resources

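agntz has two different "context" surfaces: `context` and `contextIds`. They solve different problems and should not be used interchangeably. The table below lists them next to the related session, state, and run concepts.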

| Surface | What it means | Who can set it | What sees it |
|---|---|---|---|
| `context` | Runtime namespace grants for resources such as memory, RAG, and files | Trusted application or worker code | Resource providers and tool context |
| `contextIds` | Legacy scratchpad bucket ids backed by `ContextStore` | Application code invoking the runner | The prompt, as injected scratchpad text |
| Session | Conversation message history | Runtime and client through `sessionId` | The model conversation |
| State | One pipeline invocation's working object | Manifest execution | Pipeline steps and templates |
| Run | One invocation record with status and events | Runtime | Runs API and traces |

Use `context` for access control boundaries. Use `contextIds` only when you explicitly want the older shared scratchpad behavior.

## Namespace grants

A namespace grant is a plain path string that says which branch of a resource tree this run may access.

```ts {group=context-run}
await client.agents.run({
  agentId: "support-with-memory",
  input: "Remember that I prefer metric units.",
  context: ["app/user/" + userId],
});
```

```python {group=context-run}
client.agents.run(
    agent_id="support-with-memory",
    input="Remember that I prefer metric units.",
    context=[f"app/user/{user_id}"],
)
```

Grant strings are intentionally strict:

- No leading or trailing slash.
- No empty path segments.
- No `.` or `..` traversal segments.
- No wildcards.
- No whitespace in any segment.
- Duplicates are removed after validation.

The model should never be asked to choose a namespace. Trusted code mints grants from authenticated facts such as user id, workspace id, tenant id, or service identity. Resource providers receive normalized grants through `ResourceToolContext.grants`.
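
The strictness rules above can be sketched as a validator. This is a minimal illustration, not the runtime's normalizer: `normalize_grants` is a hypothetical name, and treating `*` as the wildcard character is an assumption.

```python
from typing import Iterable, List


def normalize_grants(grants: Iterable[str]) -> List[str]:
    """Validate namespace grants and drop duplicates, keeping first-seen order."""
    seen = set()
    result: List[str] = []
    for grant in grants:
        if not grant or grant.startswith("/") or grant.endswith("/"):
            raise ValueError(f"grant must not be empty or start/end with '/': {grant!r}")
        for segment in grant.split("/"):
            if segment == "":
                raise ValueError(f"empty path segment in {grant!r}")
            if segment in (".", ".."):
                raise ValueError(f"traversal segment in {grant!r}")
            if "*" in segment:  # assumption: '*' is the wildcard character
                raise ValueError(f"wildcard in {grant!r}")
            if any(ch.isspace() for ch in segment):
                raise ValueError(f"whitespace in {grant!r}")
        if grant not in seen:  # duplicates are removed only after validation
            seen.add(grant)
            result.append(grant)
    return result
```

Rejecting malformed grants outright, rather than repairing them, keeps a typo from silently widening access.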

## Narrow-only propagation

Child invocations inherit the parent's grants unless trusted code requests a narrower descendant grant.

```ts
await ctx.invoke("account-helper", "Check invoice details", {
  context: ["app/user/" + userId + "/billing"],
});
```

That is allowed only when the parent already has `app/user/<id>` or another ancestor of the requested child grant. A child cannot widen from `app/user/u_123/billing` to `app/user/u_123`, and it cannot jump sideways to `app/user/u_456`.
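
The narrow-only rule reduces to a segment-wise prefix check. The sketch below is illustrative only: `is_within` and `can_delegate` are hypothetical names, and allowing a request equal to a parent grant (plain inheritance) is an assumption.

```python
from typing import Iterable


def is_within(parent: str, requested: str) -> bool:
    """True when `requested` equals `parent` or is a descendant path of it.

    Comparing whole segments keeps "app/user/u_1" from matching "app/user/u_12".
    """
    parent_parts = parent.split("/")
    requested_parts = requested.split("/")
    return requested_parts[: len(parent_parts)] == parent_parts


def can_delegate(parent_grants: Iterable[str], requested: str) -> bool:
    """A child grant is allowed only if some parent grant covers it."""
    return any(is_within(parent, requested) for parent in parent_grants)
```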

## Resources

Resources are named runtime capabilities declared in an LLM agent manifest. A declaration says which provider kind the agent wants, whether it needs read or write tools, and any provider-specific config.

```yaml
id: support-with-memory
kind: llm
model:
  provider: openai
  name: gpt-5.4
instruction: |
  Help the user. Use memory only when it is relevant.
resources:
  memory:
    mode: read-write
    autoScan: true
```

The runner looks up a provider by resource `kind`. If `kind` is omitted, it defaults to the resource name. Provider tools are exposed to the model with names like `memory_read` and `memory_write`.

The resource declaration does not grant access by itself. Access comes from the run's `context` grants.

## Resource provider lifecycle

For each run, the runtime:

1. Validates and normalizes `context` namespace grants.
2. Resolves the agent's `resources:` declarations against registered providers.
3. Lets providers inject extra context, such as a list of visible memory topics.
4. Registers provider tools for the model with generated names.
5. Passes `ResourceToolContext` to each provider tool call.

Resource providers must still validate every read and write against the grants they receive. Namespace paths are capabilities, not suggestions.

## Read versus read-write

Resources support two modes:

| Mode | Behavior |
|---|---|
| `read` | The model receives only read-safe provider tools. |
| `read-write` | The model may receive read and write provider tools. |

If a parent invocation runs a resource in `read` mode, child invocations cannot widen it back to `read-write`. This keeps delegated work inside the parent's access boundary.
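
One way to picture the mode rule is as taking the narrower of the two modes. `effective_mode` below is a hypothetical helper written for illustration, not a function from the SDK.

```python
def effective_mode(parent_mode: str, child_mode: str) -> str:
    """Combine a parent's resource mode with a child's declared mode.

    "read" is the narrower mode, so it wins whenever either side asks for it.
    """
    allowed = {"read", "read-write"}
    if parent_mode not in allowed or child_mode not in allowed:
        raise ValueError("mode must be 'read' or 'read-write'")
    return "read-write" if parent_mode == child_mode == "read-write" else "read"
```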

## Legacy scratchpad context

`contextIds` are the older shared scratchpad API. When you pass them, the runner loads text entries from `ContextStore` and injects them into the prompt.

```ts
await runner.invoke("researcher", "Find docs about MCP", {
  contextIds: ["project-alpha"],
});
```

If the agent has `contextWrite: true`, its final output is written back to each context bucket. This is useful for simple multi-agent scratchpads, but it is not a security boundary and it is not how resource access is granted.

## Where to go next

- **[Resources schema](/docs/schema/resources)** - every field in the `resources:` block.
- **[Memory with memrez](/docs/tools/memory-memrez)** - durable memory as the first resource provider.
- **[Sessions](/docs/concepts/sessions)** - conversation history across calls.


<!-- source: /docs/concepts/runs-and-traces -->

# Runs and traces

Every invocation produces a **Run** (the top-level execution record) and a **Trace** (the span tree below it).

A trace's spans cover three kinds of work:

- `agent.invoke` or `run` — the root span for an agent run.
- `model.call` or `model` — each LLM API call.
- `tool.execute` or `tool` — each tool execution.

Spans nest. A sequential pipeline's trace looks like:

```
agent.invoke article-pipeline
├── agent.invoke research-phase   (parallel)
│   ├── agent.invoke web-researcher
│   │   └── model.call gpt-5.4
│   └── agent.invoke academic-researcher
│       └── model.call gpt-5.4
├── agent.invoke write-review
│   ├── agent.invoke writer
│   │   └── model.call claude-sonnet-4-6
│   └── agent.invoke editor
│       └── model.call gpt-5.4-mini
└── agent.invoke notify
    └── tool.execute send_slack
```

## Listing and inspecting

```ts {group=runs-list}
const { rows } = await client.runs.list({
  agentId: "support-agent",
  status: "error",
  limit: 50,
});

const trace = await client.traces.get(rows[0].id);
for (const span of trace.spans) {
  console.log(span.kind, span.name, span.durationMs, span.status);
}
```

```python {group=runs-list}
runs = client.runs.list(
    agent_id="support-agent",
    status="completed",
)

trace_rows = client.traces.list(agent_id="support-agent")
trace = client.traces.get(trace_rows["rows"][0]["traceId"])
for span in trace["spans"]:
    print(span["kind"], span["name"], span["durationMs"], span["status"])
```

The resource shape is intentionally similar across local and hosted clients. TypeScript uses camelCase option names; Python uses snake_case.

## Live trace streams

```ts {group=runs-stream}
for await (const event of client.traces.stream(runId)) {
  if (event.type === "span-start") console.log("→", event.span.name);
  if (event.type === "span-end") console.log("←", event.span.name, event.span.durationMs);
}
```

```python {group=runs-stream}
for event in client.traces.stream(trace_id):
    if event.type == "snapshot":
        print(event.summary)
```

The hosted Python client streams worker SSE events. The local Python SDK currently exposes trace snapshots rather than token-level span updates.

## Storage

### Embedded

Runs and traces live in memory by default. For durable storage, use SQLite:

```ts {group=runs-store}
import { agntz } from "@agntz/sdk";
import { sqliteStore } from "@agntz/sdk/sqlite";

const client = await agntz({
  agents: "./agents",
  store: sqliteStore("./agntz.db"),
});
```

```python {group=runs-store}
from agntz import LiteLLMModelProvider, SQLiteStore, agntz

client = agntz(
    agents="./agents",
    store=SQLiteStore("./agntz.db"),
    model_provider=LiteLLMModelProvider(),
)
```

The same store backs sessions, messages, runs, and trace spans.

### Hosted

Runs and traces are written to Postgres, scoped to the authenticated user. No eviction.

## OpenTelemetry

TypeScript embedded runs can pipe spans into an existing observability stack:

```ts
import { trace } from "@opentelemetry/api";
import { createRunner } from "@agntz/sdk";

const runner = createRunner({
  telemetry: {
    tracer: trace.getTracer("my-app"),
    recordIO: false,
    recordToolIO: false,
    baseAttributes: {
      "service.name": "my-app",
      "deployment.environment": "production",
    },
  },
});
```

In the current Python package, local trace spans are stored through the configured agntz store. OpenTelemetry export can be added on top of that store protocol later.

## Cancellation

Hosted and TypeScript long-running runs are cancellable:

```ts {group=runs-cancel}
const run = await client.runs.start({ agentId: "long-job", input: {} });
await client.runs.cancel(run.id);
```

```python {group=runs-cancel}
run = client.runs.start(agent_id="long-job", input={})
client.runs.cancel(run.id)
```

Cancellation is best-effort: in-flight model calls finish, but no further steps execute. Cancellation also propagates through nested pipelines.


<!-- ============================================================== -->
<!-- Schema -->
<!-- ============================================================== -->

<!-- source: /docs/schema/common-fields -->

# Common fields

Every agent manifest starts with the same four-field header, regardless of kind. These are the identity fields surfaced everywhere — in the trace, the runs list, the agent picker, and (on the hosted edition) the version history.

```yaml
id: my-agent                          # required, unique within the registry
name: My Agent                        # optional, display label
description: Does a thing             # optional, surfaced in UIs
kind: llm                             # llm | tool | sequential | parallel
```

## Field reference

### `id` *(required)*

The agent's stable identifier. It's what you pass to `client.agents.run({ agentId })`, what appears in trace spans, and what other agents reference with `ref:`.

- Must match `^[a-z][a-z0-9-]*$` — lowercase letters, digits, and hyphens.
- Unique within a registry (`./agents` directory for embedded, your workspace for hosted).
- **Inline agents inside pipelines still need an `id`.** It's what the trace span is named after.
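The id pattern above is easy to check before a manifest ever reaches the runtime. The following is a minimal sketch of such a pre-flight check; `is_valid_agent_id` is a hypothetical helper name, not part of the agntz API.

```python
import re

# Pattern copied from the rule above: a lowercase letter first,
# then lowercase letters, digits, or hyphens.
AGENT_ID_PATTERN = re.compile(r"^[a-z][a-z0-9-]*$")


def is_valid_agent_id(agent_id: str) -> bool:
    """Return True when agent_id satisfies the documented id pattern."""
    return AGENT_ID_PATTERN.fullmatch(agent_id) is not None
```

A check like this could run in CI over every manifest in `./agents`, so that an invalid id is caught before deployment.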

### `name` *(optional)*

A human-readable label. Shown in the hosted UI, the embedded `runs.list` output, and trace titles. Defaults to a title-cased version of `id`.

### `description` *(optional)*

Free-form description. Surfaced in the agent picker and used by some tools (e.g. the [agent-as-tool](/docs/tools/agent-as-tool) kind passes it to the parent LLM as the tool's description). Keep it tight — one sentence beats three.

### `kind` *(required)*

Selects the agent type:

| Value | Behavior | Required fields |
|---|---|---|
| `llm` | Single language-model call | `model`, `instruction` |
| `tool` | Deterministic tool call, no model | `tool` |
| `sequential` | Run `steps` in order; optionally loops with `until` | `steps` |
| `parallel` | Run `branches` simultaneously, merge outputs | `branches` |

See [The four agent kinds](/docs/concepts/agent-kinds) for examples.

## Where to go next

- **[Input, state, and output](/docs/schema/input-state-output)** — how data flows in and out.
- **[Templates and conditions](/docs/schema/templates-conditions)** — the `{{}}` mini-language used in nearly every field.
- **[Resources](/docs/schema/resources)** — provider-backed runtime capabilities such as memory.
- **[Pipeline steps and looping](/docs/schema/pipeline-steps)** — fields specific to `sequential` and `parallel` kinds.
- **[Skills, spawnable, reply](/docs/schema/skills-spawnable-reply)** — extra fields for `llm` kind.


<!-- source: /docs/schema/input-state-output -->

# Input, state, and output

How data flows into and out of an agent. The same model applies to every `kind` — primitives consume their input, pipelines merge per-step outputs into a shared state object, and the agent's final result is shaped by `outputSchema` (LLM) or `output` (pipelines).

## Input

`inputSchema` declares the agent's input contract. Properties are listed directly; all are **required but nullable**.

```yaml
inputSchema:
  query: string
  language:
    type: string
    default: en
  format:
    type: string
    enum: [json, text, markdown]
```

Shorthand: `name: string` is equivalent to `name: { type: string }`. Supported types are `string`, `number`, `boolean`, `object`, and `array`. Use `enum` to restrict string values; use `default` to fall back when the caller omits the field.

If `inputSchema` is omitted, the agent accepts a plain string, accessible in templates as `{{userQuery}}`.
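To make the shorthand and `default` rules concrete, here is a rough sketch of how a caller-side helper might expand the shorthand and fill defaults. Both function names are hypothetical and the runtime's real normalization may differ; the sketch only fills fields that declare a `default` and leaves everything else untouched.

```python
from typing import Any


def normalize_schema(schema: dict[str, Any]) -> dict[str, dict[str, Any]]:
    """Expand the `name: string` shorthand into `name: {type: string}`."""
    return {
        key: {"type": spec} if isinstance(spec, str) else dict(spec)
        for key, spec in schema.items()
    }


def apply_defaults(schema: dict[str, Any], payload: dict[str, Any]) -> dict[str, Any]:
    """Copy the payload and fill omitted fields that declare a `default`."""
    result = dict(payload)
    for key, spec in normalize_schema(schema).items():
        if key not in result and "default" in spec:
            result[key] = spec["default"]
    return result
```

With the schema shown earlier, a payload of `{"query": "hi"}` would come back with `language` set to `en`, while `format` stays absent because it declares no default.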

### Model config (LLM kind only)

```yaml
model:
  provider: openai            # openai | anthropic | google | mistral
  name: gpt-5.4
  temperature: 0.7            # optional
  maxTokens: 4096             # optional
  topP: 1.0                   # optional
```

### Instruction and prompt (LLM kind only)

```yaml
instruction: |               # required — the system prompt
  You are a math tutor. Explain each step clearly.

prompt: |                    # optional — user-message template
  Solve carefully: {{userQuery}}
```

- **`instruction`** is the system prompt. Templated with `{{}}` against state.
- **`prompt`** is the user message. When absent, the agent's raw input (`{{userQuery}}` or the input object stringified) is sent verbatim.

Splitting them lets the system prompt remain stable (and cache-friendly with providers that cache by prefix), while the user-message template changes per call.

## State

State is the working memory that pipeline steps share. It's a flat object scoped per agent — **sub-agents have their own state and cannot see the parent's**.

```
{
  ...input,                                              # input properties at root
  [stateKey ?? normalizeId(subAgent)]: subAgentOutput    # per sub-agent
}
```

Rules:

- `{{varName}}` references root input properties.
- `{{agentId.property}}` references a sub-agent's output property.
- `{{stateKey}}` references a sub-agent's entire output: the structured object when it declares an `outputSchema`, otherwise its raw output.
- Unresolved references (skipped steps, first loop iteration) resolve to **null** — they don't throw.

`stateKey` lets you rename where a step's output lands. By default it lands under the sub-agent's id; `stateKey: writing` renames it for ergonomic downstream references.

## Output

### LLM agents — `outputSchema`

Constrains the model's response to a JSON object. The runner enforces the schema and returns parsed JSON, not a string.

```yaml
outputSchema:
  sentiment:
    type: string
    enum: [positive, negative, neutral]
  confidence: number
```

```ts
const { output } = await client.agents.run({
  agentId: "sentiment-analyzer",
  input: { text: "I love this!" },
});
// output = { sentiment: "positive", confidence: 0.95 }
```

Without `outputSchema`, the agent returns the model's raw text.

### Pipeline agents — `output`

Pipeline agents use `output` to map state to the result. Optional — defaults to the last step's output (sequential) or all branch outputs keyed by id (parallel).

```yaml
output:
  article: "{{writing.writer.draft}}"
  review: "{{writing.editor}}"
```

Anything in state is fair game — `output` is just a template substitution map.

## Examples (LLM kind)

Few-shot examples improve consistency. They're injected into the prompt before the user message.

```yaml
examples:
  - input: "I absolutely love this product!"
    output: '{"sentiment": "positive", "confidence": 0.95}'
  - input: "The package arrived on Tuesday."
    output: '{"sentiment": "neutral", "confidence": 0.88}'
```

When the agent has an `outputSchema`, examples should produce JSON that matches it.


<!-- source: /docs/schema/templates-conditions -->

# Templates and conditions

agntz uses a small templating language — handlebars-shaped, intentionally tiny — for variable interpolation in instructions, tool params, step inputs, and the `output` map. Conditional execution (`when`, `until`) uses the same syntax with a comparison-operator extension.

## Variable interpolation

`{{name}}` is replaced with the resolved value from state.

```yaml
instruction: |
  You are a writing assistant. Write about {{topic}} in a {{tone}} tone.

  {{#if feedback}}
  The reviewer provided feedback. Incorporate it:
  {{feedback}}
  {{/if}}

  {{#if language != en}}
  Write your response in {{language}}.
  {{/if}}
```

Rules:

- `{{varName}}` — replaced with the resolved value. **Null renders as empty.**
- Dotted paths like `{{researcher.summary}}` walk into a sub-agent's output.
- Unresolved references (skipped steps, first loop iteration) resolve to **null** — they don't throw.
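The interpolation rules above can be approximated in a few lines. This is an illustrative sketch only, not the runtime's implementation: `resolve` and `render` are hypothetical names, and the sketch ignores `{{#if}}` blocks and the special `env` / `secrets` namespaces.

```python
import re
from typing import Any

# Matches {{name}} and dotted paths such as {{researcher.summary}}.
# Block tags like {{#if x}} and {{/if}} do not match and are left as-is.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][\w-]*(?:\.[A-Za-z_][\w-]*)*)\s*\}\}")


def resolve(path: str, state: dict[str, Any]) -> Any:
    """Walk a dotted path through state; any missing segment yields None."""
    current: Any = state
    for segment in path.split("."):
        if not isinstance(current, dict) or segment not in current:
            return None
        current = current[segment]
    return current


def render(template: str, state: dict[str, Any]) -> str:
    """Replace each placeholder with its resolved value; None renders as empty."""

    def substitute(match: re.Match) -> str:
        value = resolve(match.group(1), state)
        return "" if value is None else str(value)

    return PLACEHOLDER.sub(substitute, template)
```

Note that a reference to a step that has not run yet resolves to `None` and renders as an empty string instead of raising.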

## Conditional blocks

```yaml
{{#if varName}}                  # truthy: non-null, non-empty, non-zero
  ...
{{/if}}

{{#if varName == value}}          # equality
  ...
{{/if}}

{{#if varName != value}}          # inequality
  ...
{{/if}}
```

Blocks can be nested but cannot be parameterized — there's no `{{#each}}`, no helpers, no expression evaluation beyond `==` / `!=`.

## Conditions in `when` and `until`

Used at step level (`when`) and at sequential level (`until`). Evaluated against the resolved state.

```yaml
when: "{{language}} != en"
when: "{{feedback}}"                                     # truthiness
until: "{{score}} >= 0.8"
until: "{{score}} >= 0.8 && {{reviewer.approved}} == true"
```

Operators:

| Operator | Meaning |
|---|---|
| `==` | Equal |
| `!=` | Not equal |
| `>`, `<` | Numeric comparison |
| `>=`, `<=` | Numeric comparison |
| `&&` | Logical AND |
| `||` | Logical OR |

**Truthiness** = non-null, non-empty, non-zero. Strings, arrays, and objects are truthy if non-empty.
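The truthiness rule translates directly into code. The sketch below is one possible reading of it; `is_truthy` is a hypothetical name, and treating boolean `false` as falsy is an assumption the rule above does not spell out.

```python
from typing import Any


def is_truthy(value: Any) -> bool:
    """Non-null, non-empty, non-zero, following the rule described above."""
    if value is None:
        return False
    if isinstance(value, bool):
        return value  # assumption: boolean false counts as falsy
    if isinstance(value, (int, float)):
        return value != 0
    if isinstance(value, (str, list, tuple, dict)):
        return len(value) > 0
    return True
```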

## Special namespaces

Some `{{...}}` references aren't state lookups — they're resolved by the runtime against the environment or the workspace's secret store.

```yaml
headers:
  Authorization: "Bearer {{env.SEARCH_KEY}}"   # embedded mode — reads process.env
  X-API-Key:     "{{secrets.WEATHER_TOKEN}}"   # hosted mode — reads workspace secrets
```

| Prefix | Source | Where supported |
|---|---|---|
| `{{env.NAME}}` | `process.env` | Embedded; hosted is opt-in per server |
| `{{secrets.NAME}}` | Workspace secret store | Hosted only |

In hosted mode, `{{env.X}}` is intentionally restricted — multi-tenant workers don't share an environment with your code. Use `{{secrets.X}}` for credentials and configure them in **Settings → Secrets** on the hosted edition.

## What's *not* in templates

agntz's templating is deliberately small. There's no:

- arbitrary expression evaluation
- loops (`{{#each}}`)
- helpers / partials
- string transforms
- Math

If you need to compute something, do it in a tool agent and pin the result into state, or do it client-side before calling the agent.


<!-- source: /docs/schema/resources -->

# Resources

`resources:` declares runtime capabilities an LLM agent may use, such as memory, RAG, files, or other provider-backed context. The manifest layer validates the generic shape; providers define the actual behavior.

```yaml
id: support-with-memory
kind: llm
model:
  provider: openai
  name: gpt-5.4
instruction: |
  Help the user. Use durable memory when it is relevant.
resources:
  memory:
    mode: read-write
    autoScan: true
  product-docs:
    kind: rag
    mode: read
    namespace: docs/product
```

Resources are currently an LLM-agent field. Pipeline agents can call LLM agents that declare resources, and the resource access follows the same run-time `context` grants.

## Field reference

### Resource name

The map key is the resource instance name:

```yaml
resources:
  memory:
    mode: read-write
```

Rules:

- Must match `^[a-zA-Z][a-zA-Z0-9_-]*$`.
- Used as the tool-name prefix.
- May contain hyphens, but generated tool prefixes replace non-identifier characters with `_`.

For example, a resource named `product-docs` with a provider tool named `search` becomes `product_docs_search`.

### `kind`

Provider kind. The runtime uses this to find the matching `ResourceProvider`.

```yaml
resources:
  user-memory:
    kind: memory
    mode: read-write
  org-memory:
    kind: memory
    mode: read
```

When omitted, `kind` defaults to the resource name. This shorthand is common for a single `memory` resource.

### `mode`

Per-agent access mode.

| Value | Meaning |
|---|---|
| `read` | Register read-safe provider tools only. |
| `read-write` | Register read and write provider tools. |

Providers may define a default when `mode` is omitted. The memory provider defaults to `read-write`.

Child agents cannot widen a parent's effective mode. If the parent has `mode: read`, children that use the same provider kind stay read-only.
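As a rough illustration of the narrowing rule, the effective mode for a child could be computed as below. `effective_mode` is a hypothetical helper written for this page; the runtime's own resolution logic is not shown here.

```python
from typing import Optional

MODES = ("read", "read-write")


def effective_mode(parent_mode: Optional[str], child_mode: str) -> str:
    """Return the mode a child actually gets; it can narrow but never widen."""
    for mode in (parent_mode, child_mode):
        if mode is not None and mode not in MODES:
            raise ValueError(f"unknown mode: {mode}")
    if parent_mode == "read":
        return "read"  # a read-only parent caps the child at read
    return child_mode
```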

### `namespace`

Optional static provider input.

```yaml
resources:
  product-docs:
    kind: rag
    mode: read
    namespace: docs/product
```

`namespace` is not an automatic runtime grant. It is provider configuration. Runtime access still comes from the `context` array passed to `client.agents.run(...)` or the worker HTTP API.

### `config` and provider-specific fields

Providers may read additional fields from the resource entry. These fields pass through the manifest layer unchanged.

```yaml
resources:
  memory:
    mode: read-write
    autoScan: true
    preload:
      core: true
      topics: [goals, equipment]
      limit: 30
      maxChars: 10000
      types: [fact, preference, summary]
    writePolicy:
      descendants: true
      ancestorPromotion: none
```

Check the provider's docs to see which fields are meaningful. For memrez, `autoScan` injects topic summaries and `preload` controls the full-entry memory injected before tool calls. Topic taxonomy and reasoner policy are memrez-level concerns, not agent resource fields. See [Memory with memrez](/docs/tools/memory-memrez).

## Generated tools

Resource provider tools are exposed as:

```text
<resource-name-prefix>_<provider-tool-name>
```

Examples:

| Resource | Provider tool | Model-visible tool |
|---|---|---|
| `memory` | `read` | `memory_read` |
| `memory` | `write` | `memory_write` |
| `product-docs` | `search` | `product_docs_search` |

The runtime rejects collisions with existing local tools or other generated resource tools. Rename the resource or the conflicting tool if this happens.
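The naming and collision rules can be sketched as follows. The function names are hypothetical and the character class used for "non-identifier characters" is an assumption; the point is only to show how `product-docs` plus `search` ends up as `product_docs_search`.

```python
import re


def tool_prefix(resource_name: str) -> str:
    """Replace characters that are not valid in an identifier with `_`."""
    return re.sub(r"[^A-Za-z0-9_]", "_", resource_name)


def generated_tool_name(resource_name: str, provider_tool: str) -> str:
    """Join the sanitized resource name and the provider tool name."""
    return f"{tool_prefix(resource_name)}_{provider_tool}"


def register_generated_tools(
    existing: set[str], resource_name: str, provider_tools: list[str]
) -> set[str]:
    """Return existing names plus the generated ones, rejecting any collision."""
    registered = set(existing)
    for provider_tool in provider_tools:
        name = generated_tool_name(resource_name, provider_tool)
        if name in registered:
            raise ValueError(f"tool name collision: {name}")
        registered.add(name)
    return registered
```

One consequence worth noticing: resources named `product-docs` and `product_docs` would generate the same prefix, so they would collide under this scheme.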

## Runtime grants

Declare the resource in YAML, then pass trusted namespace grants at run time:

```ts {group=resource-run}
await client.agents.run({
  agentId: "support-with-memory",
  input: "What do you remember about my preferences?",
  context: ["app/user/" + userId],
});
```

```python {group=resource-run}
client.agents.run(
    agent_id="support-with-memory",
    input="What do you remember about my preferences?",
    context=[f"app/user/{user_id}"],
)
```

Resource providers receive those grants through `ResourceToolContext.grants`. The model sees provider tools, not namespace arguments.

## Provider wiring

Embedded SDKs wire providers at client construction:

```ts {group=resource-provider}
const client = await agntz({
  agents: "./agents",
  resources: { memory: memrez.provider() },
});
```

```python {group=resource-provider}
client = agntz(
    agents="./agents",
    resources={"memory": memrez.provider()},
    model_provider=LiteLLMModelProvider(),
)
```

If an agent declares a resource kind and no provider is registered for that kind, startup or invocation fails with a provider-missing error. Hosted workers wire providers server-side.

## Where to go next

- **[Context and resources](/docs/concepts/context-and-resources)** - the runtime grant model.
- **[Memory with memrez](/docs/tools/memory-memrez)** - the built-in memory resource provider.


<!-- source: /docs/schema/pipeline-steps -->

# Pipeline steps and looping

Fields that apply to the pipeline kinds — `sequential` (`steps:`) and `parallel` (`branches:`). Every step is either a `ref` to an existing agent or an inline `agent` definition; both expose the same set of step-level fields.

## Step shape

```yaml
steps:
  - ref: agent-id
    input:
      paramX: "{{stateVar}}"           # maps parent state to child input
    stateKey: customKey                 # rename where output lands
    when: "{{condition}} == value"     # skip if false (output = null)

  - agent:
      id: inline-agent
      kind: llm
      model: { provider: openai, name: gpt-5.4 }
      instruction: "..."
```

All agents — including inline ones — require an `id`. It's what the trace span is named after, and what `stateKey` defaults to.

### `input` *(optional)*

Maps parent state into the child's input. Templates resolve against the parent's state.

```yaml
- ref: summarizer
  input:
    text: "{{researcher.body}}"
    language: "{{language}}"
```

If omitted, the child receives no explicit input. (If the child has an `inputSchema` with required fields, that's a load-time error.)

### `stateKey` *(optional)*

Renames where the child's output lands in parent state. By default, output lands under the child's `id`.

```yaml
- ref: researcher
  stateKey: factCheck            # downstream uses {{factCheck}} instead of {{researcher}}
```

### `when` *(optional)*

Skip the step if the condition is false. When skipped, the step's output is **null** — downstream references like `{{stepId.property}}` also resolve to null without throwing.

```yaml
- ref: translator
  when: "{{language}} != en"
  input: { text: "{{draft}}", lang: "{{language}}" }
```

`when` evaluates after templates resolve. See [Templates and conditions](/docs/schema/templates-conditions) for operator syntax.

## Looping (`sequential` only)

Set `until` at the pipeline level (not on a step) to repeat the step list until a condition holds. `maxIterations` is the safety stop.

```yaml
id: write-review-loop
kind: sequential

until: "{{reviewer.approved}} == true"
maxIterations: 5

steps:
  - ref: writer
    input:
      topic: "{{topic}}"
      feedback: "{{reviewer.feedback}}"   # null on first iteration
  - ref: reviewer
    input:
      draft: "{{writer.draft}}"
```

Loop semantics:

- Steps run in declared order each iteration.
- `until` is checked **after** each full pass; the loop exits when it's true.
- State carries over between iterations — each iteration sees the previous one's outputs at `{{stepId.property}}`.
- On the first iteration, references to outputs that haven't been computed yet resolve to **null**.
- `maxIterations` is required when `until` is set; the loop fails fast if exceeded.
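The loop semantics above can be modeled with a small driver. This is a simplified sketch, not the runner's code: steps are plain Python callables that receive the current state, and `run_loop` is a hypothetical name.

```python
from typing import Any, Callable

State = dict[str, Any]
Step = tuple[str, Callable[[State], Any]]  # (state key, function of current state)


def run_loop(
    steps: list[Step],
    until: Callable[[State], bool],
    max_iterations: int,
    initial: State,
) -> State:
    """Repeat the step list until `until` holds, or fail after max_iterations."""
    state = dict(initial)
    for _ in range(max_iterations):
        for key, step in steps:
            # Outputs stay in state, so the next iteration can read them.
            state[key] = step(state)
        if until(state):  # checked only after a full pass
            return state
    raise RuntimeError(f"until condition not met after {max_iterations} iterations")
```

In the writer/reviewer example, the writer's first call sees no `reviewer` entry in state, which corresponds to `{{reviewer.feedback}}` resolving to null on the first iteration.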

## Branches (`parallel` only)

`branches` look identical to `steps` but run concurrently. There's no `until` or `when` at the parallel level, so the parallel agent always dispatches its branches exactly once. To gate an individual branch, set `when` on that branch.

```yaml
id: text-analysis
kind: parallel

inputSchema:
  text: string

branches:
  - ref: sentiment-analyzer
    input: { text: "{{text}}" }
  - ref: entity-extractor
    input: { text: "{{text}}" }
    when: "{{text}}"               # skip if text is empty
```

If no `output:` is declared, the result is the merged outputs of all branches keyed by their `id` (or `stateKey`).

## Error handling

Pipelines **fail fast**. If any step fails, the entire pipeline fails immediately. There's no per-step retry config in the manifest — handle retries at the caller level via run options or wrap the pipeline in code:

```ts
try {
  await client.agents.run({ agentId: "my-pipeline", input });
} catch (err) {
  if (err.code === "TOOL_TIMEOUT") {
    // retry, log, escalate, ...
  }
}
```

A failed step's error is captured in the trace and surfaces in `runs.list({ status: "error" })`.
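For Python callers, a caller-level retry can be a thin wrapper around any callable. The sketch below is generic on purpose: `run_with_retries` is a hypothetical helper, and because the exception types raised by the Python client are not documented here, the caller chooses which exceptions to retry on.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_retries(
    call: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    retry_on: tuple[type[BaseException], ...] = (Exception,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `call` up to `attempts` times with exponential backoff between tries."""
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")
```

A pipeline call could then be wrapped as `run_with_retries(lambda: client.agents.run(agent_id="my-pipeline", input=payload))`, where `payload` stands for whatever input the pipeline expects. Since pipelines fail fast, each retry re-runs the whole pipeline from the first step.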


<!-- source: /docs/schema/skills-spawnable-reply -->

# Skills, spawnable, reply

Three optional fields available only on `kind: llm` agents. Each exposes a different runtime capability to the model — mid-run skill loading, concurrent sub-agents, and streaming intermediate messages.

## Skills

Named skill bundles the agent may load mid-run via the synthetic `use_skill` tool. The model decides when (and whether) to load them, based on the task at hand.

```yaml
skills:
  - citation-style
  - markdown-rendering
```

Rules:

- Skill names must match `^[a-z][a-z0-9-]*$`.
- Names resolve against the runtime's **SkillStore**, the same store you configure in `agntz({ skills: ... })` (embedded) or in **Settings → Skills** (hosted).
- The model sees skill names + descriptions in its tool list; it can call `use_skill` to pull a skill's full instructions into context.
- Calling a skill twice is a no-op — the runtime tracks what's loaded.

Skills are how you split very large instructions into reusable bundles without inflating every prompt. The model loads only what it actually needs.

## Spawnable

Sub-agents the LLM may spawn concurrently at runtime via the synthetic `spawn_agent` tool. Predefined per agent — **the LLM cannot invent agents**, only invoke the ones you list.

```yaml
spawnable:
  - kind: ref
    agentId: fact-checker
  - kind: inline
    definition:
      id: adhoc-helper
      kind: llm
      model: { provider: openai, name: gpt-5.4-mini }
      instruction: "Extract dates from the input"
```

Rules:

- `kind: ref` — reference an existing agent by id.
- `kind: inline` — define the sub-agent inline. Must itself be `kind: llm` with a **static** (non-templated) instruction.
- The model sees the spawnable agents in its tool list and calls `spawn_agent({ id, input })` to invoke one.
- Spawned agents run in parallel with the parent; each gets its own state and its own trace span nested under the parent.

Use spawnable when you want the model to fan out work it identifies during execution — for example, fact-checking each claim in a draft.

## Reply

Register a per-invocation `reply` tool the model can call to deliver intermediate messages. Replies surface as SSE events on streaming endpoints, so your UI can show progress before the final answer.

```yaml
reply: true                  # defaults: maxPerRun = 50

# or
reply:
  maxPerRun: 5
```

The model sees a tool called `reply` with a single string parameter; calling it emits a `reply` event in the run's stream. Pair with `client.agents.stream(...)` to surface progress to a user:

```ts
for await (const event of client.agents.stream({
  agentId: "long-task",
  input: { ... },
})) {
  if (event.type === "reply") ui.append(event.text);
  if (event.type === "complete") ui.finalize(event.output);
}
```

Replies are stored on the session alongside messages, so when a session resumes the model can see what it told the user previously.

## Compatibility

All three fields work in **embedded** and **hosted** mode. `spawn_agent` and `use_skill` are synthetic tools — the runtime injects them; you don't need to declare them under `tools:`.


<!-- ============================================================== -->
<!-- Tools -->
<!-- ============================================================== -->

<!-- source: /docs/tools/local -->

# Local tools

Local tools are functions registered at runtime and referenced by name in YAML. They are the simplest and fastest tool kind — no network and no auth — but they only work in embedded mode because hosted workers cannot execute arbitrary user code.

```yaml [agents/calculator.yaml]
id: calculator
kind: llm
model: { provider: openai, name: gpt-5.4-mini }
instruction: |
  Use the `add` tool to answer math questions.

  {{userQuery}}
tools:
  - kind: local
    tools: [add]
```

```ts [index.ts] {group=local-tool-basic}
import { agntz, tool, z } from "@agntz/sdk";

const client = await agntz({
  agents: "./agents",
  tools: [
    tool({
      name: "add",
      description: "Add two numbers and return the sum",
      input: z.object({
        a: z.number().describe("First operand"),
        b: z.number().describe("Second operand"),
      }),
      execute: async ({ a, b }) => a + b,
    }),
  ],
});
```

```python [main.py] {group=local-tool-basic}
from pydantic import BaseModel, Field
from agntz import LiteLLMModelProvider, agntz, tool


class AddInput(BaseModel):
    a: float = Field(description="First operand")
    b: float = Field(description="Second operand")


def add(args: AddInput) -> float:
    return args.a + args.b


client = agntz(
    agents="./agents",
    tools=[
        tool(
            name="add",
            description="Add two numbers and return the sum",
            input_schema=AddInput,
            execute=add,
        )
    ],
    model_provider=LiteLLMModelProvider(),
)
```

A name referenced in YAML but missing from the local tool registry raises an error before the run can succeed. This keeps misconfigurations out of production traffic.

## Tool shape

Each tool is self-describing. The model sees the `name`, `description`, and JSON Schema derived from your validation schema.

| TypeScript field | Python field | Purpose |
| --- | --- | --- |
| `name` | `name` | Identifier referenced from YAML `tools: [name]` |
| `description` | `description` | What the tool does; read by the model |
| `input` | `input_schema` | Zod or Pydantic schema used for validation and JSON Schema |
| `execute` | `execute` | Function called with parsed args |

```ts {group=local-tool-shape}
import { tool, z } from "@agntz/sdk";

const tools = [
  tool({
    name: "fetchInvoice",
    description: "Look up an invoice record by its id",
    input: z.object({
      id: z.string().describe("Invoice id, e.g. inv_abc123"),
    }),
    execute: async ({ id }) => {
      return await db.invoices.findById(id);
    },
  }),
];
```

```python {group=local-tool-shape}
from pydantic import BaseModel, Field
from agntz import tool


class FetchInvoiceInput(BaseModel):
    id: str = Field(description="Invoice id, e.g. inv_abc123")


def fetch_invoice(args: FetchInvoiceInput):
    return db.invoices.find_by_id(args.id)


tools = [
    tool(
        name="fetchInvoice",
        description="Look up an invoice record by its id",
        input_schema=FetchInvoiceInput,
        execute=fetch_invoice,
    )
]
```

TypeScript uses Zod because `@agntz/sdk` already depends on it. Python uses Pydantic because it is the native validation and schema path for Python applications.

## Selective exposure

By default, listing `tools: [foo, bar]` exposes exactly those tools. To expose all configured tools, drop the inner array:

```yaml
tools:
  - kind: local            # all tools in the registry
```

## Errors

If a tool throws, the model receives the error message and can decide whether to retry with different args, reply to the user, or give up. The error is captured in the tool span and appears in traces.

Validation errors are returned to the model the same way, so a model that calls `add({ a: "two", b: 3 })` sees a structured complaint and can correct itself on the next step.

## Why embedded-only?

> **Note:** Local tools are an embedded-mode primitive. The hosted edition has no way to run arbitrary user code in a sandbox, so promote local tools to HTTP endpoints or MCP servers when you graduate. The YAML can switch between local and HTTP/MCP without touching the agent's instruction — only the `tools:` block changes.

If you want a single manifest that runs in both local and hosted modes, prefer [HTTP](/docs/tools/http) or [MCP](/docs/tools/mcp) tools from the start.


<!-- source: /docs/tools/http -->

# HTTP tools

A single HTTP endpoint exposed to the model as a tool. URL placeholders define the LLM-facing parameter schema. `GET`, `POST`, `PUT`, `PATCH`, and `DELETE` are all supported. Works identically in embedded and hosted mode.

```yaml
tools:
  - kind: http
    name: weather_lookup
    url: "https://api.weather.com/v1/forecast/{location}{?units}"
    description: "Look up weather forecast for a location"
    params:
      units: "metric"           # pin the optional query param; hidden from the LLM
    headers:
      Authorization: "Bearer {{secrets.WEATHER_TOKEN}}"
```

## URL placeholder syntax

The URL template encodes the tool's parameter schema:

| Syntax | Meaning |
|---|---|
| `{X}` | Required path or query parameter |
| `{X?}` | Optional query parameter |
| `{?units}` | Optional query (alt form) |
| `{?units&format}` | Multiple optional query params |

Each placeholder becomes a parameter the model sees. The placeholder name is the parameter name; the type defaults to string. Use `params:` to **pin** a value — the parameter disappears from the model's tool schema and is filled by the template engine instead.
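To see how placeholders and pinning interact, the sketch below pulls placeholder names out of a URL template and drops the pinned ones. It is an approximation for illustration: the regular expression covers only the four forms in the table, it is not the runner's parser, and both function names are hypothetical.

```python
import re

# Matches {name}, {name?}, {?a} and {?a&b}. The lookarounds skip
# double-brace templates such as {{userName}}.
PLACEHOLDER = re.compile(r"(?<!\{)\{(\?)?([A-Za-z_][\w&]*)(\?)?\}(?!\})")


def placeholder_names(url_template: str) -> list[str]:
    """Return placeholder names in order of first appearance."""
    names: list[str] = []
    for match in PLACEHOLDER.finditer(url_template):
        for name in match.group(2).split("&"):
            if name and name not in names:
                names.append(name)
    return names


def model_visible_params(url_template: str, pinned: dict[str, str]) -> list[str]:
    """Drop pinned names; what remains is what the model would be asked to fill."""
    return [name for name in placeholder_names(url_template) if name not in pinned]
```

For the weather example at the top of this page, pinning `units` leaves `location` as the only parameter the model has to supply.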

Headers are templated too — they can reference env vars (`{{env.NAME}}` in embedded mode) or secrets (`{{secrets.NAME}}` in hosted mode). See [Templates and conditions](/docs/schema/templates-conditions#special-namespaces).

## POST / PUT / PATCH with a request body

```yaml
tools:
  - kind: http
    name: create_user
    url: "https://api.example.com/users"
    method: POST
    body_type: json            # json (default), form, or query
    body:
      name: "{{userName}}"
      email: "{{userEmail}}"
```

`body_type`:

- `json` *(default)* — serializes `body` as JSON, sets `Content-Type: application/json`.
- `form` — URL-encoded form body.
- `query` — appends `body` properties to the URL's query string (useful for GET-style requests that take many parameters).

Body fields can be templates (`{{userName}}`) or pinned literals. Templated fields become parameters the LLM fills in; literals are sent as written and don't appear in the model's parameter schema.
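The three `body_type` values map onto standard encodings. The following sketch shows one way to produce each of them with the Python standard library; `encode_body` is a hypothetical helper, and the `Content-Type` used for `form` is the conventional one, assumed here because this page does not state it.

```python
import json
from urllib.parse import urlencode, urlsplit, urlunsplit


def encode_body(url: str, body: dict, body_type: str = "json"):
    """Return (url, headers, payload) for the given body_type."""
    if body_type == "json":
        return url, {"Content-Type": "application/json"}, json.dumps(body).encode()
    if body_type == "form":
        headers = {"Content-Type": "application/x-www-form-urlencoded"}  # assumed
        return url, headers, urlencode(body).encode()
    if body_type == "query":
        parts = urlsplit(url)
        # Keep any query string already on the URL and append the body fields.
        query = "&".join(q for q in (parts.query, urlencode(body)) if q)
        return urlunsplit(parts._replace(query=query)), {}, None
    raise ValueError(f"unsupported body_type: {body_type}")
```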

## Dynamic auth — OAuth2 client credentials

For APIs that require a short-lived access token, declare an `auth:` block. The runner fetches the token, caches it, refreshes it on a 401, and applies it to each call. No code required.

```yaml
tools:
  - kind: http
    name: send_message
    url: "https://api.salesforce.com/services/data/v60.0/sobjects/Message"
    method: POST
    body_type: json
    body: { content: "{{message}}" }
    auth:
      type: oauth2_client_credentials
      token_url: "https://login.salesforce.com/services/oauth2/token"
      client_id: "{{secrets.SF_CLIENT_ID}}"
      client_secret: "{{secrets.SF_CLIENT_SECRET}}"
      scope: "messages:write"          # optional
      creds_location: basic_header     # default (RFC 6749); or "body"
```

## Dynamic auth — generic token exchange

For login endpoints that don't match the OAuth2 spec — different field names, plain-text token responses, custom header names — use the parametric `token_exchange` form:

```yaml
tools:
  - kind: http
    name: list_things
    url: "https://api.example.com/things"
    auth:
      type: token_exchange
      request:
        url: "https://api.example.com/auth/login"
        method: POST
        body_type: json
        body:
          username: "{{secrets.API_USER}}"
          password: "{{secrets.API_PASS}}"
      extract:
        response_format: json          # default; "text" for raw-body tokens
        token_path: "$.access_token"   # JSONPath; e.g. "$.token", "$.data.accessToken"
        expires_path: "$.expires_in"   # optional, seconds
      apply:
        location: header               # default; or "query"
        name: Authorization
        format: "Bearer {token}"       # default for header; "{token}" for query
      cache_ttl: 3000                  # optional, seconds
      refresh_on: [401]                # default
```
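Conceptually, the `extract` and `apply` blocks describe two small steps: pull the token out of the login response, then format it into a header. The sketch below mimics that for the header case. It handles only simple `$.a.b` paths, which is a deliberate simplification and not a full JSONPath implementation, and both function names are hypothetical.

```python
from typing import Any


def extract_path(document: Any, path: str) -> Any:
    """Resolve simple `$.a.b` paths only; not a full JSONPath implementation."""
    if not path.startswith("$."):
        raise ValueError("only `$.`-prefixed dotted paths are handled in this sketch")
    current = document
    for key in path[2:].split("."):
        current = current[key]
    return current


def apply_token(
    response_json: dict,
    token_path: str = "$.access_token",
    name: str = "Authorization",
    fmt: str = "Bearer {token}",
) -> dict[str, str]:
    """Build the header that carries the extracted token."""
    token = extract_path(response_json, token_path)
    return {name: fmt.replace("{token}", str(token))}
```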

## What you get for free

When you declare `auth:`, the runner provides:

- **Per-tenant token caching** keyed by ownerId — tokens are not shared across users.
- **Single-flight dedup** of concurrent token requests — only one fetch when many tools need the same token.
- **Automatic refresh-on-401** with one retry, no infinite loops.
- **Redaction of known token / secret substrings** from response bodies and error messages — tokens never leak into traces.
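
The first two guarantees above can be sketched with a per-owner lock around the cache. This is an illustration of the idea, not the runner's implementation; `TokenCache` and its `fetch` callback are hypothetical.

```python
import threading
import time
from typing import Callable


class TokenCache:
    """Per-owner token cache with single-flight fetches (illustrative sketch)."""

    def __init__(self, fetch: Callable[[str], tuple[str, float]]):
        self._fetch = fetch      # hypothetical: owner_id -> (token, ttl_seconds)
        self._tokens: dict[str, tuple[str, float]] = {}
        self._locks: dict[str, threading.Lock] = {}
        self._guard = threading.Lock()

    def _lock_for(self, owner_id: str) -> threading.Lock:
        with self._guard:
            return self._locks.setdefault(owner_id, threading.Lock())

    def get(self, owner_id: str) -> str:
        # Callers for the same owner queue on one lock, so only the first
        # one fetches; the rest find the cached token when they get in.
        with self._lock_for(owner_id):
            cached = self._tokens.get(owner_id)
            if cached and cached[1] > time.monotonic():
                return cached[0]
            token, ttl = self._fetch(owner_id)
            self._tokens[owner_id] = (token, time.monotonic() + ttl)
            return token

    def invalidate(self, owner_id: str) -> None:
        # Drop the cached token, e.g. after a 401, so the next get() refetches.
        with self._lock_for(owner_id):
            self._tokens.pop(owner_id, None)
```

Because the cache key is the owner id, two tenants calling the same tool never see each other's token.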

## Static auth

For APIs with long-lived keys, skip `auth:` and template the credential directly into headers:

```yaml
tools:
  - kind: http
    name: openai_completions
    url: "https://api.openai.com/v1/chat/completions"
    method: POST
    headers:
      Authorization: "Bearer {{secrets.OPENAI_KEY}}"
    body_type: json
    body:
      model: "gpt-5.4"
      messages: "{{messages}}"
```

## Failures

HTTP tool failures are captured in the `tool.execute` span. The model sees the error body (truncated, redacted) and can decide whether to retry or give up. Non-2xx responses are treated as errors by default.
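
As a rough picture of what "truncated, redacted" can mean, the sketch below redacts known secret substrings before truncating, so a secret is never cut in half and left partly visible. The function names, the `[REDACTED]` marker, and the 2000-character default are assumptions made for this example.

```python
def sanitize_error_body(body, known_secrets, max_chars=2000):
    """Redact known secret substrings, then truncate what the model will see."""
    # Longest first, so a secret that contains another one is replaced whole.
    for secret in sorted((s for s in known_secrets if s), key=len, reverse=True):
        body = body.replace(secret, "[REDACTED]")
    if len(body) > max_chars:
        body = body[:max_chars] + "... [truncated]"
    return body


def is_error(status):
    """Non-2xx responses count as errors by default."""
    return not 200 <= status < 300
```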


<!-- source: /docs/tools/mcp -->

# MCP tools

[Model Context Protocol](https://modelcontextprotocol.io) servers expose discoverable tool catalogs. Reference a server URL and the runner connects, lists the available tools, and exposes them to the model.

```yaml
tools:
  - kind: mcp
    server: https://search-api.example.com/mcp
    tools:
      - fetch_url                       # use as-is
      - tool: search                    # wrapped tool
        name: search_for_user           # what the LLM sees
        description: "Search records by query"
        params:
          api_key: "{{env.SEARCH_KEY}}"   # pinned, hidden from the LLM
```

## Selective vs full exposure

Drop the inner `tools:` array to expose every tool the server advertises:

```yaml
tools:
  - kind: mcp
    server: https://search-api.example.com/mcp     # all tools
```

List specific tools to expose only those:

```yaml
tools:
  - kind: mcp
    server: https://search-api.example.com/mcp
    tools: [search, fetch_url]
```

## Wrapping a tool

Use the long form (`tool:`) to rename a tool, override its description, or pin parameters:

```yaml
tools:
  - kind: mcp
    server: https://mcp.example.com/sse
    tools:
      - tool: search
        name: search_current_user        # optional rename
        description: "Search the current user's records"
        params:
          user_id: "{{userId}}"           # state-templated, hidden from the LLM
```

This is how you ground tools in per-invocation context (user id, tenant id, scopes) **without** trusting the model to pass them correctly. The pinned params are injected at execution and hidden from the LLM's schema.
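
The mechanics can be sketched in two steps: strip pinned parameters from the schema the model sees, then merge the rendered pinned values over whatever the model sent. The helpers below are hypothetical and assume a JSON-Schema-style tool schema and flat `{{name}}` placeholders.

```python
import re

_TEMPLATE = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(value, state):
    """Fill {{name}} placeholders in a string from run state."""
    return _TEMPLATE.sub(lambda m: str(state[m.group(1)]), value)


def visible_schema(schema, pinned):
    """Copy a tool schema without the pinned parameters."""
    props = {k: v for k, v in schema.get("properties", {}).items() if k not in pinned}
    required = [k for k in schema.get("required", []) if k not in pinned]
    return {**schema, "properties": props, "required": required}


def final_args(llm_args, pinned, state):
    """Merge model-supplied args with pinned values; pinned values always win."""
    rendered = {k: render(v, state) for k, v in pinned.items()}
    return {**llm_args, **rendered}
```

Even if the model invents a `user_id` argument, the pinned value overwrites it at execution time.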

## Auth

MCP servers handle auth at the protocol level. The runner forwards `headers:` (templated like HTTP tools) on the underlying connection:

```yaml
tools:
  - kind: mcp
    server: https://api.example.com/mcp
    headers:
      Authorization: "Bearer {{secrets.MCP_TOKEN}}"
```

For SSE-based MCP servers, headers are sent on the long-lived connection; for HTTP-streaming servers they're sent on each request.

## Connection lifecycle

In **embedded** mode, the runner connects lazily on first tool call and reuses the connection for the process lifetime. No connection store required.

In **hosted** mode, connections are pooled per workspace and recycled on idle. `{{env.X}}` is opt-in per server (because multi-tenant workers don't share an environment with your code) — prefer `{{secrets.X}}` for credentials.

## Failures

If the MCP server is down at runtime, the tool call fails — captured in the trace as a `tool.execute` span with status `error`. The model sees a sanitized error message and can decide whether to retry or give up.

If the server is down at **load time**, embedded mode logs a warning and continues; the failure surfaces on first call. This is intentional — agents that depend on remote services shouldn't refuse to boot just because a downstream is briefly unavailable.


<!-- source: /docs/tools/memory-memrez -->

# Memory with memrez

memrez is the durable memory resource for agntz agents. It stores tagged facts, preferences, events, and summaries under namespace scopes, then exposes memory to LLM agents through the generic `resources:` system.

It is not session history. It is not the legacy `contextIds` scratchpad. memrez is long-lived resource state guarded by runtime `context` namespace grants.

## Install

```bash {group=memrez-install select=ts}
pnpm add @agntz/memrez
```

```bash {group=memrez-install select=python}
pip install "agntz[litellm]"
```

The TypeScript package is published as `@agntz/memrez`. The Python package exports matching core storage and provider primitives from `agntz.memrez`, `agntz.memrez_llm_reasoner`, `agntz.memrez_sqlite`, `agntz.memrez_postgres`, and `agntz.memrez_provider`.

## Declare a memory resource

```yaml [agents/support-with-memory.yaml]
id: support-with-memory
name: Support with Memory
kind: llm
model:
  provider: openai
  name: gpt-5.4
instruction: |
  Help the user. Use memory when it is relevant, and write only stable facts or preferences.
resources:
  memory:
    mode: read-write
    autoScan: true
```

When this agent runs, the memory provider can add visible topic summaries to the prompt and expose tools named `memory_read` and `memory_write`.

Use `mode: read` when an agent may read memory but must not write it. In read mode, the write tool is not registered.

## Wire the provider

```ts [index.ts] {group=memrez-provider}
import { agntz } from "@agntz/sdk";
import { createMemrez, SqliteMemoryStore } from "@agntz/memrez";

const memrez = createMemrez({
  store: new SqliteMemoryStore("./memory.db"),
});

const client = await agntz({
  agents: "./agents",
  resources: { memory: memrez.provider() },
});
```

```python [main.py] {group=memrez-provider}
from agntz import LiteLLMModelProvider, agntz
from agntz.memrez import create_memrez
from agntz.memrez_sqlite import SqliteMemoryStore

memrez = create_memrez(store=SqliteMemoryStore("./memory.db"))

client = agntz(
    agents="./agents",
    resources={"memory": memrez.provider()},
    model_provider=LiteLLMModelProvider(),
)
```

The key in `resources: { memory: ... }` is the provider kind. It must match the manifest resource kind. If the manifest omits `kind`, the resource name is used as the kind.

## Run with namespace grants

Pass `context` from trusted application code. Do not ask the model to pick a namespace.

```ts {group=memrez-run}
await client.agents.run({
  agentId: "support-with-memory",
  input: "Remember that I prefer metric units.",
  context: ["app/user/" + userId],
});
```

```python {group=memrez-run}
client.agents.run(
    agent_id="support-with-memory",
    input="Remember that I prefer metric units.",
    context=[f"app/user/{user_id}"],
)
```

The memory tools receive the normalized grant list. A write can only land inside a writable scope allowed by those grants and the memory write policy.

## Read and write directly

You can also use memrez outside an agent, which is useful for tests, backfills, and admin jobs.

```ts {group=memrez-direct}
const grants = ["app/user/" + userId];

await memrez.write(grants, "Prefers metric units.", {
  topicsHint: ["preferences"],
});

const entries = await memrez.read(grants, "preferences", { limit: 10 });
```

```python {group=memrez-direct}
grants = [f"app/user/{user_id}"]

memrez.write(
    grants,
    "Prefers metric units.",
    topics_hint=["preferences"],
)

entries = memrez.read(grants, "preferences", limit=10)
```

Direct calls use the same grant validation as resource tool calls.

## Storage options

| Store | TypeScript | Python | Use case |
|---|---|---|---|
| In-memory | `InMemoryMemoryStore` | default `create_memrez()` store | Tests and demos |
| SQLite | `SqliteMemoryStore` | `SqliteMemoryStore` | Local apps and single-node deployments |
| Postgres | `PostgresMemoryStore` | `PostgresMemoryStore` | Multi-process and hosted deployments |

```ts {group=memrez-store}
import { createMemrez, PostgresMemoryStore } from "@agntz/memrez";

const memrez = createMemrez({
  store: new PostgresMemoryStore(process.env.DATABASE_URL!),
});
```

```python {group=memrez-store}
import os
from agntz.memrez import create_memrez
from agntz.memrez_postgres import PostgresMemoryStore

memrez = create_memrez(
    store=PostgresMemoryStore(os.environ["DATABASE_URL"]),
)
```

## Auto-scan

`autoScan: true` lets the provider inject a small list of visible memory topics before the model starts tool calling.

```text
## Resource: memory
Memory topics visible to this run:
- preferences (3) - durable user preferences
- billing (1)
```

Set `autoScan: false` when you want the model to discover memory only through explicit `memory_read` calls.
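
For illustration, the injected block shown above could be produced by a formatter like the one below. `render_topic_summary` and the topic dictionary shape (`name`, `count`, optional `description`) are assumptions made for this sketch, not the provider's actual data model.

```python
def render_topic_summary(topics, resource_name="memory"):
    """Render visible memory topics as the prompt block shown above."""
    lines = [f"## Resource: {resource_name}", "Memory topics visible to this run:"]
    for topic in topics:
        line = f"- {topic['name']} ({topic['count']})"
        if topic.get("description"):
            # The description is optional and appended after the count.
            line += f" - {topic['description']}"
        lines.append(line)
    return "\n".join(lines)
```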

## Preload

Agent resource config controls what memory is inlined into the run context. Topic taxonomy and reasoner policy belong to memrez-level configuration, not the agent manifest.

```yaml
resources:
  memory:
    kind: memory
    mode: read-write

    preload:
      core: true
      topics: [goals, equipment]
      limit: 30
      maxChars: 10000
      types: [fact, preference, summary]
```

Omit `preload` when you only want topic summaries and explicit `memory_read` calls. Use `preload.core: true` to include the memrez core topic, and `preload.topics` for additional topic slices. `preload.limit` caps entries, `preload.maxChars` caps rendered context, and `preload.types` filters entry types.
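
One way to picture how the filters and caps combine is sketched below. The entry shape (`topics`, `type`, `content`) and the choice to stop at the first entry that would exceed `max_chars` are assumptions for this example; memrez may select and trim differently.

```python
def select_preload(entries, topics, limit=30, max_chars=10000, types=None):
    """Pick entries to inline: filter by topic and type, then apply both caps."""
    wanted = set(topics)
    picked, used = [], 0
    for entry in entries:
        if not wanted & set(entry["topics"]):
            continue  # not in a requested topic slice
        if types is not None and entry["type"] not in types:
            continue  # entry type filtered out
        if len(picked) >= limit or used + len(entry["content"]) > max_chars:
            break     # a cap was reached; stop adding entries
        picked.append(entry)
        used += len(entry["content"])
    return picked
```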

The shorthands `preload: true`, `preload: all`, and `preload: [goals, equipment]` are still supported. Prefer the object form for new agents.

Do not set `resources.memory.topics` in an agent manifest. The memrez resource provider rejects agent-level topic taxonomy config; agents should choose preload slices, while memrez owns tagging policy.

## Write policy

By default, memrez writes to the current grant or one of its descendants. It does not promote writes to ancestors unless configured.

```yaml
resources:
  memory:
    mode: read-write
    writePolicy:
      descendants: true
      ancestorPromotion: none
```

Use ancestor promotion only for trusted agents that are designed to curate shared memory. Normal user-facing agents should receive narrow grants such as `app/user/u_123`.
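
The default policy can be read as a path-prefix check on namespace segments. The sketch below is an illustration under that assumption; `can_write` is a hypothetical helper, and it deliberately models only `ancestorPromotion: none`.

```python
def _normalize(namespace):
    return namespace.strip("/")


def can_write(namespace, grants, descendants=True):
    """True if `namespace` is a grant, or (optionally) a descendant of one."""
    target = _normalize(namespace)
    for grant in grants:
        base = _normalize(grant)
        if target == base:
            return True
        # Compare whole path segments so "u_1234" is not treated as
        # a child of "u_123".
        if descendants and target.startswith(base + "/"):
            return True
    # Ancestors and unrelated namespaces are never writable in this sketch.
    return False
```

With a grant of `app/user/u_123`, a write to `app/user` is refused because it targets an ancestor.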

## Reasoning layer

memrez uses a reasoner to organize memory writes — choosing topics, entry type, normalized content, and target namespace — and to curate scopes. By default `createMemrez({ store })` and `create_memrez(store=...)` wire a built-in LLM reasoner that makes direct model calls, keyed from your provider env var (e.g. `OPENAI_API_KEY`). That is why the agent's `memory_write` tool takes content only: filing the entry is memrez's job, not the agent's.

Override the reasoner when you want different models, or no LLM at all for tests / emergency fallback:

```ts
import {
  createMemrez,
  llmReasoner,
  DeterministicReasoner,
} from "@agntz/memrez";

createMemrez({ store, reasoner: llmReasoner({ taggerModel: { provider: "anthropic", name: "claude-haiku-4-5" } }) });
createMemrez({ store, reasoner: new DeterministicReasoner() }); // no LLM (tests / kill-switch)
```

```python
from agntz.memrez import DeterministicReasoner, create_memrez
from agntz.memrez_llm_reasoner import ReasonerModelConfig, llm_reasoner

create_memrez(
    store=store,
    reasoner=llm_reasoner(
        tagger_model=ReasonerModelConfig(provider="anthropic", name="claude-haiku-4-5")
    ),
)
create_memrez(store=store, reasoner=DeterministicReasoner())  # no LLM (tests / kill-switch)
```

memrez does not run its tagger or curator through the agntz agent loop yet. Those steps are bounded structured model calls owned by memrez, which avoids circular setups where memory writes invoke agents that can themselves use memory.

The reasoner may propose a namespace, but memrez validates it before writing. The model cannot bypass the grant boundary.

## Curation

Curation is a memrez capability, not an agent manifest feature. Embedded apps can call `memrez.curate(grants)` directly or inspect dirty work with `memrez.store.listDirtyTopics()`. The Agntz worker can run periodic sweeps when `MEMREZ_CURATE_INTERVAL` is set, for example `30m` or `1h`. Leave that env var unset when you want to trigger curation from your own scheduler or admin job.

## Hosted use

Hosted workers accept the same run-time `context` field over the HTTP API and hosted clients. The deployment decides which resource providers are wired into the worker. When using a hosted memory provider, mint grants from authenticated server-side state and pass them with the run request.

## Where to go next

- **[Context and resources](/docs/concepts/context-and-resources)** - how namespace grants work.
- **[Resources schema](/docs/schema/resources)** - every `resources:` field.
- **[Hosted client](/docs/sdk-cli/client)** - passing `context` to hosted runs.


<!-- source: /docs/tools/agent-as-tool -->

# Agent-as-tool

Expose another agent as a callable tool. The parent LLM decides when to delegate, and the child agent runs as a nested span in the parent's trace.

```yaml
tools:
  - kind: agent
    agent: researcher
```

The model sees a tool with the child agent's `name` and `description`, parameters derived from its `inputSchema`, and a return type derived from its `outputSchema`.

When the model calls the tool, the child agent runs to completion — model calls, tool calls, sub-pipelines and all — and the child's output is returned to the parent model. The child's trace appears nested under the parent's, complete with its own `model.call` and `tool.execute` spans.

## When to use

- **Decomposition.** Break a complex task into specialist agents and let the orchestrator delegate.
- **Reuse.** One research agent can serve three different workflows: three pipelines, one agent.
- **Boundary.** Use a child agent as the boundary between a planning LLM (which decides what to do) and a doing LLM (which executes with a different model, instruction, or toolset).

## Versus spawnable

[Spawnable](/docs/schema/skills-spawnable-reply#spawnable) and agent-as-tool look similar but differ in two ways:

| Feature | agent-as-tool | spawnable |
|---|---|---|
| Concurrency | Sequential — model calls tool, waits | Concurrent — `spawn_agent({ id, ... })` fires off multiple in parallel |
| Granularity | One specific agent per tool | A list of allowed agents the model can pick from |

Use agent-as-tool when there's a clear "specialist" being called; use spawnable when you want fan-out (e.g. fact-check multiple claims at once).
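
The concurrency difference can be shown with a toy child agent. Nothing here calls agntz; `FakeChild`, `sequential`, and `fan_out` are stand-ins that only contrast awaiting one call at a time with launching several at once.

```python
import asyncio


class FakeChild:
    """Stand-in for a child agent; records how many runs overlap."""

    def __init__(self):
        self.active = 0
        self.peak = 0

    async def run(self, claim):
        self.active += 1
        self.peak = max(self.peak, self.active)
        await asyncio.sleep(0.01)   # pretend to do model and tool calls
        self.active -= 1
        return f"checked: {claim}"


async def sequential(child, claims):
    # Agent-as-tool style: each call finishes before the next starts.
    return [await child.run(claim) for claim in claims]


async def fan_out(child, claims):
    # Spawnable style: all calls are started together and awaited as a group.
    return await asyncio.gather(*(child.run(claim) for claim in claims))
```

Both return the same results in the same order; only the peak overlap differs.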

## Tool wrapping

For MCP and HTTP tools — not agent-as-tool — you can pin parameters from state. They're injected at execution and hidden from the LLM's schema. This is how you ground tools in per-invocation context (user id, tenant id, secrets) without trusting the model to pass them.

```yaml
tools:
  - kind: mcp
    server: https://mcp.example.com/sse
    tools:
      - tool: search
        name: search_current_user      # optional rename
        description: "Search the current user's records"
        params:
          user_id: "{{userId}}"        # state-templated, hidden
```

See [MCP tools](/docs/tools/mcp#wrapping-a-tool) and [HTTP tools](/docs/tools/http#url-placeholder-syntax) for details.


<!-- ============================================================== -->
<!-- SDK & CLI -->
<!-- ============================================================== -->

<!-- source: /docs/sdk-cli/sdk -->

# Embedded SDK

The embedded runner reads YAML manifests from disk, registers them in an in-process runtime, and runs them locally with no network hop. Use `@agntz/sdk` in TypeScript or `agntz` in Python. Both load the same agent YAML and expose the same resource shape with language-native option names.

```bash {group=sdk-install select=ts}
pnpm add @agntz/sdk
```

```bash {group=sdk-install select=python}
pip install "agntz[litellm]"
```

The TypeScript SDK requires Node 20+; the Python SDK requires Python 3.11+. Universal clients that cannot read from the local filesystem should use the hosted client instead.

## Basic usage

```ts [index.ts] {group=sdk-basic}
import { agntz, tool, z } from "@agntz/sdk";

const client = await agntz({
  agents: "./agents",
  tools: [
    tool({
      name: "add",
      description: "Add two numbers and return the sum",
      input: z.object({ a: z.number(), b: z.number() }),
      execute: async ({ a, b }) => a + b,
    }),
  ],
  onEvent: (event) => {
    if (event.type === "tool-call-start") console.log("→", event.toolCall.name);
  },
});

const { output, state } = await client.agents.run({
  agentId: "support",
  input: { message: "Hello" },
});
```

```python [main.py] {group=sdk-basic}
from pydantic import BaseModel
from agntz import LiteLLMModelProvider, agntz, tool


class AddInput(BaseModel):
    a: float
    b: float


def add(args: AddInput) -> float:
    return args.a + args.b


client = agntz(
    agents="./agents",
    tools=[
        tool(
            name="add",
            description="Add two numbers and return the sum",
            input_schema=AddInput,
            execute=add,
        )
    ],
    model_provider=LiteLLMModelProvider(),
)

result = client.agents.run(
    agent_id="support",
    input={"message": "Hello"},
)
output = result.output
state = result.state
```

## `agntz(options)`

Returns an initialized local client. Validation errors throw at startup, so misconfigured agents do not make it past process boot.

| TypeScript option | Python option | Description |
|---|---|---|
| `agents` | `agents` | Path to a directory of `.yaml` files |
| `tools` | `tools` | Local tool definitions |
| `resources` | `resources` | Resource providers keyed by kind, such as `memory` |
| `store` | `store` | Optional persistence |
| `defaultModel` | `model_provider` | Python passes a concrete provider; TypeScript can default model config |
| `onEvent` | N/A | TypeScript event hook for full local event stream |

## Runtime API

### Run an agent

```ts {group=sdk-run}
const { output, state, sessionId } = await client.agents.run({
  agentId: "summarize",
  input: { text: longArticle },
  sessionId: "user-42",
});
```

```python {group=sdk-run}
result = client.agents.run(
    agent_id="summarize",
    input={"text": long_article},
    session_id="user-42",
)
output = result.output
state = result.state
session_id = result.session_id
```

### Run with resource grants

Pass `context` when the run needs access to a resource such as memory. This is a namespace grant array, not the legacy `contextIds` scratchpad.

```ts {group=sdk-context}
const { output } = await client.agents.run({
  agentId: "support-with-memory",
  input: "What do you remember about my preferences?",
  context: ["app/user/" + userId],
});
```

```python {group=sdk-context}
result = client.agents.run(
    agent_id="support-with-memory",
    input="What do you remember about my preferences?",
    context=[f"app/user/{user_id}"],
)
output = result.output
```

Agents that declare `resources:` also need matching providers at construction time:

```ts {group=sdk-resources}
import { agntz } from "@agntz/sdk";
import { createMemrez } from "@agntz/memrez";

const memrez = createMemrez();
const client = await agntz({
  agents: "./agents",
  resources: { memory: memrez.provider() },
});
```

```python {group=sdk-resources}
from agntz import LiteLLMModelProvider, agntz
from agntz.memrez import create_memrez

memrez = create_memrez()
client = agntz(
    agents="./agents",
    resources={"memory": memrez.provider()},
    model_provider=LiteLLMModelProvider(),
)
```

See [Context and resources](/docs/concepts/context-and-resources), [Resources schema](/docs/schema/resources), and [Memory with memrez](/docs/tools/memory-memrez).

### Stream or inspect

```ts {group=sdk-stream}
for await (const event of client.agents.stream({
  agentId: "summarize",
  input: { text: longArticle },
  signal: AbortSignal.timeout(30_000),
})) {
  if (event.type === "text-delta") process.stdout.write(event.text);
  if (event.type === "complete") console.log(event.output);
}
```

```python {group=sdk-stream}
for event in client.agents.stream(
    agent_id="summarize",
    input={"text": long_article},
):
    if event.type == "complete":
        print(event.output)
```

TypeScript local streaming includes token deltas and tool-loop events. Python local streaming currently emits start and complete snapshots; use the hosted Python client for full worker SSE streaming.

### Runs and traces

```ts {group=sdk-runs}
const { rows } = await client.runs.list({ agentId, status, limit: 10 });
const run = await client.runs.get(rows[0].id);
const trace = await client.traces.get(rows[0].id);
```

```python {group=sdk-runs}
runs = client.runs.list(agent_id=agent_id, status="completed")
run = client.runs.get(runs[0].id)
trace_rows = client.traces.list(agent_id=agent_id)
trace = client.traces.get(trace_rows["rows"][0]["traceId"])
```

## Persistence

```ts {group=sdk-persistence}
import { agntz } from "@agntz/sdk";
import { sqliteStore } from "@agntz/sdk/sqlite";

const client = await agntz({
  agents: "./agents",
  store: sqliteStore("./agntz.db"),
});
```

```python {group=sdk-persistence}
from agntz import LiteLLMModelProvider, SQLiteStore, agntz

client = agntz(
    agents="./agents",
    store=SQLiteStore("./agntz.db"),
    model_provider=LiteLLMModelProvider(),
)
```

The same store backs sessions, runs, and traces. Python's SQLite store persists messages and trace spans in the same file.

## Errors

```ts {group=sdk-errors}
import { AgntzError, NotFoundError, StreamError } from "@agntz/sdk";

try {
  await client.agents.run({ agentId: "unknown", input: {} });
} catch (err) {
  if (err instanceof NotFoundError) {
    // unknown agent id
  }
}
```

```python {group=sdk-errors}
try:
    client.agents.run(agent_id="unknown", input={})
except RuntimeError as exc:
    print(exc)
```

The hosted clients expose structured HTTP error classes. Local embedded execution raises Python or TypeScript runtime errors directly.

## Switching to hosted

When you're ready to graduate, swap constructors and keep the same resource shape:

```diff {group=sdk-hosted}
- import { agntz } from "@agntz/sdk";
+ import { AgntzClient } from "@agntz/client";

- const client = await agntz({ agents: "./agents" });
+ const client = new AgntzClient({
+   apiKey: process.env.AGNTZ_API_KEY!,
+   baseUrl: "https://api.agntz.co",
+ });
```

```python {group=sdk-hosted}
import os
from agntz import AgntzClient

client = AgntzClient(
    api_key=os.environ["AGNTZ_API_KEY"],
    base_url="https://api.agntz.co",
)
```

`agents.run`, `runs.list`, and `traces.get` stay the same. Local tools must be promoted to HTTP or MCP servers when the runtime moves out of your process.


<!-- source: /docs/sdk-cli/client -->

# Hosted client

The hosted client calls agents on `agntz.co` or your self-hosted worker over HTTPS. TypeScript uses `@agntz/client`; Python uses `agntz.AgntzClient` or `agntz.AsyncAgntzClient`. Both talk to the same worker API.

```bash {group=client-install select=ts}
pnpm add @agntz/client
```

```bash {group=client-install select=python}
pip install agntz
```

The hosted client has the same resource shape as the embedded SDK, so code is portable between local and hosted modes once your local tools are HTTP or MCP tools.

## Basic usage

```ts [index.ts] {group=client-basic}
import { AgntzClient } from "@agntz/client";

const client = new AgntzClient({
  apiKey: process.env.AGNTZ_API_KEY!,    // ar_live_...
  baseUrl: "https://api.agntz.co",       // or your self-hosted worker URL
});

const { output, state } = await client.agents.run({
  agentId: "support-agent",
  input: { message: email.body, customerId: email.from },
});
```

```python [main.py] {group=client-basic}
import os
from agntz import AgntzClient

client = AgntzClient(
    api_key=os.environ["AGNTZ_API_KEY"],
    base_url="https://api.agntz.co",
)

result = client.agents.run(
    agent_id="support-agent",
    # `from` is a reserved word in Python, so the attribute is spelled `from_` here.
    input={"message": email.body, "customerId": email.from_},
)
output = result.output
state = result.state
```

## Async usage

```ts {group=client-async}
for await (const event of client.agents.stream({
  agentId: "support-agent",
  input: { message: "Hello" },
})) {
  if (event.type === "complete") console.log("output", event.output);
  if (event.type === "error") console.error(event.error);
}
```

```python {group=client-async}
import os
from agntz import AsyncAgntzClient

async with AsyncAgntzClient(
    api_key=os.environ["AGNTZ_API_KEY"],
    base_url="https://api.agntz.co",
) as client:
    async for event in client.agents.stream(
        agent_id="support-agent",
        input={"message": "Hello"},
    ):
        if event.type == "complete":
            print("output", event.output)
        if event.type == "error":
            print("error", event.error)
```

## Constructor options

```ts {group=client-constructor}
new AgntzClient({
  apiKey: "ar_live_...",
  baseUrl: "https://api.agntz.co",
});
```

```python {group=client-constructor}
AgntzClient(
    api_key="ar_live_...",
    base_url="https://api.agntz.co",
)
```

## API surface

### `client.agents.run(...)`

Run an agent to completion. Returns `{ output, state, sessionId, replies }` in TypeScript and the same fields as Python attributes such as `result.session_id`.

### `client.agents.stream(...)`

Streams SSE events. Always yields a terminal `complete` or `error` event.

### Runtime context grants

Pass `context` when a hosted run needs access to a resource such as memory. These are namespace grants minted by trusted server-side code; the model never receives a namespace parameter.

```ts {group=client-context}
const result = await client.agents.run({
  agentId: "support-with-memory",
  input: "What do you remember about me?",
  sessionId: "user-42",
  context: ["app/user/u_123"],
});
```

```python {group=client-context}
result = client.agents.run(
    agent_id="support-with-memory",
    input="What do you remember about me?",
    session_id="user-42",
    context=["app/user/u_123"],
)
```

The worker must be configured with matching resource providers. See [Context and resources](/docs/concepts/context-and-resources) and [Memory with memrez](/docs/tools/memory-memrez).

### `client.runs.*`

```ts {group=client-runs}
const run = await client.runs.start({ agentId, input: { /* ... */ } });
const fresh = await client.runs.get(run.id);
await client.runs.cancel(run.id);

const { rows, nextCursor } = await client.runs.list({
  agentId,
  status,
  limit,
});
```

```python {group=client-runs}
run = client.runs.start(agent_id=agent_id, input={})
fresh = client.runs.get(run.id)
client.runs.cancel(run.id)

rows = client.runs.list(
    agent_id=agent_id,
    status="completed",
    limit=20,
)
```

### `client.traces.*`

```ts {group=client-traces}
const trace = await client.traces.get(runId);
const list = await client.traces.list({ status: "error" });
await client.traces.delete(traceId);
```

```python {group=client-traces}
trace = client.traces.get(run_id)
traces = client.traces.list(status="error")
client.traces.delete(trace_id)
```

## Sessions

Pass the same session id across calls to continue a conversation. The hosted runtime auto-loads and appends history.

```ts {group=client-sessions}
await client.agents.run({ agentId: "support", input: "Hi", sessionId: "user-42" });
await client.agents.run({ agentId: "support", input: "follow-up", sessionId: "user-42" });
```

```python {group=client-sessions}
client.agents.run(agent_id="support", input="Hi", session_id="user-42")
client.agents.run(agent_id="support", input="follow-up", session_id="user-42")
```

Sessions are managed automatically and scoped to your user. See [Sessions](/docs/concepts/sessions).

## Errors

```ts {group=client-errors}
import { AuthenticationError, NotFoundError, RateLimitError } from "@agntz/client";

try {
  await client.agents.run({ agentId: "unknown", input: {} });
} catch (err) {
  if (err instanceof NotFoundError) {
    // 404 — unknown agent id
  }
  if (err instanceof RateLimitError) {
    // 429 — back off
  }
}
```

```python {group=client-errors}
from agntz import AuthenticationError, NotFoundError

try:
    client.agents.run(agent_id="unknown", input={})
except NotFoundError:
    # 404 — unknown agent id
    pass
except AuthenticationError:
    # 401 — invalid or revoked API key
    pass
```

## Authentication

External clients send `Authorization: Bearer ar_live_...`. Keys are issued in **Settings → API Keys** on `agntz.co` or your self-hosted UI. For browser usage, never embed an `ar_live_*` key client-side; proxy through your own backend and inject the key server-side.

## Self-host with the same client

The hosted client works against any Agntz worker — the public `api.agntz.co` or your own deployment.

```ts {group=client-self-host}
const client = new AgntzClient({
  apiKey: process.env.AGNTZ_API_KEY!,
  baseUrl: "https://agntz-worker.mycompany.com",
});
```

```python {group=client-self-host}
client = AgntzClient(
    api_key=os.environ["AGNTZ_API_KEY"],
    base_url="https://agntz-worker.mycompany.com",
)
```


<!-- source: /docs/sdk-cli/cli -->

# CLI reference

The `agntz` CLI ships inside `@agntz/sdk`. It creates YAML manifests, runs agents locally, and manages hosted runs and traces from the terminal.

```bash
# Run without installing
npx @agntz/sdk --help

# Or install globally
npm i -g @agntz/sdk
agntz --help
```

For the first local workflow, start with [CLI getting started](/docs/cli-quickstart).

## Command map

| Command | Local? | Hosted? | Auth? | Purpose |
|---|---:|---:|---:|---|
| `create` | - | ✓ | No | Generate YAML from a description through the hosted builder. |
| `run <path>` | ✓ | - | No | Run a local YAML file or single-agent directory. |
| `run <id>` | - | ✓ | Yes | Run a hosted agent by id. |
| `login` / `logout` / `whoami` | - | ✓ | Mixed | Manage hosted API credentials. |
| `runs` | - | ✓ | Yes | List, inspect, stream, or cancel hosted runs. |
| `traces` | - | ✓ | Yes | List, inspect, or delete hosted traces. |

Every command supports terminal help:

```bash
agntz create --help
agntz run --help
agntz login --help
agntz runs --help
agntz traces --help
```

## Auth and configuration

Hosted commands read credentials in this order:

1. `AGNTZ_API_KEY`
2. `~/.agntz/config.json`, written by `agntz login`

API URL resolution uses:

1. command `--url` where supported
2. `AGNTZ_API_URL`
3. saved config
4. `https://api.agntz.co`
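
Both lookup orders above amount to "first non-empty value wins". A minimal sketch follows, assuming the saved config is a dictionary; the `apiKey` and `apiUrl` field names are hypothetical, since the layout of `~/.agntz/config.json` is not described here.

```python
DEFAULT_API_URL = "https://api.agntz.co"


def resolve_api_key(env, config):
    """AGNTZ_API_KEY first, then the saved config; None if neither is set."""
    return env.get("AGNTZ_API_KEY") or config.get("apiKey")  # field name assumed


def resolve_api_url(flag_url, env, config):
    """--url flag, then AGNTZ_API_URL, then saved config, then the default."""
    return (
        flag_url
        or env.get("AGNTZ_API_URL")
        or config.get("apiUrl")  # field name assumed
        or DEFAULT_API_URL
    )
```

In a real process `env` would be `os.environ`; passing it in keeps the sketch easy to test.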

Local runs do not require an agntz API key. They use provider keys from your process environment, such as `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, or other keys required by the manifest's model/tool configuration.

## `create`

```bash
agntz create "<description>" [options]
```

Generates a YAML manifest by calling the hosted agent-builder. No login is required.

| Flag | Description |
|---|---|
| `-o, --output <path>` | Write the manifest to a specific path. Default: `./agents/<id>.yaml`. |
| `--stdout` | Print YAML to stdout instead of writing a file. |
| `--current-manifest <path>` | Revise an existing manifest instead of starting fresh. |
| `--url <apiUrl>` | Override the builder API URL for this call. |
| `-h, --help` | Show command help. |

Examples:

```bash
agntz create "Answer support questions in a concise tone" -o ./agents/support.yaml

agntz create "Add an HTTP order lookup tool" \
  --current-manifest ./agents/support.yaml \
  -o ./agents/support.yaml

agntz create "Classify inbound leads by urgency" --stdout > ./agents/lead-classifier.yaml
```

`create` validates that the builder returned YAML, parses the manifest to get its `id`, creates parent directories as needed, and prints the local `run` command to try next.

## `run`

```bash
agntz run <path-or-id> [options] [input...]
```

Runs an agent. The target determines local vs hosted mode unless you force a mode.

| Target shape | Mode |
|---|---|
| `./agents/support.yaml` | Local YAML file |
| `agents/support.yml` | Local YAML file |
| `./agents` | Local directory, only if it contains exactly one manifest |
| `support` | Hosted agent id |

| Flag | Description |
|---|---|
| `--input <text>` | Input string. Use `--input -` to read stdin. |
| `--session <id>` | Reuse a session id across calls. |
| `--stream` | Stream reply/complete/error events instead of buffering the final output. |
| `--local` | Force local execution. |
| `--remote` | Force hosted execution. |
| `-h, --help` | Show command help. |

Input resolution:

```text
--input value > trailing positional text > piped stdin > empty string
```
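
The precedence above, written out as a small Python function. This is a sketch under assumptions: joining positional words with a single space and leaving stdin text unstripped are guesses, not documented CLI behavior.

```python
import sys
from typing import Optional, Sequence, TextIO


def resolve_input(
    flag_value: Optional[str],
    positional: Sequence[str],
    stdin: TextIO = sys.stdin,
    stdin_is_piped: Optional[bool] = None,
) -> str:
    """Pick the run input: --input > positional text > piped stdin > ""."""
    if stdin_is_piped is None:
        stdin_is_piped = not stdin.isatty()
    if flag_value == "-":
        return stdin.read()  # `--input -` explicitly reads stdin
    if flag_value is not None:
        return flag_value
    if positional:
        return " ".join(positional)  # assumed: words joined by one space
    if stdin_is_piped:
        return stdin.read()
    return ""
```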

Examples:

```bash
# Local file
agntz run ./agents/support.yaml --input "How do I reset my password?"

# Local file with stdin
cat ticket.txt | agntz run ./agents/support.yaml

# Local file with persistent conversation state
agntz run ./agents/support.yaml --session user-42 --input "My email changed"

# Hosted agent id
agntz run support --input "Hello" --remote

# Stream hosted or local output
agntz run ./agents/support.yaml --input "Walk me through this" --stream
```

Local runtime boundary: `agntz run ./agents/support.yaml` constructs a local SDK client with `agntz({ agents: "<manifest-dir>" })`. It can run agents whose requirements are satisfied by YAML plus environment configuration. If the agent declares local tools or resource providers that need application code, call `@agntz/sdk` from your service and pass `tools` / `resources` there.

## `login`, `logout`, and `whoami`

```bash
agntz login --key <apiKey> [--url <apiUrl>]
agntz logout
agntz whoami
```

`login` writes credentials to `~/.agntz/config.json` with owner-only permissions. `logout` removes that file. `whoami` prints the resolved API URL and a masked key source.

Examples:

```bash
agntz login --key ar_live_...
agntz login --key ar_live_... --url https://agntz-worker.example.com
AGNTZ_API_KEY=ar_live_... agntz whoami
agntz logout
```

Browser-based login is not implemented in the current CLI. Paste an API key from the hosted or self-hosted dashboard.

## `runs`

Hosted run management. Requires `AGNTZ_API_KEY` or `agntz login`. Output is JSON.

```bash
agntz runs list   [--agent <id>] [--status <s>] [--limit <n>] [--cursor <c>]
agntz runs get    <runId>
agntz runs stream <runId> [--since <seq>]
agntz runs cancel <runId>
```

Examples:

```bash
agntz runs list --agent support --limit 20
agntz runs get run_123
agntz runs stream run_123 --since 10
agntz runs cancel run_123
```

`runs stream` emits the multiplexed event stream for a hosted run subtree. `--since <seq>` resumes from a sequence number.

## `traces`

Hosted trace management. Requires `AGNTZ_API_KEY` or `agntz login`.

```bash
agntz traces list   [--agent <id>] [--status <s>] [--limit <n>] [--cursor <c>]
agntz traces get    <traceId>
agntz traces delete <traceId>
```

Examples:

```bash
agntz traces list --agent support --status failed --limit 10
agntz traces get trace_123
agntz traces delete trace_123
```

## Current CLI boundary

The current CLI command surface is intentionally small:

```text
create, run, login, logout, whoami, runs, traces
```

The current CLI does not provide project scaffolding, eval execution, validation-only execution, an interactive playground, or a Studio launcher. Use the SDK docs for in-process validation/runtime wiring, and use the hosted app for managed agent editing.

## Exit behavior

| Exit code | Meaning |
|---|---|
| `0` | Success |
| `1` | Argument, auth, network, builder, validation, or runtime error |

The CLI writes human-readable errors to stderr. For structured programmatic integration, use `@agntz/sdk` for local execution or `@agntz/client` for hosted execution.


<!-- ============================================================== -->
<!-- Deploy -->
<!-- ============================================================== -->

<!-- source: /docs/deploy/hosted-cloud -->

# Hosted cloud

The hosted edition at **agntz.co** gives you the same runtime plus a managed multi-tenant UI. Sign up, create an agent, run it — no infrastructure.

## What you get in the UI

- **Agent editor** — YAML manifest editor with live schema validation, plus AI-assisted build-from-description.
- **Playground** — per-agent interactive runner with SSE streaming, conversational sessions.
- **Sessions & logs** — browse conversation history and invocation traces with span detail.
- **Tool catalog** — list the inline / MCP tools available to your workspace.
- **Providers** — manage your LLM provider keys per workspace.
- **API keys** — generate `ar_live_*` keys for programmatic access from your apps.
- **Auth** — Clerk-backed sign-in / sign-up; every record is scoped to your `userId`.

## From UI to code in one step

Create an agent in the UI, then call it with the same client API you'd use locally; just point the client at the hosted worker:

```ts {group=hosted-cloud-call}
import { AgntzClient } from "@agntz/client";

const client = new AgntzClient({
  apiKey: process.env.AGNTZ_API_KEY!,
  baseUrl: "https://api.agntz.co",
});

const { output } = await client.agents.run({
  agentId: "support-agent",     // the id you set in the UI editor
  input: { message: "Hello" },
});
```

```python {group=hosted-cloud-call}
import os
from agntz import AgntzClient

client = AgntzClient(
    api_key=os.environ["AGNTZ_API_KEY"],
    base_url="https://api.agntz.co",
)

result = client.agents.run(
    agent_id="support-agent",     # the id you set in the UI editor
    input={"message": "Hello"},
)
```

Every UI-side change is versioned, every run is traced — same observability model as embedded.

## Versioning

Every save creates a new version of the agent. Production resolves `support-agent` to the **pinned** version; in-flight edits never reach users until you pin them. The version that produced any given trace is recorded with the trace, so you can jump from a run straight to the exact manifest that ran it.
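
As a mental model, the save/pin behavior can be sketched as a tiny in-memory record. This is a toy illustration, not the hosted store's schema; the class and method names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AgentRecord:
    agent_id: str
    versions: List[str] = field(default_factory=list)  # manifest YAML, oldest first
    pinned: Optional[int] = None  # 1-based version number served in production

    def save(self, manifest_yaml: str) -> int:
        """Every save appends a new version; the pin does not move."""
        self.versions.append(manifest_yaml)
        return len(self.versions)

    def pin(self, version: int) -> None:
        if not 1 <= version <= len(self.versions):
            raise ValueError(f"unknown version {version}")
        self.pinned = version

    def resolve(self) -> str:
        """Return the manifest production would run for this agent id."""
        if self.pinned is None:
            raise LookupError(f"{self.agent_id} has no pinned version")
        return self.versions[self.pinned - 1]
```

In this model, saving a second version after pinning the first leaves `resolve()` returning the first manifest until `pin(2)` is called.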

## Bring your own model keys

agntz never proxies model calls. The worker calls OpenAI / Anthropic / Google / Mistral directly using the keys you configure in **Settings → Providers**. Your data goes from the worker to the provider and back; we don't see prompt or completion bodies.

Set your organization's provider keys at the workspace level. For per-tool secrets (for example, an external API token used by an HTTP tool), set them in **Settings → Secrets** and reference them in YAML as `{{secrets.NAME}}`.

## API keys

Generate keys in **Settings → API Keys**. Keys are prefixed `ar_live_` and are scoped to the workspace that minted them. The worker sha256-hashes the key on receipt and resolves it to a user id — the plaintext key is never stored.

```bash
# Use it
export AGNTZ_API_KEY=ar_live_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
agntz whoami
```

Revoking a key disables it immediately; existing runs continue to completion.

## Limits

The hosted edition has fair-use limits on:

- **Concurrent runs** per workspace
- **Run duration** (default cap; configurable on paid plans)
- **API requests per minute** (rate-limited; honor the retry-after value on `RateLimitError`)

Self-host if you need higher limits or full control — see [Self-host in production](/docs/deploy/self-host-production).


<!-- source: /docs/deploy/self-host-docker -->

# Self-host with Docker

The whole stack is open source under MIT. The fastest way to get it running on your own hardware is the bundled `docker-compose.yml` — it spins up Postgres, the worker, the app, and the marketing site in one command.

## What gets deployed

| Service | Role | Port |
|---|---|---|
| `@agntz/app` | Next.js 15 web UI (Clerk auth + organizations, agent editor, playground) | 3000 |
| `@agntz/worker` | Hono HTTP worker — executes agents, exposes `/run` and `/run/stream` | 4001 |
| Postgres | Backing store for sessions, runs, traces, agents | 5432 |
| `@agntz/site` | Marketing site (optional) | 3001 |

## One-command bootstrap

```bash
git clone https://github.com/aparry3/agntz
cd agntz
cp .env.example .env.local
# fill in CLERK_*, WORKER_INTERNAL_SECRET, OPENAI_API_KEY
docker compose up
```

UI at `http://localhost:3000`, worker at `http://localhost:4001`.

## Required env vars

The `.env.example` lists every variable. The ones you are most likely to set are below; rows marked Optional can be left unset:

| Variable | Where used | Notes |
|---|---|---|
| `NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY` | app | From Clerk Dashboard → API Keys. Enable Organizations for shared workspaces. |
| `CLERK_SECRET_KEY` | app | Same source |
| `WORKER_INTERNAL_SECRET` | app + worker | Must be identical on both. Generate with `openssl rand -base64 32`. |
| `DATABASE_URL` | app + worker | Defaults to the compose-provided Postgres. |
| `OPENAI_API_KEY` (or any provider key) | worker | At least one provider key for default models. |
| `DEFAULT_MODEL_PROVIDER`, `DEFAULT_MODEL_NAME` | worker | Fallback when an agent omits `model:`. |
| `MEMREZ_STORE` | worker | Optional. Defaults to the value of `STORE`; set `postgres` to force Postgres-backed memory. |
| `MEMREZ_REASONER` | worker | Optional. `llm` by default; `deterministic` is the emergency no-LLM fallback. |
| `MEMREZ_CURATE_INTERVAL` | worker | Optional. Enables periodic memory curation, e.g. `30m` or `1h`. |

When `STORE=postgres`, the worker wires the memrez memory resource provider by
default. Agents that declare `resources.memory` can use `memory_read` and
`memory_write`; curation runs only when `MEMREZ_CURATE_INTERVAL` is set or
when you call the curation endpoint manually.
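
The `WORKER_INTERNAL_SECRET` row in the table above suggests `openssl rand -base64 32`. If `openssl` is not at hand, the following Python sketch produces a value of the same shape (32 random bytes, base64-encoded). The `secrets_match` helper shows a constant-time comparison as general good practice; how the worker actually compares the value is not described here.

```python
import base64
import hmac
import secrets


def generate_internal_secret(num_bytes: int = 32) -> str:
    """Return num_bytes of cryptographic randomness as a base64 string."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


def secrets_match(expected: str, presented: str) -> bool:
    """Compare two secrets without leaking timing information."""
    return hmac.compare_digest(expected.encode("utf-8"), presented.encode("utf-8"))


if __name__ == "__main__":
    # Paste the printed value into both the app and worker environments.
    print(generate_internal_secret())
```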

## First-run flow

1. Open `http://localhost:3000`. Clerk shows sign-in / sign-up.
2. Sign up, then optionally create or switch to an organization from the sidebar.
   Records are scoped to the active organization; personal workspaces fall back to your Clerk user id.
3. Hit **Create agent**, paste a description or write YAML directly, save.
4. Click **Playground**, run the agent, watch the trace.
5. Generate an API key in **Settings → API Keys**, then call your local worker from code:

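```ts
import { AgntzClient } from "@agntz/client";

// Point the client at the local worker instead of api.agntz.co
const client = new AgntzClient({
  apiKey: "ar_live_...",
  baseUrl: "http://localhost:4001",
});
```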

## Logs & data

- App logs: `docker compose logs -f app`
- Worker logs: `docker compose logs -f worker`
- Postgres data: the `db_data` named volume — `docker volume inspect agntz_db_data` to find it on disk.

## Resetting

To wipe local state and start fresh:

```bash
docker compose down -v       # -v removes the Postgres volume
docker compose up
```

## Production?

Compose is great for local dev and small internal deployments, but for a public deployment we recommend the split deploy on Vercel + Railway — see [Self-host in production](/docs/deploy/self-host-production).


<!-- source: /docs/deploy/self-host-production -->

# Self-host in production

Recommended split for a production self-hosted deployment: Next.js apps on **Vercel**, worker + Postgres on **Railway**.

The deployable surface is three packages:

| Package | Role | Where it goes |
|---|---|---|
| `@agntz/app` | Next.js 15 web UI (Clerk auth, agent editor, playground) | Vercel |
| `@agntz/worker` | Hono HTTP worker — executes agents | Railway |
| `@agntz/store-postgres` | Postgres store adapter — user-scoped tables | (used by worker + app) |

## 1. Provision Postgres on Railway

```
Railway → New Project → Add Service → Database → PostgreSQL
```

Copy the private `DATABASE_URL` and the public TCP proxy URL from the Variables tab. The worker uses the private `DATABASE_URL`; the Vercel app uses the public TCP proxy URL as its `DATABASE_URL` because it is outside Railway's private network. Schema is initialized on worker boot — no manual migration step.

## 2. Deploy the worker on Railway

Same Railway project → **Add Service** → **GitHub Repo** → select your fork.

- **Root directory:** `/`
- **Build:** Dockerfile, target stage `worker`
- **Port:** `4001`
- **Env vars:**
  - `STORE=postgres`
  - `DATABASE_URL=${{Postgres.DATABASE_URL}}`
  - `PORT=4001`
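  - `WORKER_INTERNAL_SECRET=<random secret>` (generate one with `openssl rand -base64 32` and paste the resulting value; a literal `$(...)` typed into a variable field is not shell-expanded)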
  - `DEFAULT_MODEL_PROVIDER=openai`
  - `DEFAULT_MODEL_NAME=gpt-5.4`
  - `OPENAI_API_KEY=sk-...`
  - `MEMREZ_STORE=postgres`
  - `MEMREZ_REASONER=llm`
  - `MEMREZ_CURATE_INTERVAL=30m` (optional)
  - (any other provider keys you'll use)

Generate a public domain in **Settings → Networking**; you'll need it for the app.

With `STORE=postgres`, the worker can wire the memrez memory provider against
Postgres. `MEMREZ_CURATE_INTERVAL` enables the worker's periodic dirty-topic
curation sweep; omit it if you prefer to call the memory curation endpoint or
library primitives from your own scheduler.

## 3. Set up Clerk

Sign up at clerk.com, create an application, copy the **Publishable** and **Secret** keys from the API Keys page. Enable **Organizations** for hosted Cloud-style workspaces, role-based access, and enterprise SSO.

## 4. Deploy the app on Vercel

```
Vercel → New Project → Import your repo
- Root directory: packages/app
- Framework preset: Next.js
```

Env vars:

```
NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY=pk_...
CLERK_SECRET_KEY=sk_...
NEXT_PUBLIC_CLERK_SIGN_IN_URL=/sign-in
NEXT_PUBLIC_CLERK_SIGN_UP_URL=/sign-up
NEXT_PUBLIC_CLERK_SIGN_IN_FALLBACK_REDIRECT_URL=/agents
NEXT_PUBLIC_CLERK_SIGN_UP_FALLBACK_REDIRECT_URL=/agents
WORKER_URL=https://<your-worker>.up.railway.app
WORKER_INTERNAL_SECRET=...           # MUST match the worker
STORE=postgres
DATABASE_URL=...                     # Railway public TCP proxy URL
DEFAULT_MODEL_PROVIDER=openai
DEFAULT_MODEL_NAME=gpt-5.4
OPENAI_API_KEY=sk-...
```

`WORKER_INTERNAL_SECRET` must be identical on both sides — the app authenticates to the worker with it.

Do not set Vercel's `DATABASE_URL` to a Railway `*.railway.internal` URL. That hostname only resolves inside Railway.
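
A small preflight check can catch this mistake before deploying. The sketch below only inspects the hostname; the function name and the example URLs in the comment are made up for illustration.

```python
from urllib.parse import urlparse


def is_railway_private_url(database_url: str) -> bool:
    """True if the URL's host is on Railway's private network."""
    host = urlparse(database_url).hostname or ""
    return host == "railway.internal" or host.endswith(".railway.internal")


# Hypothetical example values:
#   postgresql://user:pw@postgres.railway.internal:5432/railway  -> private
#   postgresql://user:pw@db.example.com:5432/railway             -> not private
```

Running this against the value you are about to paste into Vercel should return `False`.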

## 5. (Optional) Deploy the marketing site on Vercel

The marketing site at `packages/site` is a separate Vercel project — no env vars required.

```
Root directory: packages/site
```

## 6. DNS

Suggested layout for a custom domain:

| Hostname | Project | Purpose |
|---|---|---|
| `yourdomain.com` | site | Marketing |
| `www.yourdomain.com` | site | Marketing (alias) |
| `app.yourdomain.com` | app | Product UI |

In your registrar, add the records Vercel lists (typically A `76.76.21.21` for apex, CNAME `cname.vercel-dns.com` for subdomains). Vercel auto-issues certs once DNS resolves.

In Clerk → **Domains**, add the production URL as an allowed origin. Then swap the test keys for production keys and redeploy.

## Architecture

```
 Browser ──(Clerk session + active org)──► app (Next.js) ──(signed tenant context)──► worker (Hono)
 External caller ──(Bearer ar_live_...)─────────────────────────────────────► worker
                                                                                  │
                                                                                  ▼
                                                                        Postgres (tenant-owner scoped)
```

The worker accepts two auth modes:

- **Internal** — `X-Internal-Secret` header + `X-Agntz-Internal-Auth` signed tenant context. Used by the app on behalf of signed-in workspaces.
- **External** — `Authorization: Bearer ar_live_<token>` from a key generated in **Settings → API Keys**. The worker sha256-hashes the key and resolves it to a workspace owner key.

Every store row is scoped to the active Clerk organization id, falling back to the Clerk `userId` for personal workspaces. The app never sees another workspace's data.

## Operating the deployment

- **Logs.** Railway streams worker logs in its UI; Vercel does the same for the app. Wire both into your observability stack if you have one.
- **Scaling.** The worker is stateless — scale it horizontally by raising Railway's replica count. The app is similarly stateless on Vercel.
- **Database.** A managed Postgres with daily backups is sufficient for most teams. Run migrations via worker boot only — we don't ship a separate migration runner.
- **Updating.** Push to your fork → Railway and Vercel auto-deploy. Pin the worker image tag if you want manual control over rollouts.

## See also

- **[HTTP API reference](/docs/deploy/http-api)** — endpoints the worker exposes.
- **[Hosted cloud](/docs/deploy/hosted-cloud)** — managed alternative.


<!-- source: /docs/deploy/http-api -->

# HTTP API reference

The worker exposes a small HTTP surface. The SDK (`@agntz/client`) wraps it; you can also call it directly.

## Endpoints

| Method | Path | Auth | Description |
|---|---|---|---|
| `GET` | `/health` | none | Liveness probe |
| `POST` | `/run` | required | Execute an agent, return final output + state |
| `POST` | `/run/stream` | required | Same, as Server-Sent Events |
| `POST` | `/runs` | required | Start a run, return its handle immediately |
| `GET` | `/runs/:id` | required | Fetch current state of a run |
| `POST` | `/runs/:id/cancel` | required | Cancel a run and cascade to descendants |
| `GET` | `/runs` | required | List runs (filters: `agentId`, `status`, time range) |
| `GET` | `/runs/:id/stream` | required | Multiplexed event stream for a run subtree |
| `GET` | `/traces` | required | List traces |
| `GET` | `/traces/:id` | required | Trace detail with spans |
| `GET` | `/traces/:id/stream` | required | Live trace events while running |
| `DELETE` | `/traces/:id` | required | Delete a trace |
| `POST` | `/build-agent` | none | Public agent-builder endpoint used by `agntz create` |

## Authentication

The worker accepts two auth modes:

### External — Bearer token

```
Authorization: Bearer ar_live_<token>
```

The worker sha256-hashes the key on receipt, looks it up in the API keys table, and resolves the request to a user id. This is what `@agntz/client` sends.
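
The hash-then-lookup flow described above could look roughly like this. It is a sketch, not the worker's code: the hex digest encoding and the dictionary standing in for the API keys table are assumptions.

```python
import hashlib
from typing import Dict, Optional

KEY_PREFIX = "ar_live_"


def hash_api_key(key: str) -> str:
    """SHA-256 of the key; only this digest is stored, never the plaintext."""
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def resolve_user_id(
    authorization: Optional[str], keys_by_hash: Dict[str, str]
) -> Optional[str]:
    """Map an Authorization header to a user id, or None if it does not match."""
    if not authorization:
        return None
    scheme, _, token = authorization.partition(" ")
    token = token.strip()
    if scheme.lower() != "bearer" or not token.startswith(KEY_PREFIX):
        return None
    return keys_by_hash.get(hash_api_key(token))
```

Because the table is keyed by digest, a leaked copy of it does not reveal usable keys.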

### Internal — shared secret + userId

```
X-Internal-Secret: <WORKER_INTERNAL_SECRET>
```

Used by the app calling the worker on behalf of a signed-in user. The body must include `userId` (the Clerk user id):

```json
{
  "userId": "user_abc...",
  "agentId": "my-agent",
  "input": { "message": "Hello" }
}
```

Don't expose this secret to clients — it's app-to-worker only.

## Request shape

```json
{
  "userId": "user_abc...",
  "agentId": "my-agent",
  "input": { "message": "Hello" },
  "sessionId": "optional-session-id",
  "context": ["app/user/u_123"]
}
```

`userId` is required with internal auth and ignored with Bearer auth.

`input` accepts either a plain string (when the agent has no `inputSchema`) or an object matching the agent's schema.

`context` is optional. When present, it is a namespace grant array passed to resource providers such as memory. Mint it from trusted server-side state, such as the authenticated user or workspace. Do not ask the model or a browser client to choose grants.

`/run`, `/run/stream`, and `/runs` all accept the same `agentId`, `input`, `sessionId`, and `context` fields. `/runs` also accepts webhook fields such as `callbackUrl` and `webhookSecretName`.
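
For callers that skip the SDK, a request with these fields can be assembled with the Python standard library. This is a minimal sketch: the helper name is hypothetical, and sending `Content-Type: application/json` is an assumption about what the worker expects.

```python
import json
import urllib.request
from typing import Any, List, Optional


def build_run_request(
    base_url: str,
    api_key: str,
    agent_id: str,
    agent_input: Any,
    session_id: Optional[str] = None,
    context: Optional[List[str]] = None,
    path: str = "/run",
) -> urllib.request.Request:
    """Build (but do not send) a POST to /run, /run/stream, or /runs."""
    body = {"agentId": agent_id, "input": agent_input}
    if session_id is not None:
        body["sessionId"] = session_id
    if context is not None:
        # Grants should come from trusted server-side state, never from the client.
        body["context"] = context
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
```

Pass the result to `urllib.request.urlopen(...)` to send it; `userId` is omitted because it is ignored with Bearer auth.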

## Stream format (SSE)

`/run/stream`, `/runs/:id/stream`, and `/traces/:id/stream` emit Server-Sent Events.

```
event: stream
data: {"type": "text-delta", "text": "Hello"}

event: stream
data: {"type": "complete", "output": "Hello, world!", "state": {...}}
```

Reconnect with the `Last-Event-ID` header (or `?since=<seq>` for the multiplexed run stream) to resume from where you left off. Servers may send `:keepalive` comments every 15s to defeat proxy idle timeouts.
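
If you consume the stream without the SDK, a line-oriented parser along these lines is enough for the format shown above. It is a simplified sketch of Server-Sent Events parsing: it handles `event`, `id`, and `data` fields, skips comment lines such as `:keepalive`, and (more leniently than the SSE specification) also emits a trailing event that is not terminated by a blank line.

```python
import json
from typing import Dict, Iterable, Iterator, List


def iter_sse_events(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Group SSE lines into events; each event is a dict with a "data" key."""
    fields: Dict[str, str] = {}
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if fields or data:
                yield {**fields, "data": "\n".join(data)}
            fields, data = {}, []
            continue
        if line.startswith(":"):
            continue  # comment line, e.g. ":keepalive"
        name, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]
        if name == "data":
            data.append(value)
        elif name in ("event", "id"):
            fields[name] = value
    if fields or data:
        yield {**fields, "data": "\n".join(data)}


def payload(event: Dict[str, str]) -> dict:
    """Decode the JSON body carried in an event's data field."""
    return json.loads(event["data"])
```

Feeding it the decoded lines of an HTTP response body yields one dict per event; `payload(event)["type"]` then distinguishes `text-delta` from `complete`.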

## System agents

Invoke a system agent — bundled with the worker, not user-defined — by prefixing the id with `system:`:

```json
{ "agentId": "system:agent-builder", "input": { "description": "..." } }
```

The default `agent-builder` powers the UI's "Create from description" feature and the CLI's `agntz create` command. System agents bypass the user's store and run with ephemeral in-memory state.

## Public endpoints

Two endpoints are intentionally unauthenticated:

- `GET /health` — for load balancers and uptime checks.
- `POST /build-agent` — the public agent-builder, called by `agntz create` (no login). Rate-limited by IP.

Everything else requires an API key or the internal secret.

## Errors

The worker returns JSON error bodies with a stable `code`:

```json
{
  "error": {
    "code": "AGENT_NOT_FOUND",
    "message": "No agent with id 'unknown' in workspace ws_xxx",
    "status": 404
  }
}
```

| HTTP status | Common codes |
|---|---|
| 400 | `INVALID_INPUT`, `SCHEMA_VALIDATION` |
| 401 | `AUTH_MISSING`, `AUTH_INVALID` |
| 404 | `AGENT_NOT_FOUND`, `RUN_NOT_FOUND` |
| 409 | `RUN_CANCELLED` |
| 429 | `RATE_LIMITED` (includes `Retry-After` header) |
| 500 | `INTERNAL` |

The SDK maps these to typed errors (`AuthenticationError`, `NotFoundError`, `RateLimitError`, ...). See [@agntz/client → Errors](/docs/sdk-cli/client#errors).
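
When calling the worker directly, the error envelope and the `Retry-After` header can be handled with a couple of helpers. This is a sketch under assumptions: retrying only on 429, the one-second fallback, and an exact-case `Retry-After` key in the headers mapping are illustrative choices, not documented behavior.

```python
import json
from typing import Mapping, Optional, Tuple


def parse_error(body: str) -> Tuple[Optional[str], str]:
    """Extract (code, message) from a worker error body."""
    err = json.loads(body).get("error", {})
    return err.get("code"), err.get("message", "")


def retry_delay_seconds(
    status: int, headers: Mapping[str, str], default: float = 1.0
) -> Optional[float]:
    """Seconds to wait before retrying, or None if the request should not be retried."""
    if status != 429:
        return None
    try:
        return max(0.0, float(headers.get("Retry-After", "")))
    except ValueError:
        # Missing header or an HTTP-date value: fall back to a fixed delay.
        return default
```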


<!-- ============================================================== -->
<!-- Reference -->
<!-- ============================================================== -->

<!-- source: /docs/models -->

# Models & providers

agntz calls model providers directly with your API key: there's no proxy and no data routing through our servers. You configure a provider by exporting its API key as an environment variable (embedded mode) or saving it in **Settings → Providers** (hosted / self-hosted).

## Supported providers

| Provider | Env var | Provider id |
|---|---|---|
| OpenAI | `OPENAI_API_KEY` | `openai` |
| Anthropic | `ANTHROPIC_API_KEY` | `anthropic` |
| Google | `GOOGLE_GENERATIVE_AI_API_KEY` | `google` |
| **OpenRouter** | `OPENROUTER_API_KEY` | `openrouter` |
| Mistral | `MISTRAL_API_KEY` | `mistral` |
| xAI | `XAI_API_KEY` | `xai` |
| Groq | `GROQ_API_KEY` | `groq` |
| DeepSeek | `DEEPSEEK_API_KEY` | `deepseek` |
| Perplexity | `PERPLEXITY_API_KEY` | `perplexity` |
| Cohere | `COHERE_API_KEY` | `cohere` |
| Azure OpenAI | `AZURE_OPENAI_API_KEY` | `azure` |

## Picking a model in a manifest

```yaml
model:
  provider: anthropic
  name: claude-sonnet-4-6
  temperature: 0
```

`provider` is the id from the table above; `name` is the exact model id the provider expects (e.g. `gpt-5.4-mini`, `claude-sonnet-4-6`, `gemini-3-pro`).

## OpenRouter — one key, hundreds of models

[OpenRouter](https://openrouter.ai) is a meta-provider that proxies to virtually every commercial and open-source model behind a single API key. Use it when you want to:

- Access **open-source models** (Llama, Mistral, DeepSeek, Qwen, …) without standing up your own inference.
- Try many models without juggling per-provider API keys.
- Take advantage of OpenRouter's routing, fallbacks, and unified billing.

Set the key and reference any OpenRouter model by its slug (`<author>/<model>`):

```bash
export OPENROUTER_API_KEY=sk-or-...
```

```yaml
model:
  provider: openrouter
  name: anthropic/claude-sonnet-4
```

```yaml
model:
  provider: openrouter
  name: meta-llama/llama-3.3-70b-instruct
```

```yaml
model:
  provider: openrouter
  name: deepseek/deepseek-chat
```

Free-tier models are available via the `:free` suffix (subject to OpenRouter's rate limits):

```yaml
model:
  provider: openrouter
  name: meta-llama/llama-3.3-70b-instruct:free
```

OpenRouter reports the per-request USD cost on every response, so traces in the UI show actual spend instead of an estimate.

### Attribution

By default, requests through OpenRouter are attributed to agntz via the headers `HTTP-Referer: https://agntz.co` and `X-Title: agntz` (used by OpenRouter's public rankings). To attribute them to your own app, override these in the provider's stored `config`:

```json
{ "referer": "https://your-app.com", "title": "Your App" }
```

## Other providers, custom endpoints

Every provider supports a `baseUrl` override in its stored config — useful for proxies and OpenAI-compatible gateways. For arbitrary providers not in the table above, supply a custom `modelProvider` implementation to `createRunner`.


<!-- source: /docs/compatibility -->

# Compatibility matrix

What runs where, today. Embedded means in-process SDK execution: `@agntz/sdk` for TypeScript and `agntz` for Python. Hosted means `agntz.co` and self-hosted workers.

| Feature | TS embedded | Python embedded | Hosted worker |
|---|:---:|:---:|:---:|
| LLM agents | ✓ | ✓ | ✓ |
| Sequential / parallel / tool kinds | ✓ | ✓ | ✓ |
| Local tools | ✓ (JS/TS) | ✓ (Python) | (use MCP / HTTP instead) |
| HTTP tools | ✓ | ✓ | ✓ |
| HTTP tools — OAuth2 / token exchange | ✓ | partial | ✓ |
| MCP tools (raw URL + headers) | ✓ | ✓ (HTTP JSON-RPC) | ✓ |
| Agent-as-tool | ✓ | ✓ | ✓ |
| Runtime `context` namespace grants | ✓ | ✓ | ✓ |
| `resources:` manifest declarations | ✓ | ✓ | ✓ if provider wired |
| Generic resource provider runtime | ✓ | ✓ | self-host configurable |
| memrez memory resource provider | ✓ | ✓ | self-host configurable |
| memrez SQLite / Postgres memory stores | ✓ | ✓ | deployment-owned |
| memrez built-in LLM reasoner default | ✓ | ✓ | ✓ |
| memrez preload context policy | ✓ | ✓ | ✓ |
| Spawnable subagents | ✓ | not yet | ✓ |
| Skills (`use_skill` tool) | ✓ | not yet | ✓ |
| Reply tool (intermediate messages) | ✓ | persisted messages only | ✓ |
| Sessions | ✓ (memory or sqlite) | ✓ (memory or sqlite) | ✓ (managed) |
| Runs & traces | ✓ (ring buffer / sqlite) | ✓ (memory or sqlite) | ✓ (Postgres) |
| Local streaming for LLM agents | ✓ (full event stream) | start / complete snapshots | N/A |
| Hosted SSE streaming | ✓ | ✓ | ✓ |
| OpenTelemetry export | ✓ | not yet | ✓ |
| `{{env.X}}` template refs | ✓ | not yet | opt-in per server |
| `{{secrets.X}}` template refs | × | × | ✓ |
| Versioning + pinning | × | × | ✓ |
| Multi-user isolation | × | × | ✓ |
| API key auth | × | × | ✓ |
| Web UI (editor, playground, traces) | × | × | ✓ |
| Evals UI | × | × | roadmap |

## Migration paths

### Embedded → hosted

Most of the migration is a constructor change (see [Embedded SDK → Switching to hosted](/docs/sdk-cli/sdk#switching-to-hosted)). The remaining changes are:

- **Local tools** — promote to HTTP endpoints or MCP servers. The YAML `tools:` block is the only place the change is visible.
- **`{{env.X}}` → `{{secrets.X}}`** — multi-tenant workers do not share an environment with your code. Use `{{secrets.X}}` and configure values in **Settings → Secrets**.
- **Resources** — make sure the hosted worker has the same provider kinds wired server-side. Runtime `context` grants still come from trusted application code.
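
The `{{env.X}}` to `{{secrets.X}}` rewrite in the list above is mechanical enough to script. The sketch below works on the manifest as plain text; tolerating whitespace inside the braces is an assumption about the template syntax, and the secret values themselves still have to be created in **Settings → Secrets**.

```python
import re
from typing import List

# Matches {{env.NAME}} and, as an assumption, {{ env.NAME }} with inner spaces.
ENV_REF = re.compile(r"\{\{\s*env\.([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def list_env_refs(manifest_text: str) -> List[str]:
    """Names referenced via {{env.X}}, i.e. the secrets you need to create."""
    return sorted(set(ENV_REF.findall(manifest_text)))


def env_refs_to_secrets(manifest_text: str) -> str:
    """Rewrite every {{env.X}} reference to {{secrets.X}}."""
    return ENV_REF.sub(r"{{secrets.\1}}", manifest_text)
```

Running `list_env_refs` first gives a checklist of secret names to configure before deploying the rewritten manifest.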

### TypeScript embedded → Python embedded

Keep the same YAML manifest. Translate only the host language code:

```ts {group=compat-run}
await client.agents.run({
  agentId: "support",
  input: { message: "Hello" },
  sessionId: "user-42",
});
```

```python {group=compat-run}
client.agents.run(
    agent_id="support",
    input={"message": "Hello"},
    session_id="user-42",
)
```

The Python SDK follows Python naming conventions, so the `agentId` and `sessionId` arguments become `agent_id` and `session_id`; YAML fields remain unchanged.

Resource and memory APIs use the same pattern: TypeScript passes `resources: { memory: memrez.provider() }`; Python passes `resources={"memory": memrez.provider()}`. Both embedded runtimes support memrez's built-in LLM reasoner default plus agent-side `preload` config.

### Hosted → self-hosted

The hosted clients work against any worker — `api.agntz.co` or your own. Switch by setting `baseUrl` / `base_url` and using an API key minted on your self-hosted UI.

## Resources

- **GitHub:** [github.com/aparry3/agntz](https://github.com/aparry3/agntz) — source, issues, discussions.
- **npm:** `@agntz/sdk`, `@agntz/client`, `@agntz/store-sqlite`, `@agntz/store-postgres`, `@agntz/manifest`.
- **Python:** `agntz` package with optional `agntz[litellm]` local model support.
- **License:** MIT.
- **AI-friendly:** Every page exposes its raw markdown via the Copy button; the full corpus is at [/llms.txt](/llms.txt).