@ai_kit/server ships a pre-configured Hono application that turns agents and workflows into HTTP endpoints. It handles synchronous calls, SSE streaming, and human-in-the-loop resumes, and can publish an OpenAPI spec with Swagger UI.

Installation

npm i @ai_kit/server
Install it next to @ai_kit/core so you can reuse the agents and workflows you already defined. Want a preconfigured project instead? Run:
npx @ai_kit/create-ai-kit server-kit

Minimal example

import { Agent, createWorkflow } from "@ai_kit/core";
import { ServerKit } from "@ai_kit/server";

const supportAgent = new Agent({
  name: "support",
  instructions: "Answer internal tickets with short messages.",
  model: /* plug your ai-sdk provider here */ {} as any,
});

const enrichWorkflow = createWorkflow({
  id: "enrich-ticket",
  description: "Add metadata before dispatching the ticket."
});

const server = new ServerKit({
  agents: { support: supportAgent },
  workflows: { "enrich-ticket": enrichWorkflow },
});

await server.listen({ port: 8787 });
The server listens on HOST (default 0.0.0.0) and PORT (default 8787). Agents and workflows are stored in memory and reused across requests.
Need to protect the API? Head over to the Authentication guide to enable bearer tokens and custom guards.

Exposed endpoints

  • GET /api/agents: lists every agent registered in the server configuration.
  • POST /api/agents/:id/generate: executes Agent.generate once and returns the full result.
  • POST /api/agents/:id/stream: streams Agent.stream via SSE or DataStreamResponse.
  • GET /api/workflows: lists every workflow registered in the server configuration.
  • POST /api/workflows/:id/run: starts a workflow run and returns { runId, ...result }.
  • POST /api/workflows/:id/stream: streams every WorkflowEvent plus the final status via SSE.
  • POST /api/workflows/:id/runs/:runId/resume: resumes a suspended run with human input.
Workflow payloads must contain inputData; metadata is optional. Agent payloads must include either prompt or messages, matching the AI SDK contract.
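The payload contracts can be sketched as small type guards. This is an illustration of the rules stated above, not the library's actual validation code:

```typescript
// Illustrative payload shapes; field names come from the documented contract.
type AgentPayload = { prompt?: string; messages?: unknown[] };
type WorkflowPayload = { inputData?: unknown; metadata?: Record<string, unknown> };

function isValidAgentPayload(p: AgentPayload): boolean {
  // Either `prompt` or `messages` must be present, per the AI SDK contract.
  return typeof p.prompt === "string" || Array.isArray(p.messages);
}

function isValidWorkflowPayload(p: WorkflowPayload): boolean {
  // `inputData` is required; `metadata` stays optional.
  return p.inputData !== undefined;
}
```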

Calling the server from ClientKit

Use @ai_kit/client-kit to reach the server from another Node.js service or edge worker:
import { ClientKit } from "@ai_kit/client-kit";

const client = new ClientKit({
  baseUrl: "https://agents.internal.aidalinfo.fr",
  headers: { Authorization: `Bearer ${process.env.SERVER_TOKEN}` },
});

const generation = await client.generateAgent("support", {
  prompt: "What changed this week?",
  runtime: {
    metadata: { tenant: "aidalinfo" },
    ctx: { locale: "fr-FR" },
  },
});

const run = await client.runWorkflow("enrich-ticket", {
  inputData: { contactId: "123" },
  metadata: { requestId: "run_abc" },
  runtime: {
    metadata: { tenant: "aidalinfo" },
    ctx: { locale: "fr-FR" },
  },
});
  • runtime / runtimeContext merge their metadata/ctx with the top-level payload fields.
  • resumeWorkflow remains available to unblock human-in-the-loop runs with { stepId, data }.
  • Pass a signal option to cancel the underlying fetch request.
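The merge behaviour of runtime fields can be pictured as a shallow object spread. This is a sketch only; the real precedence inside @ai_kit/server is not documented here, and this version lets the runtime fields win on key conflicts:

```typescript
// Hypothetical illustration of merging runtime metadata/ctx with the
// top-level payload fields, as described above.
type Runtime = { metadata?: Record<string, string>; ctx?: Record<string, string> };

function mergeRuntime(payload: { metadata?: Record<string, string> }, runtime: Runtime) {
  return {
    // Top-level payload metadata and runtime metadata end up in one object.
    metadata: { ...payload.metadata, ...runtime.metadata },
    ctx: { ...runtime.ctx },
  };
}
```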

Streaming & resume flow

  • /stream endpoints emit a first run event containing the runId, then forward each workflow event.type as a dedicated SSE event.
  • When a workflow ends with waiting_human, the stream closes after emitting the final payload. Resume the run later by POSTing { stepId, data } to /runs/:runId/resume.
  • If the client disconnects, the server cancels the run and frees resources.
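The stream framing follows the standard SSE format, so a consumer can split on blank lines and read the event name and JSON data. A minimal parsing sketch (the "step-finish" event name below is a placeholder; only the initial "run" event is documented above):

```typescript
// Minimal SSE frame parser: messages are separated by a blank line,
// each carrying "event:" and "data:" fields.
type SseEvent = { event: string; data: unknown };

function parseSse(chunk: string): SseEvent[] {
  const events: SseEvent[] = [];
  for (const block of chunk.split("\n\n")) {
    let event = "message";
    let data = "";
    for (const line of block.split("\n")) {
      if (line.startsWith("event: ")) event = line.slice(7);
      else if (line.startsWith("data: ")) data += line.slice(6);
    }
    if (data) events.push({ event, data: JSON.parse(data) });
  }
  return events;
}

// The first event of a workflow stream carries the runId:
const feed =
  'event: run\ndata: {"runId":"run_123"}\n\n' +
  'event: step-finish\ndata: {"stepId":"enrich"}\n\n';
```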

Middleware

Reuse Mastra-style middleware syntax by passing plain functions (global) or { path, handler } objects to the server.middleware option. path must be a string matching any Hono route pattern (wildcards such as /api/* are fine). The legacy top-level middleware field is still recognized but will be removed in a future version.
const server = new ServerKit({
  agents: { support: supportAgent },
  workflows: { "enrich-ticket": enrichWorkflow },
  server: {
    middleware: [
      {
        path: "/api/*",
        handler: async (c, next) => {
          const authHeader = c.req.header("Authorization");
          if (!authHeader) {
            return new Response("Unauthorized", { status: 401 });
          }

          await next();
        },
      },
      async (c, next) => {
        console.log(`${c.req.method} ${c.req.url}`);
        await next();
      },
    ],
  },
});

Swagger / OpenAPI

Swagger is on by default outside production. Configure it via the swagger option:
const server = new ServerKit({
  agents: { support: supportAgent },
  swagger: {
    enabled: true,
    route: "/docs",
    title: "Support API",
    description: "AI Kit agents & workflows"
  }
});
  • UI served from route (e.g. /docs).
  • JSON spec available at route + ".json" (e.g. /docs.json).
  • Pass false to disable it, or true to force-enable it even in production.

CLI entry point

The package bundles a server-kit binary (accessible via npx @ai_kit/server) that starts a bare ServerKit instance:
npx @ai_kit/server --swagger
Available env vars:
  • PORT (default 8787).
  • HOST (default 0.0.0.0).
  • NODE_ENV toggles the default Swagger behaviour.
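The resolution of those environment variables can be sketched as follows; `resolveServerEnv` is a hypothetical helper shown for illustration, using the documented defaults:

```typescript
// Resolves host/port/Swagger defaults the way the docs describe:
// HOST defaults to 0.0.0.0, PORT to 8787, and Swagger is on by
// default outside production.
function resolveServerEnv(env: Record<string, string | undefined>) {
  return {
    host: env.HOST ?? "0.0.0.0",
    port: Number(env.PORT ?? 8787),
    swaggerDefault: env.NODE_ENV !== "production",
  };
}
```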
The bundled CLI is intentionally lightweight: it does not register agents or workflows for you. Use it as a reference when wiring ServerKit inside your own API server (e.g. apps/api/server.ts) where you can pass the proper configuration object.

Configuration options

  • agents (Record<string, Agent>): registers the agents exposed under /api/agents/:id/*.
  • workflows (Record<string, Workflow>): declares the workflows exposed under /api/workflows/:id/*.
  • server.middleware (ServerMiddleware[]): adds global or path-scoped Hono middleware before the built-in routes.
  • swagger (boolean | SwaggerOptions): enables Swagger or customizes route, title, version, description.
  • telemetry (boolean | ServerTelemetryOptions): enables Langfuse export and forwards its options to ensureLangfuseTelemetry.
listen({ port, hostname, signal }) also accepts an AbortSignal so you can shut the server down gracefully. See packages/server/src/ServerKit.ts for the full implementation.
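The signal option plugs into the standard AbortController mechanism. The sketch below shows that mechanism in isolation (no real server is started; the listen call is only shown in a comment):

```typescript
// Graceful-shutdown pattern: abort the controller to tear the server down.
const controller = new AbortController();

let stopped = false;
controller.signal.addEventListener("abort", () => {
  // With ServerKit, aborting the signal passed to listen() is where the
  // HTTP server would close.
  stopped = true;
});

// e.g. await server.listen({ port: 8787, signal: controller.signal });
// A real setup would also forward process signals:
//   process.once("SIGTERM", () => controller.abort());

// Simulate a shutdown:
controller.abort();
```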

Langfuse telemetry

Toggle Langfuse directly from the ServerKit config — no extra bootstrap file is required:
const server = new ServerKit({
  agents: { support: supportAgent },
  workflows: { "enrich-ticket": enrichWorkflow },
  telemetry: {
    enabled: true,
  },
});
  • telemetry accepts either true/false or the full ensureLangfuseTelemetry options.
  • Provide LANGFUSE_PUBLIC_KEY, LANGFUSE_SECRET_KEY, and optionally LANGFUSE_BASE_URL in the environment.
  • The CLI exposes matching --telemetry / --no-telemetry flags for quick toggles.