Testbeds

Testbeds turn the manifest into executable protocol checks.

Anvil already knows routes, params, bindings, validation, middleware, and response shapes from the compiler pass. Testbeds reuse that data so API tests can stay close to the generated contract.

anvil testbed .

Without --base-url, the command prints a deterministic JSON suite. With --base-url, it executes supported cases against an already-running server and prints a deterministic report.

anvil testbed . --base-url http://127.0.0.1:8080

The output schema for generated suites is anvil.testbed.v1. The output schema for execution reports is anvil.testbed.report.v1.

The CLI always compiles the manifest first. It then builds the suite, appends deterministic stress cases when --stress is set, applies configured testbed.scripts, applies extra --script files in CLI order, and either prints the suite or runs the supported network cases. If any executed case fails, the command prints the report and exits non-zero. Skipped cases are counted as skipped, not failed. Embedded tooling can use the runtime API when it needs to keep and marshal failed-case reports directly.
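
The exit rule above can be sketched in a few lines. This is an illustrative Python sketch, not Anvil's implementation: the status strings, the helper names, and the concrete exit value 1 are assumptions (the text only says the exit code is non-zero).

```python
from collections import Counter

def summarize(statuses):
    """Count case outcomes. Each status is 'passed', 'failed', or 'skipped' (assumed names)."""
    counts = Counter(statuses)
    return {
        "passed": counts["passed"],
        "failed": counts["failed"],
        "skipped": counts["skipped"],
    }

def exit_code(statuses):
    """Return a non-zero code only when an executed case failed.

    Skipped cases are counted separately and never cause a failure.
    The value 1 is an assumption; any non-zero value fits the description.
    """
    return 1 if summarize(statuses)["failed"] > 0 else 0
```

A run with only passed and skipped cases yields 0 under this sketch; a single failed case is enough to yield a non-zero code.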

Network execution in the Anvil CLI is intentionally narrow: it runs normal HTTP route cases first, then GraphQL endpoint cases. WebSocket route cases are reported as skipped unless an embedded caller supplies a WebSocket runner. gRPC and queue execution need driver-owned runners because those protocols need client libraries, descriptors, streaming clients, socket handling, or broker setup that Anvil does not import.

HTTP Testbeds

For each HTTP route, Anvil produces one generated shape case. The case uses sample values derived from route params, request bindings, body fields, and validation tags.

HTTP cases can exercise:

Path params.
Query params.
Headers.
Request bodies.
Expected response status.
Response body assertions.
Streaming response assertions.
WebSocket scripts.

Default success status:

WebSocket route: 101.
POST: 201.
Other HTTP methods: 200.
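
The default status rule can be written as a small lookup. This is a Python sketch for illustration; the function name and the is_websocket flag are hypothetical and not part of Anvil's API.

```python
def default_success_status(method, is_websocket=False):
    """Return the default expected status for a generated case."""
    if is_websocket:
        # WebSocket routes expect the protocol upgrade response.
        return 101
    if method.upper() == "POST":
        return 201
    # Every other HTTP method defaults to 200.
    return 200
```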

When running over the network, Anvil executes normal HTTP cases and scripted GraphQL cases. WebSocket cases are skipped by the CLI runner with a clear report entry. Driver modules can provide a WebSocket runner through the runtime API when a test needs socket messages, close codes, and subprotocols.

The network HTTP runner sends a JSON body only when a case has body fields. With a nil body map, the request has an empty body and no automatic content type. With a non-nil body map, the request carries Content-Type: application/json; if that map is empty, the request body is still empty, otherwise the map is JSON-encoded.
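
The three body cases can be made concrete with a short sketch. It is written in Python for illustration, with None standing in for a nil map; encode_body is a hypothetical helper, not an Anvil function.

```python
import json

def encode_body(body):
    """Return (payload_bytes, headers) for a case body map.

    None models a nil body map; {} models a non-nil but empty map.
    """
    if body is None:
        # Nil map: empty body, no automatic content type.
        return b"", {}
    headers = {"Content-Type": "application/json"}
    if not body:
        # Non-nil but empty map: content type is set, body stays empty.
        return b"", headers
    # Non-empty map: JSON-encode the fields.
    return json.dumps(body).encode("utf-8"), headers
```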

If --base-url includes a path, that path is kept as a prefix. For example, --base-url http://127.0.0.1:8080/api and a case path of /v1/projects execute against /api/v1/projects.
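
A minimal Python sketch of the prefix behavior follows. The helper name is hypothetical, and the handling of trailing slashes is an assumption; only the /api plus /v1/projects example comes from the text above.

```python
from urllib.parse import urlsplit, urlunsplit

def join_base_url(base_url, case_path):
    """Keep the base URL path as a prefix in front of the case path."""
    parts = urlsplit(base_url)
    # Assumption: a trailing slash on the prefix is dropped before joining.
    prefix = parts.path.rstrip("/")
    path = prefix + "/" + case_path.lstrip("/")
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))
```

Note that a generic resolver such as urllib.parse.urljoin would not behave this way: given an absolute case path, it replaces the base path instead of keeping it as a prefix.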

Response bodies are capped during execution. The default cap is 1 MiB; custom embedded runners can raise it up to 1 GiB.

When a route has scripted cases, the network runner uses those scripted cases instead of the generated case for that route. To run both scripted and generated cases, pass --run-generated on the CLI or set IncludeGenerated in the runtime API.
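
The selection rule can be sketched as follows. This is an illustrative Python model, not Anvil's runner: the dictionary shapes, the function name, and the position of the generated case after the scripted ones are assumptions.

```python
def select_cases(generated, scripted, include_generated=False):
    """Pick the cases to execute per route.

    generated: dict of operation ID -> generated case
    scripted:  dict of operation ID -> list of scripted cases
    """
    selected = []
    for operation in sorted(set(generated) | set(scripted)):
        scripts = scripted.get(operation, [])
        if scripts:
            # Scripted cases replace the generated case for this route...
            selected.extend(scripts)
            # ...unless the caller asks for both.
            if include_generated and operation in generated:
                selected.append(generated[operation])
        elif operation in generated:
            selected.append(generated[operation])
    return selected
```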

Values under locals can be generated or scripted as part of the suite model, but the network HTTP runner cannot inject them into a real HTTP request. Use headers, params, query values, and bodies for network assertions. Driver-owned or in-process runners can choose to use locals.

GraphQL Testbeds

GraphQL testbeds describe each endpoint path, handler, source location, and whether the endpoint has a subscription handler. Anvil does not invent a GraphQL query from a resolver type; GraphQL execution cases come from scripts.

GraphQL scripts can add exact operations:

graphql:
  - name: project-query
    endpoint: /graphql
    query: |
      query Project($id: ID!) {
        project(id: $id) { id name }
      }
    variables:
      id: proj_000001
    expect:
      status: 200
      bodyContains:
        - project

Subscription scripts set subscription: true. The runner sends subscription requests as GET with Accept: text/event-stream. Query, operation name, and variables are encoded as query parameters for subscription requests.

Normal GraphQL scripts run as POST JSON requests. GraphQL expectations check HTTP status and body fragments. The built-in runner does not parse GraphQL response objects; protocol-specific assertion depth belongs in richer runner tooling.
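
The request shapes described above can be sketched like this. It is a Python illustration under stated assumptions: the parameter names query, operationName, and variables follow the common GraphQL-over-HTTP convention and are not confirmed by this page, the POST content type is assumed, and both helper names are hypothetical.

```python
import json
from urllib.parse import urlencode

def build_graphql_request(case):
    """Build a request description from a scripted GraphQL case."""
    payload = {"query": case["query"]}
    if case.get("operationName"):
        payload["operationName"] = case["operationName"]
    if case.get("variables"):
        payload["variables"] = case["variables"]

    if case.get("subscription"):
        # Subscriptions: GET with query parameters and an event-stream Accept header.
        params = dict(payload)
        if "variables" in params:
            params["variables"] = json.dumps(params["variables"])
        return {
            "method": "GET",
            "url": case["endpoint"] + "?" + urlencode(params),
            "headers": {"Accept": "text/event-stream"},
            "body": b"",
        }
    # Normal operations: POST with a JSON body.
    return {
        "method": "POST",
        "url": case["endpoint"],
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload).encode("utf-8"),
    }

def meets_expectations(status, body_text, expect):
    """Check HTTP status and body fragments only; the response is not parsed."""
    if "status" in expect and status != expect["status"]:
        return False
    return all(fragment in body_text for fragment in expect.get("bodyContains", []))
```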

gRPC Testbeds

gRPC testbeds describe service name, method name, stream kind, request type, response type, and source location. Anvil does not generate default gRPC messages. Scripted gRPC cases are JSON-shaped so driver-owned runners can map them through protobuf descriptors.

grpc:
  - name: get-project
    service: ProjectService
    method: GetProject
    message:
      projectId: proj_000001
    expect:
      code: OK
      message:
        id: proj_000001

Anvil does not execute gRPC cases directly. gRPC execution belongs in the gRPC driver package because protobuf descriptors and streaming clients are driver-owned.

Queue Testbeds

Queue testbeds describe job name, queue name, handler, and source location. Anvil does not generate default broker messages. Queue scripts add exact messages:

queue:
  - name: project-created
    queue: projects
    job: ProjectCreated
    id: msg_001
    body: '{"projectId":"proj_000001"}'
    headers:
      X-Trace-ID: trace_001
    attempt: 1
    expect:
      error: false

Broker-backed runners remain driver-owned because Redis, NATS, SQS, and memory queues have different setup and delivery behavior.

Deterministic Stress

Stress generation appends valid-by-construction HTTP cases:

anvil testbed . --stress 25 --seed 42

The generator creates stress-001, stress-002, and so on for each HTTP route. Seeds are mixed with the route operation ID and case index, so the same seed and source produce the same cases.
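
One way to picture the seed mixing is a hash over the seed, operation ID, and case index. This Python sketch is only an illustration of the idea: the SHA-256 mixing function and both helper names are assumptions, and Anvil's actual mixing may differ.

```python
import hashlib
import random

def case_rng(seed, operation_id, index):
    """Derive a per-case random generator from seed, operation ID, and case index."""
    material = f"{seed}:{operation_id}:{index}".encode("utf-8")
    digest = hashlib.sha256(material).digest()
    # The same inputs always yield the same 64-bit seed, so cases are reproducible.
    return random.Random(int.from_bytes(digest[:8], "big"))

def stress_names(count):
    """Return stress-001, stress-002, ... for one route."""
    return [f"stress-{i:03d}" for i in range(1, count + 1)]
```

Because the operation ID is part of the mixed input, adding or removing one route does not shift the values generated for the other routes in this model.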

Stress input generation understands common validations such as uuid, email, oneof, numeric bounds, string length bounds, and simple anchored regex classes such as ^[a-z0-9]+$.

The regex support is intentionally small. Stress generation does not execute arbitrary regular expressions. It recognizes only anchored character-class patterns with + or *, expands the class, and builds a string from that alphabet. Other regex patterns fall back to the normal string generator.
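
The narrow pattern recognition can be sketched as below. This is a hedged Python illustration, not Anvil's generator: the helper names, the fixed output length, and the exact set of accepted class syntax are assumptions.

```python
import random
import re

# Matches only ^[class]+$ or ^[class]*$ with no escapes or negation inside the class.
_CLASS_PATTERN = re.compile(r"^\^\[([^\]\\^]+)\]([+*])\$$")

def class_alphabet(pattern):
    """Expand an anchored character class into its alphabet, or return None."""
    match = _CLASS_PATTERN.match(pattern)
    if match is None:
        return None  # caller falls back to the normal string generator
    body = match.group(1)
    alphabet = []
    i = 0
    while i < len(body):
        if i + 2 < len(body) and body[i + 1] == "-":
            start, end = ord(body[i]), ord(body[i + 2])
            if start > end:
                return None
            alphabet.extend(chr(code) for code in range(start, end + 1))
            i += 3
        else:
            alphabet.append(body[i])
            i += 1
    # Drop duplicates while keeping a stable order.
    return "".join(dict.fromkeys(alphabet))

def generate_from_class(pattern, rng, length=8):
    """Build a string from the class alphabet; length 8 is an arbitrary choice."""
    alphabet = class_alphabet(pattern)
    if alphabet is None:
        return None
    return "".join(rng.choice(alphabet) for _ in range(length))
```

The pattern itself is never executed against candidate strings here; the value is built directly from the expanded alphabet, which is why it is valid by construction.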

Stress generation creates deterministic valid inputs. It is not a security fuzzer. It does not generate invalid payloads, oversized payloads, or mutation-style cases.

Scripted Cases

Testbeds can be configured with YAML scripts for cases that need exact inputs:

http:
  - name: create-project
    operation: Projects.Create
    body:
      name: Launch Plan
    expect:
      status: 201
      headers:
        Location: /api/v1/projects/proj_000001
      bodyContains:
        - Launch Plan

Generated cases and scripted cases are complementary. Generated cases explore the shape of the API; scripts lock down known workflows.

Generated output order is stable. HTTP routes are sorted by operation ID, gRPC methods by service and method, GraphQL endpoints by path, and queue jobs by queue and job name.
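
The ordering rule maps directly onto sort keys. The Python sketch below is illustrative: the dictionary field names (operation, service, method, path, queue, job) are assumptions borrowed from the script examples, not the documented suite schema.

```python
def sort_suite(suite):
    """Return the suite sections in a stable, deterministic order."""
    return {
        "http": sorted(suite.get("http", []), key=lambda c: c["operation"]),
        "grpc": sorted(suite.get("grpc", []), key=lambda c: (c["service"], c["method"])),
        "graphql": sorted(suite.get("graphql", []), key=lambda c: c["path"]),
        "queue": sorted(suite.get("queue", []), key=lambda c: (c["queue"], c["job"])),
    }
```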

Script files support these top-level sections:

http: Normal HTTP route cases by operation ID.
websocket: WebSocket route cases by operation ID.
graphql: GraphQL cases by endpoint name or endpoint path.
grpc: gRPC cases by service and method.
queue: Queue cases by queue and job.

WebSocket scripts are separate because they need message input and close expectations:

websocket:
  - name: project-socket
    operation: Projects.Socket
    params:
      projectId: proj_000001
    input:
      messages:
        - type: text
          data: ping
    expect:
      messages:
        - type: text
          data: pong
      closeCode: 1000

The protocol-neutral model accepts text, binary, close, ping, and pong as message type names so driver-owned runners can decide what they support. The official HTTP runners send text, binary, ping, and close input messages. They assert text and binary output messages and compare the close code when one is configured. Ping and pong frames are sent for protocol coverage but are not asserted as output frames.
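
The assertion rule for output frames can be sketched as follows. This Python model is an assumption-heavy illustration: the message dictionaries mirror the script fields shown above, exact in-order comparison of text and binary messages is assumed, and the function name is hypothetical.

```python
ASSERTED_TYPES = {"text", "binary"}

def websocket_matches(expect, received, close_code):
    """Compare received frames against a scripted expectation.

    Only text and binary messages are compared; ping and pong frames are ignored.
    The close code is checked only when the script configures one.
    """
    expected = [m for m in expect.get("messages", []) if m["type"] in ASSERTED_TYPES]
    actual = [m for m in received if m["type"] in ASSERTED_TYPES]
    if expected != actual:
        return False
    if "closeCode" in expect and expect["closeCode"] != close_code:
        return False
    return True
```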

Script validation happens before execution. Unknown HTTP operations, unknown gRPC methods, unknown GraphQL endpoints, unknown queue jobs, invalid HTTP statuses, invalid WebSocket message types, and unknown HTTP param names fail the testbed command with a source-specific error.

HTTP script validation checks route params because those must match generated path metadata. Query keys, request header keys, locals, and body keys are not rejected by the script validator; they are request data that can intentionally sit outside the generated route shape. Expected response header names must be non-empty, and each bodyContains or streamContains value must be non-empty.
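
The split between validated and unvalidated keys can be sketched as a small checker. This is a Python illustration with a hypothetical function name and error wording; it covers only the checks named in this paragraph.

```python
def validate_http_script(case, route_params):
    """Return a list of validation errors for one scripted HTTP case.

    route_params is the set of path param names known for the route.
    Query keys, request headers, locals, and body keys are deliberately not checked.
    """
    errors = []
    for name in case.get("params", {}):
        if name not in route_params:
            errors.append(f"unknown param {name!r}")
    expect = case.get("expect", {})
    for header in expect.get("headers", {}):
        if not header:
            errors.append("expected header name must be non-empty")
    for key in ("bodyContains", "streamContains"):
        for value in expect.get(key, []):
            if not value:
                errors.append(f"{key} value must be non-empty")
    return errors
```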

Scripts from anvil.yaml are resolved against the config file directory. Extra --script paths are used exactly as passed to the CLI.
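
The two resolution rules can be sketched as follows. This Python example is illustrative only: the function name and the sample paths are hypothetical, and POSIX-style paths are assumed.

```python
from pathlib import PurePosixPath

def script_paths(config_file, configured, extra):
    """Return script paths in application order.

    configured: entries from the config file, resolved against its directory.
    extra:      --script values, kept exactly as passed, in CLI order.
    """
    base = PurePosixPath(config_file).parent
    resolved = [str(base / entry) for entry in configured]
    return resolved + list(extra)
```

For example, with a config file at /srv/app/anvil.yaml, a configured entry of testbeds/http.yaml becomes /srv/app/testbeds/http.yaml in this sketch, while an extra ./extra.yaml is left untouched.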

Read Testbed Runtime API when writing a driver-owned runner or embedding testbed execution in custom tooling.