Requests

Each batch request mirrors the AI SDK generateText input, plus a customId to correlate its result.

Request shape

Each request takes the same fields as generateText, except model, which is set once on the batch. It also takes a customId used to correlate its result. Provide either prompt or messages.

await batch({
  model: anthropic("claude-opus-4-8"),
  requests: [
    {
      customId: "doc-1",
      prompt: "Summarize…",
      temperature: 0,
    },
    {
      customId: "doc-2",
      system: "You are terse.",
      messages: [{ role: "user", content: "Translate…" }],
      maxOutputTokens: 256,
    },
  ],
});

Supported fields

| Field | Notes |
| --- | --- |
| customId | Correlates the result. Auto-generated as request-<index> if omitted; must be unique within a batch. |
| prompt / messages | Provide one. messages is the AI SDK ModelMessage[]. |
| system | System prompt. |
| tools / toolChoice | Tool definitions, exactly as in generateText. |
| maxOutputTokens | Maximum tokens to generate. Required by Anthropic. |
| temperature, topP, topK | Sampling controls. |
| presencePenalty, frequencyPenalty | Penalties. |
| stopSequences, seed | Stop sequences and seed. |
| providerOptions | Provider-specific options (e.g. reasoning / thinking config). |
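The customId rules in the table (a request-<index> fallback and uniqueness within a batch) can be sketched as a standalone check. This is an illustration, not Batchwork's implementation: assignCustomIds and the RequestLike type are hypothetical names, and the zero-based index is an assumption the text does not confirm.

```typescript
// Hypothetical types and helper for illustration; not part of Batchwork's API.
type RequestLike = { customId?: string; prompt?: string };
type IdentifiedRequest = RequestLike & { customId: string };

function assignCustomIds(requests: RequestLike[]): IdentifiedRequest[] {
  const seen = new Set<string>();
  return requests.map((request, index) => {
    // Fallback follows the documented `request-<index>` pattern.
    // Assumption: the index is zero-based.
    const customId = request.customId ?? `request-${index}`;
    if (seen.has(customId)) {
      throw new Error(`Duplicate customId "${customId}" at index ${index}`);
    }
    seen.add(customId);
    return { ...request, customId };
  });
}

const withIds = assignCustomIds([
  { customId: "doc-1", prompt: "Summarize…" },
  { prompt: "Translate…" }, // no customId, so the fallback is used
]);
console.log(withIds.map((request) => request.customId));
```

Setting customId explicitly on every request keeps results traceable even if the request order changes between runs.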

Batch-wide defaults

defaults are merged into every request; request-level values win:

await batch({
  model: anthropic("claude-opus-4-8"),
  defaults: {
    system: "You are terse.",
    maxOutputTokens: 256,
  },
  requests: [
    { customId: "a", prompt: "…" }, // inherits system + maxOutputTokens
    { customId: "b", prompt: "…", maxOutputTokens: 1024 }, // overrides maxOutputTokens
  ],
});
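The precedence rule above can be pictured as an object spread in which the request is applied last. The sketch below assumes a shallow merge, which the text does not confirm; applyDefaults and RequestFields are hypothetical names used only for this example.

```typescript
// Hypothetical subset of the request fields, for illustration only.
type RequestFields = {
  customId?: string;
  prompt?: string;
  system?: string;
  maxOutputTokens?: number;
};

// Assumption: a shallow merge. Later spreads win, so request-level values
// replace the defaults key by key.
function applyDefaults(defaults: RequestFields, request: RequestFields): RequestFields {
  return { ...defaults, ...request };
}

const sharedDefaults: RequestFields = { system: "You are terse.", maxOutputTokens: 256 };

const mergedA = applyDefaults(sharedDefaults, { customId: "a", prompt: "…" });
const mergedB = applyDefaults(sharedDefaults, { customId: "b", prompt: "…", maxOutputTokens: 1024 });

console.log(mergedA.maxOutputTokens, mergedB.maxOutputTokens);
```

With a plain spread, a key explicitly set to undefined on the request would also replace the default. Whether Batchwork treats that case the same way is not stated here.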

Metadata

metadata attaches free-form key/value pairs to the batch. It's forwarded to OpenAI, Groq, Together AI, and Mistral, and ignored by Anthropic, Google Gemini, and xAI (whose batch APIs don't accept batch-level metadata):

await batch({ model, requests, metadata: { description: "nightly eval" } });

Limits

Batchwork validates every batch before it reaches a provider. The default guardrails are 50,000 requests per batch, 20 MiB per captured request body, a 200 MiB provider upload payload, and 16 concurrent request captures:

await batch({
  model,
  requests,
  limits: {
    captureConcurrency: 8,
    maxRequests: 10_000,
    maxRequestBytes: 4 * 1024 * 1024,
    maxUploadBytes: 100 * 1024 * 1024,
  },
});

Provider-side caps still apply, so keep custom limits at or below the target provider's maximums.
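To get a feel for what these guardrails check, the sketch below runs a rough pre-flight test against the default values quoted above. It is only an approximation: checkLimits is a hypothetical helper, and it measures the JSON-serialized request rather than the provider-specific body that Batchwork actually captures, so real sizes may differ.

```typescript
// Default values as stated in the text; field names mirror the `limits` option.
const DEFAULT_LIMITS = {
  maxRequests: 50_000,
  maxRequestBytes: 20 * 1024 * 1024, // 20 MiB per request body
  maxUploadBytes: 200 * 1024 * 1024, // 200 MiB per upload payload
};

type Limits = typeof DEFAULT_LIMITS;

// Hypothetical pre-flight check. Assumption: request size is approximated by
// the UTF-8 length of the JSON-serialized request.
function checkLimits(requests: unknown[], limits: Limits = DEFAULT_LIMITS): string[] {
  const problems: string[] = [];
  if (requests.length > limits.maxRequests) {
    problems.push(`${requests.length} requests exceeds maxRequests (${limits.maxRequests})`);
  }
  const encoder = new TextEncoder();
  let totalBytes = 0;
  requests.forEach((request, index) => {
    const bytes = encoder.encode(JSON.stringify(request)).length;
    totalBytes += bytes;
    if (bytes > limits.maxRequestBytes) {
      problems.push(`request ${index} is ${bytes} bytes, above maxRequestBytes`);
    }
  });
  if (totalBytes > limits.maxUploadBytes) {
    problems.push(`payload is ${totalBytes} bytes, above maxUploadBytes`);
  }
  return problems;
}

console.log(checkLimits([{ customId: "doc-1", prompt: "Summarize…" }]));
```

An empty array means none of the three approximate checks failed.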

Per-batch limits vary by provider:

| Provider | Max requests per batch | Max input size |
| --- | --- | --- |
| OpenAI | 50,000 | 200 MB |
| Anthropic | 100,000 | 256 MB |
| Google Gemini | No fixed count | 2 GB file · 20 MB inline |
| Groq | 50,000 | 200 MB |
| Mistral | 100,000 | 512 MB |
| Together AI | 50,000 | 100 MB |
| xAI | 50,000 | 200 MB |

Notes:

  • Google Gemini doesn't cap request count directly — batches are bounded by the 2 GB input-file size and a per-model enqueued-token quota. Inline requests must stay under 20 MB total.
  • Together AI also limits enqueued tokens to 30 billion per model at any time.
  • xAI's 50,000 cap applies to file-based batches; inline batches are theoretically unbounded but throttled above ~1,000,000 requests.

For very large workloads, split into multiple batches.
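One way to split a workload is to chunk the request array by count before submitting each chunk as its own batch. The sketch below is a minimal example under that assumption: splitIntoBatches is a hypothetical helper, and it splits on request count only, so byte-size and token caps would still need separate checks.

```typescript
// Hypothetical helper: splits requests into chunks of at most `maxRequests` items.
function splitIntoBatches<T>(requests: T[], maxRequests: number): T[][] {
  if (!Number.isInteger(maxRequests) || maxRequests < 1) {
    throw new RangeError("maxRequests must be a positive integer");
  }
  const chunks: T[][] = [];
  for (let start = 0; start < requests.length; start += maxRequests) {
    chunks.push(requests.slice(start, start + maxRequests));
  }
  return chunks;
}

// 120,000 placeholder requests split at the 50,000 cap listed for several providers.
const workload = Array.from({ length: 120_000 }, (_, i) => ({
  customId: `doc-${i}`,
  prompt: "…",
}));
const chunks = splitIntoBatches(workload, 50_000);

console.log(chunks.map((chunk) => chunk.length));
```

Each chunk would then be passed as requests to its own batch call. Because customId values only need to be unique within a batch, IDs that are unique across the whole workload, as in this example, also make it easier to join results back together afterwards.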
