Limits & defaults

Batch-level options shared by every batch function — defaults merged into requests, batch metadata, and request/size limits with per-provider caps.

These options sit on the batch call alongside model and requests, and apply to every batch function — batch(), batch.images(), and batch.embeddings() — regardless of the per-request fields documented on each API page.

Defaults

The defaults object is merged into every request; when a request sets the same field, the request-level value wins:

await batch({
  model: anthropic("claude-opus-4-8"),
  defaults: {
    system: "You are terse.",
    maxOutputTokens: 256,
  },
  requests: [
    { customId: "a", prompt: "…" }, // inherits system + maxOutputTokens
    { customId: "b", prompt: "…", maxOutputTokens: 1024 }, // overrides maxOutputTokens
  ],
});
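
The precedence rule above can be pictured as an object spread. This is only a sketch of the semantics, not Batchwork's internal code: the applyDefaults helper and the RequestFields type are hypothetical, and it assumes a shallow merge, which the text above does not specify.

```typescript
// Hypothetical illustration of "request-level values win"; not Batchwork's code.
type RequestFields = {
  customId: string;
  prompt: string;
  system?: string;
  maxOutputTokens?: number;
};

type Defaults = Partial<Omit<RequestFields, "customId" | "prompt">>;

// Assumption: a shallow merge, where any field present on the request
// replaces the field of the same name from defaults.
function applyDefaults(defaults: Defaults, request: RequestFields): RequestFields {
  return { ...defaults, ...request };
}

const defaults: Defaults = { system: "You are terse.", maxOutputTokens: 256 };

const a = applyDefaults(defaults, { customId: "a", prompt: "…" });
const b = applyDefaults(defaults, { customId: "b", prompt: "…", maxOutputTokens: 1024 });

console.log(a.maxOutputTokens); // 256
console.log(b.maxOutputTokens); // 1024
```

Under this assumption both requests keep the default system prompt, and only request "b" replaces maxOutputTokens.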

Metadata

metadata attaches free-form key/value pairs to the batch. It's forwarded to OpenAI, Groq, Together AI, and Mistral, and ignored by Anthropic, Google Gemini, and xAI (whose batch APIs don't accept batch-level metadata):

await batch({ model, requests, metadata: { description: "nightly eval" } });

Limits

Batchwork validates every batch before it reaches a provider. The default guardrails are 50,000 requests per batch, 20 MiB per captured request body, a 200 MiB provider upload payload, and 16 concurrent request captures:

await batch({
  model,
  requests,
  limits: {
    captureConcurrency: 8,
    maxRequests: 10_000,
    maxRequestBytes: 4 * 1024 * 1024,
    maxUploadBytes: 100 * 1024 * 1024,
  },
});

Provider-side caps still apply, so keep custom limits at or below the target provider's maximums.

Per-batch limits vary by provider:

Provider	Max requests per batch	Max input size
OpenAI	50,000	200 MB
Anthropic	100,000	256 MB
Google Gemini	No fixed count	2 GB file · 20 MB inline
Groq	50,000	200 MB
Mistral	100,000	512 MB
Together AI	50,000	100 MB
xAI	50,000	200 MB

Notes:

Google Gemini doesn't cap request count directly — batches are bounded by the 2 GB input-file size and a per-model enqueued-token quota. Inline requests must stay under 20 MB total.
Together AI also limits enqueued tokens to 30B per model at any time.
xAI's 50,000 cap applies to file-based batches; inline batches are theoretically unbounded but throttled above ~1,000,000 requests.
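
One way to keep custom limits at or below a provider's maximums is to clamp them against the table above before building the batch call. The sketch below is an assumption-laden illustration: the provider keys, the providerCaps map, and the clampToProvider helper are hypothetical names, and it reads "MB" as 10^6 bytes and "GB" as 10^9 bytes, which the table does not state.

```typescript
// Caps copied from the table above. Assumption: "MB" = 10^6 bytes, "GB" = 10^9 bytes.
const MB = 1_000_000;

type Limits = { maxRequests: number; maxUploadBytes: number };

// Hypothetical provider keys; they are labels for this sketch only.
const providerCaps: Record<string, Limits> = {
  openai: { maxRequests: 50_000, maxUploadBytes: 200 * MB },
  anthropic: { maxRequests: 100_000, maxUploadBytes: 256 * MB },
  gemini: { maxRequests: Infinity, maxUploadBytes: 2_000 * MB }, // file-based input
  groq: { maxRequests: 50_000, maxUploadBytes: 200 * MB },
  mistral: { maxRequests: 100_000, maxUploadBytes: 512 * MB },
  together: { maxRequests: 50_000, maxUploadBytes: 100 * MB },
  xai: { maxRequests: 50_000, maxUploadBytes: 200 * MB }, // file-based batches
};

// Lower each requested limit to the provider's cap when it exceeds it.
function clampToProvider(provider: string, wanted: Limits): Limits {
  const cap = providerCaps[provider];
  if (!cap) throw new Error(`Unknown provider: ${provider}`);
  return {
    maxRequests: Math.min(wanted.maxRequests, cap.maxRequests),
    maxUploadBytes: Math.min(wanted.maxUploadBytes, cap.maxUploadBytes),
  };
}

const limits = clampToProvider("together", {
  maxRequests: 80_000,
  maxUploadBytes: 200 * 1024 * 1024,
});
console.log(limits); // { maxRequests: 50000, maxUploadBytes: 100000000 }
```

The result could then be spread into the limits option of a batch call. Token quotas, such as the enqueued-token limits in the notes above, are not covered by this sketch.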

For very large workloads, split into multiple batches.
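
A minimal sketch of that splitting step, assuming a count-based split is enough. The chunk helper is hypothetical and not part of Batchwork, and the 50,000 group size is just the default request guardrail mentioned earlier.

```typescript
// Hypothetical helper: split an array into consecutive groups of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// 120,000 placeholder requests, larger than a single 50,000-request batch.
const requests = Array.from({ length: 120_000 }, (_, i) => ({
  customId: `req-${i}`,
  prompt: "…",
}));

const groups = chunk(requests, 50_000);
console.log(groups.map((g) => g.length)); // [ 50000, 50000, 20000 ]
```

Each group could then be passed as requests to its own batch call. Splitting by count alone does not guarantee that a group fits the byte limits, so large request bodies may call for smaller groups.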
