

POST /v1/chat/completions is the primary inference endpoint. It is fully compatible with the OpenAI Chat Completions API, so you can point any OpenAI SDK at Darkbloom by changing the base URL — no other code changes required. The endpoint supports both streaming (server-sent events) and non-streaming responses.
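
For example, with the official openai Python package (a minimal sketch; the placeholder key and model ID follow the curl examples below):

from openai import OpenAI

# Point the OpenAI SDK at Darkbloom by changing only the base URL.
# The key is a placeholder in the format shown in the examples below.
client = OpenAI(
    base_url="https://api.darkbloom.dev/v1",
    api_key="eigeninference-your-key-here",
)

resp = client.chat.completions.create(
    model="qwen3.5-27b-claude-opus-8bit",
    messages=[{"role": "user", "content": "Explain gradient descent in one paragraph."}],
)
print(resp.choices[0].message.content)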

Request parameters

model
string, required
The model ID to use for this request. See Models for the full list of available IDs.

messages
object[], required
The conversation history as an array of message objects. Each message has a role (system, user, or assistant) and a content string.

stream
boolean, default: false
When true, the response is delivered as a stream of server-sent events. Each event contains a delta chunk in the standard OpenAI SSE format, and the stream ends with data: [DONE].

max_tokens
number, default: 8192
Maximum number of tokens to generate. If you do not set this, the coordinator injects a default of 8192 to bound the worst-case cost and ensure the pre-flight balance reservation covers the full generation.

temperature
number, default: 1.0
Sampling temperature between 0 and 2. Lower values produce more deterministic output.

Examples

Basic request

curl https://api.darkbloom.dev/v1/chat/completions \
  -H "Authorization: Bearer eigeninference-your-key-here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3.5-27b-claude-opus-8bit",
    "messages": [
      {"role": "user", "content": "Explain gradient descent in one paragraph."}
    ]
  }'

Streaming

curl https://api.darkbloom.dev/v1/chat/completions \
  -H "Authorization: Bearer eigeninference-your-key-here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3.5-27b-claude-opus-8bit",
    "messages": [
      {"role": "user", "content": "Write a haiku about distributed systems."}
    ],
    "stream": true
  }'
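
If you are consuming the stream from Python rather than curl, the openai SDK handles the SSE framing for you. A minimal sketch, assuming the same placeholder key and model ID as above:

from openai import OpenAI

client = OpenAI(
    base_url="https://api.darkbloom.dev/v1",
    api_key="eigeninference-your-key-here",
)

# stream=True makes the SDK parse the server-sent events and yield delta
# chunks until the terminating data: [DONE] event.
stream = client.chat.completions.create(
    model="qwen3.5-27b-claude-opus-8bit",
    messages=[{"role": "user", "content": "Write a haiku about distributed systems."}],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()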

With a system prompt

curl https://api.darkbloom.dev/v1/chat/completions \
  -H "Authorization: Bearer eigeninference-your-key-here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mlx-community/Qwen3.5-122B-A10B-8bit",
    "messages": [
      {"role": "system", "content": "You are a concise technical assistant. Answer in bullet points only."},
      {"role": "user", "content": "What are the trade-offs of using SSE vs WebSockets?"}
    ],
    "max_tokens": 512,
    "temperature": 0.3
  }'

Response headers

Every response from the inference endpoint includes trust metadata headers that let you verify which provider handled the request and inspect their attestation status.
x-provider-attested
true if the provider passed the most recent attestation challenge

x-provider-trust-level
self_signed or hardware; see Trust levels

x-provider-chip
Apple Silicon chip model (e.g. M3 Ultra)

x-provider-serial
Provider machine serial number

x-se-signature
Secure Enclave signature over the response hash

x-response-hash
SHA-256 hash of the response body, signed by the Secure Enclave
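
To inspect these headers from the openai Python SDK, you can use its with_raw_response helper, which exposes the underlying HTTP response (a sketch, using the same placeholders as the examples above):

from openai import OpenAI

client = OpenAI(
    base_url="https://api.darkbloom.dev/v1",
    api_key="eigeninference-your-key-here",
)

# with_raw_response returns the raw HTTP response, so the trust metadata
# headers are available alongside the parsed completion.
raw = client.chat.completions.with_raw_response.create(
    model="qwen3.5-27b-claude-opus-8bit",
    messages=[{"role": "user", "content": "ping"}],
)
print(raw.headers.get("x-provider-attested"))
print(raw.headers.get("x-provider-trust-level"))
print(raw.headers.get("x-provider-serial"))
completion = raw.parse()  # the usual ChatCompletion object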
You can use x-provider-serial to pin requests to a specific provider machine you have independently verified. Pass it as provider_serial in the request body and the coordinator will route only to that machine.
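
Because provider_serial is not a standard OpenAI parameter, the openai SDK will not accept it as a keyword argument; pass it through extra_body instead (a sketch; the serial value is a hypothetical placeholder for one you have verified yourself):

from openai import OpenAI

client = OpenAI(
    base_url="https://api.darkbloom.dev/v1",
    api_key="eigeninference-your-key-here",
)

# extra_body merges extra fields into the JSON request body, so the
# coordinator sees provider_serial and routes only to that machine.
# "C02XXXXXXXXX" is a hypothetical serial, not a real provider.
resp = client.chat.completions.create(
    model="qwen3.5-27b-claude-opus-8bit",
    messages=[{"role": "user", "content": "ping"}],
    extra_body={"provider_serial": "C02XXXXXXXXX"},
)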
Other endpoints

POST /v1/responses
OpenAI Responses API alias. The coordinator auto-detects the input vs messages field shape and routes to the same underlying handler.

POST /v1/completions
Legacy text completions endpoint. Accepts a prompt string instead of a messages array. This is the older OpenAI /v1/completions format and is not recommended for new integrations.

POST /v1/messages
Anthropic Messages API compatible endpoint. Use this if you are working with the Anthropic Python or TypeScript SDK. See the note below.
To use the Anthropic SDK, set base_url="https://api.darkbloom.dev" (without /v1) and pass your Darkbloom API key as api_key. The Anthropic SDK appends /v1/messages automatically.
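
For example, with the anthropic Python package (a minimal sketch; the model ID and key are the same placeholders used above):

import anthropic

# Note the base URL has no /v1 suffix; the Anthropic SDK appends
# /v1/messages itself.
client = anthropic.Anthropic(
    base_url="https://api.darkbloom.dev",
    api_key="eigeninference-your-key-here",
)

message = client.messages.create(
    model="qwen3.5-27b-claude-opus-8bit",
    max_tokens=512,
    messages=[{"role": "user", "content": "What are the trade-offs of SSE vs WebSockets?"}],
)
print(message.content[0].text)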

Thinking tags

Some models emit extended thinking wrapped in <think>...</think> tags. These tags are stripped from responses by default before the content reaches you. The final content field contains only the model’s visible output.