Every OpenAI-compatible client, configured

2026-06-17 · ~1500 words

Phantom is OpenAI-wire-compatible. Anything that already speaks https://api.openai.com/v1 works by switching base_url to https://phantom.codes/v1 and api_key to a Phantom key. This post is the copy-paste cheat sheet.

▸ The universal pattern

Every client below uses the same two values:

base URL: https://phantom.codes/v1
API key: the sk-... string you get back after payment

If a client takes those as environment variables, the canonical names are OPENAI_API_BASE (or OPENAI_BASE_URL in newer SDKs) and OPENAI_API_KEY. Set them once and most tools pick them up automatically.
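If you are writing your own glue code rather than relying on an SDK, the lookup can be sketched as below. This is a minimal sketch: the function name `resolve_openai_settings` is hypothetical, and checking OPENAI_BASE_URL before OPENAI_API_BASE is an assumption of this example, not documented behavior of any particular SDK.

```python
import os

# Default taken from this post; override via the environment.
DEFAULT_BASE_URL = "https://phantom.codes/v1"


def resolve_openai_settings(env=None):
    """Return (base_url, api_key) from the variable names mentioned above.

    Precedence (newer name first) is an assumption made for this sketch.
    """
    env = os.environ if env is None else env
    base_url = (
        env.get("OPENAI_BASE_URL")
        or env.get("OPENAI_API_BASE")
        or DEFAULT_BASE_URL
    )
    api_key = env.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    # Strip a trailing slash so paths can be appended consistently.
    return base_url.rstrip("/"), api_key
```

Passing a plain dict as `env` makes the function easy to exercise without touching the real environment.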

▸ openai-python

from openai import OpenAI

client = OpenAI(
    base_url="https://phantom.codes/v1",
    api_key="sk-...",
)

resp = client.chat.completions.create(
    model="phantom/deepseek-v4-flash",
    messages=[{"role": "user", "content": "hello"}],
)
print(resp.choices[0].message.content)

Streaming is identical: set stream=True and iterate over the response. Tool calling, function calling, vision, embeddings, and image generation all forward unchanged.

▸ openai-node

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://phantom.codes/v1",
  apiKey: process.env.OPENAI_API_KEY,
});

const resp = await client.chat.completions.create({
  model: "phantom/deepseek-v4-flash",
  messages: [{ role: "user", content: "hello" }],
});
console.log(resp.choices[0].message.content);

▸ LangChain (Python)

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="phantom/deepseek-v4-flash",
    base_url="https://phantom.codes/v1",
    api_key="sk-...",
)

print(llm.invoke("hello").content)

LangGraph picks up the same configuration through the underlying LangChain LLM. No additional setup.

▸ Cline (VS Code)

Cline is an autonomous coding agent extension for VS Code. In Cline's settings:

API Provider: OpenAI Compatible
Base URL: https://phantom.codes/v1
API Key: sk-...
Model ID: phantom/deepseek-v4-flash (or any model from /models.html)

Pick a model with strong tool-calling support. For agentic coding, phantom/deepseek-v4-flash, phantom/deepseek-v4-pro, or the proxy-tier anthropic/claude-sonnet-5 work well.

▸ Aider

export OPENAI_API_BASE="https://phantom.codes/v1"
export OPENAI_API_KEY="sk-..."

aider --model openai/phantom/deepseek-v4-flash

Aider expects OpenAI-compatible models to carry an openai/ prefix. Everything after that prefix, including the second slash, is the Phantom model name.

▸ Continue (VS Code, JetBrains)

Continue's config.json (or config.yaml in newer versions) accepts an OpenAI-style provider block:

{
  "models": [
    {
      "title": "Phantom",
      "provider": "openai",
      "model": "phantom/deepseek-v4-flash",
      "apiBase": "https://phantom.codes/v1",
      "apiKey": "sk-..."
    }
  ]
}

▸ opencode

export OPENAI_API_BASE="https://phantom.codes/v1"
export OPENAI_API_KEY="sk-..."

opencode --model phantom/deepseek-v4-flash

▸ Crush

Crush (Charm's terminal coding agent) reads a crush.json in the project root, or ~/.config/crush/crush.json:

{
  "$schema": "https://charm.land/crush.json",
  "providers": {
    "phantom": {
      "type": "openai-compat",
      "base_url": "https://phantom.codes/v1",
      "api_key": "$OPENAI_API_KEY",
      "models": [
        {
          "id": "phantom/deepseek-v4-flash",
          "name": "DeepSeek V4 Flash",
          "context_window": 1048576,
          "default_max_tokens": 8192
        }
      ]
    }
  }
}

The type must be "openai-compat"; "openai" is reserved for OpenAI itself. Leave models empty and Crush pulls the whole catalog from /v1/models. Full setup for coding agents is at /code.html.

▸ OpenHands (formerly OpenDevin)

export LLM_API_KEY="sk-..."
export LLM_BASE_URL="https://phantom.codes/v1"
export LLM_MODEL="openai/phantom/deepseek-v4-flash"

docker run -e LLM_API_KEY -e LLM_BASE_URL -e LLM_MODEL \
  -v "$PWD:/workspace" docker.all-hands.dev/all-hands-ai/openhands

OpenHands uses LiteLLM internally. The openai/ prefix routes through the OpenAI-compatible adapter.

▸ OpenWebUI

In OpenWebUI, go to Settings → Connections → OpenAI API:

API Base URL: https://phantom.codes/v1
API Key: sk-...

Models pull automatically from /v1/models. Pick one from the dropdown in chat. The Phantom catalog includes vision models, so OpenWebUI's image upload works against models that support it.

▸ LiteLLM

import litellm

resp = litellm.completion(
    model="openai/phantom/deepseek-v4-flash",
    api_base="https://phantom.codes/v1",
    api_key="sk-...",
    messages=[{"role": "user", "content": "hello"}],
)
print(resp.choices[0].message.content)

LiteLLM's proxy server config (litellm_config.yaml) uses the same fields under model_list[].litellm_params.

▸ Cursor

Cursor's API Keys settings allow an OpenAI override:

Go to Settings → Models → OpenAI API Key
Enable Override OpenAI Base URL
Base URL: https://phantom.codes/v1
API Key: sk-...

Cursor's built-in model list does not include Phantom model names, so you may need to configure a custom model with the Phantom model ID, or pick a listed model whose ID happens to match one in the Phantom catalog.

▸ Anything else with OPENAI_API_BASE

For any tool that respects the OpenAI SDK environment variables, the universal recipe is:

export OPENAI_API_BASE="https://phantom.codes/v1"
export OPENAI_BASE_URL="https://phantom.codes/v1"
export OPENAI_API_KEY="sk-..."

Set both OPENAI_API_BASE and OPENAI_BASE_URL because different SDK versions read different variable names. Setting both is harmless.
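When launching a tool from a script rather than a shell, the same recipe can be applied to the child process only. A minimal sketch, assuming a hypothetical helper named `phantom_env`; the tool name in the usage comment is a placeholder.

```python
import os


def phantom_env(api_key, base_url="https://phantom.codes/v1", parent=None):
    """Copy the parent environment and set all three variables from the recipe."""
    env = dict(os.environ if parent is None else parent)
    env["OPENAI_API_BASE"] = base_url   # older variable name
    env["OPENAI_BASE_URL"] = base_url   # newer variable name
    env["OPENAI_API_KEY"] = api_key
    return env


# Usage (placeholder command, not executed here):
#   import subprocess
#   subprocess.run(["some-tool"], env=phantom_env("sk-..."))
```

Because the function returns a copy, the calling process keeps whatever OpenAI settings it already had.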

▸ Model naming

Phantom models are namespaced as phantom/<name>. Some clients require a vendor prefix in front of that (Aider, OpenHands, LiteLLM use openai/). Most don't.
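The prefix rule can be captured in a few lines. This is only an illustration: `model_id_for` and the `NEEDS_OPENAI_PREFIX` set are hypothetical, and the set contains just the three clients named above.

```python
# Clients this post lists as needing the openai/ vendor prefix.
NEEDS_OPENAI_PREFIX = {"aider", "openhands", "litellm"}


def model_id_for(client_name, phantom_model):
    """Return the model string to pass to a given client."""
    name = client_name.strip().lower()
    if name in NEEDS_OPENAI_PREFIX and not phantom_model.startswith("openai/"):
        return "openai/" + phantom_model
    return phantom_model
```

For example, `model_id_for("Aider", "phantom/deepseek-v4-flash")` gives the openai/phantom/deepseek-v4-flash string used in the Aider section, while clients outside the set get the plain phantom/ ID.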

The full catalog with live pricing is at /models.html or via GET /v1/models. The catalog includes:

TEE-attested open-weight: DeepSeek V4 Flash, DeepSeek V4 Pro, Llama 3.3 70B, Mistral, Gemma, and many more
Closed-weight proxy: Claude Sonnet 5, GPT-5.5, Gemini 2.5 Pro, Grok 4
Embeddings: multiple dimensions and models
Image generation: SDXL, Flux, and proxy-tier image models
Vision: models supporting image_url content parts
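To list the catalog from a script, something like the sketch below should work. It assumes GET /v1/models returns the usual OpenAI list shape, a JSON object with a `data` array of entries that each have an `id`; the function names are hypothetical.

```python
import json
import urllib.request


def parse_model_ids(payload):
    """Extract sorted model IDs from an OpenAI-style list response."""
    return sorted(item["id"] for item in payload.get("data", []))


def fetch_model_ids(api_key, base_url="https://phantom.codes/v1"):
    """Call GET /models and return the IDs (needs network access and a key)."""
    req = urllib.request.Request(
        base_url + "/models",
        headers={"Authorization": "Bearer " + api_key},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return parse_model_ids(json.load(resp))
```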

▸ What works and what doesn't

What works unchanged: chat completions, streaming, tool calling, function calling, vision (image_url content parts), embeddings, image generation, JSON mode (where the upstream supports it).

What's deliberately dropped at the proxy: user, metadata, logit_bias, logprobs, top_logprobs, seed. These are fingerprint-rich and not necessary for inference. The body whitelist is documented at /docs.html#endpoints.
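To preview what a request looks like once those fields are gone, you can strip them client-side. This sketch is a deny-list built from the six fields named above, used purely for illustration; the proxy itself applies the whitelist documented at /docs.html#endpoints, which this code does not reproduce.

```python
# Fields this post says are dropped at the proxy.
DROPPED_FIELDS = ("user", "metadata", "logit_bias", "logprobs", "top_logprobs", "seed")


def strip_dropped_fields(body):
    """Return a copy of a request body without the dropped fields."""
    return {key: value for key, value in body.items() if key not in DROPPED_FIELDS}
```

If your application depends on `seed` or `logprobs`, this is a quick way to see that those parameters will have no effect.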

What's not supported: OpenAI Assistants API (stateful threads, file storage on OpenAI's side), ChatGPT plugins, vendor-specific extensions outside the chat-completions surface.

▸ Verifying it works

Smoke test from a terminal:

curl https://phantom.codes/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"phantom/deepseek-v4-flash","messages":[{"role":"user","content":"hi"}]}'

If you see a JSON response with choices[0].message.content, the integration is working. Per-request attestation is at /docs.html#attestation.
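The same smoke test in Python, using only the standard library, might look like this. The function names are hypothetical; the URL, headers, and body mirror the curl command above.

```python
import json
import urllib.request


def build_smoke_request(api_key, base_url="https://phantom.codes/v1",
                        model="phantom/deepseek-v4-flash"):
    """Build the POST request equivalent to the curl smoke test."""
    body = {"model": model, "messages": [{"role": "user", "content": "hi"}]}
    return urllib.request.Request(
        base_url + "/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def run_smoke_test(api_key):
    """Send the request and return the reply text (needs network and a key)."""
    with urllib.request.urlopen(build_smoke_request(api_key), timeout=60) as resp:
        payload = json.load(resp)
    return payload["choices"][0]["message"]["content"]
```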

To check the remaining balance:

curl -H "Authorization: Bearer $OPENAI_API_KEY" https://phantom.codes/v1/key/balance

The response reports the remaining credit in micro-USD.
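Since the balance is expressed in micro-USD, a small conversion helps when displaying it. This sketch assumes you have already extracted the integer amount from the response; the response's field names are not specified here, so the helper `micro_usd_to_usd` (a hypothetical name) takes the number directly.

```python
from decimal import Decimal

MICRO_PER_USD = Decimal(1_000_000)  # 1 USD = 1,000,000 micro-USD


def micro_usd_to_usd(micro_usd):
    """Convert an integer micro-USD amount to dollars without float rounding."""
    return Decimal(micro_usd) / MICRO_PER_USD
```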