// DOCUMENTATION

PHANTOM API

OpenAI-compatible. Anonymous payment. Hardware-attested inference.

// CONNECTION

Base URL       https://api.phantom.codes/v1
Tor (.onion)   http://jzqvbfmrlt5ye467joz75dg6xdurc6bniozautqlil5b3tbf777zmsid.onion/v1
Auth header    Authorization: Bearer sk-...
Get a key      Buy credit at phantom.codes. Pay XMR, receive sk-... once.
Compatibility  Drop-in OpenAI replacement. Point any OpenAI SDK at the Base URL above.

// 01

Quick start

▸ curl

$ curl https://api.phantom.codes/v1/chat/completions \
    -H "Authorization: Bearer YOUR_PHANTOM_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "phala/qwen-2.5-7b-instruct",
      "messages": [{"role":"user","content":"Hello"}]
    }'

▸ Python (openai client)

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_PHANTOM_KEY",
    base_url="https://api.phantom.codes/v1",
)

r = client.chat.completions.create(
    model="phala/kimi-k2.6",
    messages=[{"role": "user", "content": "Hello"}],
)
print(r.choices[0].message.content)

▸ Node (openai client)

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "YOUR_PHANTOM_KEY",
  baseURL: "https://api.phantom.codes/v1",
});

const r = await client.chat.completions.create({
  model: "phala/gpt-oss-120b",
  messages: [{ role: "user", content: "Hello" }],
});
console.log(r.choices[0].message.content);

// 02

Don't have XMR?

Phantom takes Monero only. There are four paths; pick one by what you start with and how much friction you'll absorb.

▸ Get a wallet first

  • Cake Wallet (iOS/Android). Easiest. Built-in swap. Tor option.
  • Feather Wallet (desktop). Lightweight, Tor-first. Recommended for Phantom users.
  • Monero GUI (Windows/Mac/Linux). Official, full node. Power users.
  • Monerujo (Android). Mobile alternative to Cake.

▸ Path A. Have crypto already (BTC, USDT, ETH, etc.)

Swap to XMR. No account, no KYC, no signup.

  • Trocador. Aggregator over 10+ providers. Best rate auto-routed. KYC + log rating per provider. Default recommendation.
  • FixedFloat. Direct. 0.5% floating / 1% fixed.
  • MajesticBank. No-KYC, often best on small swaps. Onion mirror.
  • eXch. A-rated no-KYC. Tor by default.

Privacy on the source coin is your problem. BTC swaps leak the BTC side. Use Tor and a non-KYC source if your threat model requires it.

▸ Path B. Have a gift card (Amazon, Steam, Visa, etc.)

Convert your card to XMR on a P2P market. Phantom doesn't take gift cards directly. Too much fraud risk; we'd compromise our brand.

  • Easy mode: RoboSats. Trade gift card → BTC (Lightning) over Tor browser. No install. Then use Path A above (Trocador) to swap BTC → XMR.
  • Maximalist mode: RetoSwap. Trade gift card → XMR directly. Open source software you run locally. No third-party swap in between.

Both use escrow. Both handle fraud disputes. Phantom never sees your card.

▸ Path C. Have only fiat, no crypto yet

Buy XMR with cash, card, or bank transfer.

  • Cake Wallet built-in buy. Card or Apple Pay via routed providers. KYC by the buy provider, not Phantom. Fastest for normies.
  • RetoSwap. P2P with cash by mail, SEPA, ACH, in-person. No KYC. Slower but ID-free.
  • Kraken. KYC heavy, well-known. Buy XMR, withdraw to your wallet.

If you use a centralized exchange, withdraw immediately. Never leave XMR sitting on an exchange. They can freeze, lose, or be subpoenaed.

▸ Path D. Mine it

Slow, but the only path with zero counterparty. CPU-mineable. Mine solo from Monero GUI, or join a pool with a dedicated miner such as XMRig (Monerujo is a wallet, not a miner).

// FAIR WARNING

These are third-party services. Their privacy policies are theirs, not ours. Their domains may change. Verify before sending funds. We list them because they're the best options we know of today; we cannot vouch for them tomorrow. No affiliate links here.

First time? Cake Wallet + Path A (in-app swap) gets you to Phantom in ~10 minutes from zero.


// 03

Endpoints

METHOD  PATH                       AUTH    PURPOSE
POST    /v1/chat/completions       Bearer  OpenAI-compatible chat. Streams.
POST    /v1/embeddings             Bearer  OpenAI-compatible embeddings.
POST    /v1/images/generations     Bearer  Image generation (DALL-E, Stable Diffusion, Recraft).
GET     /v1/models                 none    List models + pricing (markup included).
GET     /v1/bundles                none    List credit bundles.
POST    /v1/purchase               none    Body: {"bundle":"small"} OR {"amount_usd":7.5}.
GET     /v1/purchase/{id}/status   none    Poll payment. Returns plaintext key once on completion.
GET     /v1/purchase/{id}/qr.svg   none    QR code (SVG) for the payment URI.
GET     /v1/key/balance            Bearer  Remaining credit + expiry.
POST    /v1/key/rotate             Bearer  Issue new key, carry credit, deactivate old.
GET     /v1/signature/{id}         Bearer  Per-response signature (step 2 of verification).
GET     /v1/inference-attest       Bearer  Phala TDX + NVIDIA CC attestation report (step 3).
GET     /health                    none    Liveness.

// 04

Purchase flow

Send POST /v1/purchase with either a bundle name or a custom USD amount.

$ curl -X POST https://api.phantom.codes/v1/purchase \
    -H "Content-Type: application/json" \
    -d '{"amount_usd": 25}'

You receive:

{
  "payment_id": "abc...",
  "xmr_address": "75z...",
  "xmr_amount":  "0.0645...",
  "bundle":      "custom",
  "credit_usd":  25,
  "expires_at":  "2026-05-19T06:00:00+00:00"
}

Send xmr_amount to xmr_address, then poll GET /v1/purchase/{id}/status, using payment_id as {id}. The status moves through these states:

pending → confirming → ready → completed

Key drops in response body when status flips to completed. Shown ONCE. Recovery is not possible.
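
The polling step can be sketched as a small loop. This is a minimal sketch, not a reference client: the source does not document the JSON shape of the status response, so the field names "status" and "api_key" are assumptions, as are the helper names. The default 5-second interval stays well under the 60/minute limit on the status endpoint.

```python
import json
import time
import urllib.request
from typing import Callable, Optional

# The four states listed in the purchase flow.
STATES = ("pending", "confirming", "ready", "completed")


def fetch_status_http(payment_id: str) -> dict:
    """Fetch one status snapshot (no auth header is needed for this endpoint)."""
    url = f"https://api.phantom.codes/v1/purchase/{payment_id}/status"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def wait_for_key(
    fetch_status: Callable[[], dict],
    interval_s: float = 5.0,
    max_polls: int = 720,  # 720 polls x 5 s is about one hour
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Poll until the purchase completes; return the key, or None on timeout."""
    for _ in range(max_polls):
        body = fetch_status()
        status = body.get("status")  # ASSUMPTION: field name is "status"
        if status not in STATES:
            raise ValueError(f"unexpected status: {status!r}")
        if status == "completed":
            # The key is shown once, so return it straight to the caller.
            return body.get("api_key")  # ASSUMPTION: field name is "api_key"
        sleep(interval_s)
    return None
```

A caller would pass something like `lambda: fetch_status_http(payment_id)` as `fetch_status` and store the returned key immediately, since it cannot be fetched a second time.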


// 05

Key rotation

If you suspect your key is compromised, rotate. Credit + expiry transfer to the new key. Old key deactivates.

$ curl -X POST https://api.phantom.codes/v1/key/rotate \
    -H "Authorization: Bearer OLD_KEY"
# → {"api_key": "sk-new...", "shown_once": true}
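
Because the rotated key is shown once, it is worth persisting it the moment the response arrives. The sketch below assumes the response shape shown above ("api_key" plus "shown_once"); the helper names and the file location are hypothetical, and the owner-only file mode applies to POSIX systems.

```python
import os


def key_from_rotate_response(body: dict) -> str:
    """Extract the new key from a /v1/key/rotate response body."""
    key = body.get("api_key")
    if not isinstance(key, str) or not key.startswith("sk-"):
        raise ValueError("rotate response did not contain an api_key")
    return key


def store_key(path: str, api_key: str) -> None:
    """Write the key to a temp file readable only by the owner, then swap it in.

    os.replace makes the swap atomic, so a crash never leaves a half-written
    key file. The 0o600 mode applies when the temp file is newly created.
    """
    tmp = f"{path}.tmp"
    fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as fh:
        fh.write(api_key + "\n")
    os.replace(tmp, path)
```

Writing the new key before discarding the old one matters here: the old key deactivates on rotation, so a lost response means lost credit.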

// 06

Streaming

Pass "stream": true. Responses arrive as Server-Sent Events (SSE). The final chunk includes a usage block used for billing. If you abort mid-stream, you are billed for the tokens already generated.
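
As an illustration of consuming the stream, the sketch below joins the content deltas and picks up the usage block. It assumes the chunk framing matches OpenAI's streaming format ("data: " lines carrying JSON, ended by a "data: [DONE]" sentinel); the source only states that the stream is SSE-compatible, so treat that framing as an assumption. The function name is hypothetical.

```python
import json
from typing import Iterable, Optional, Tuple


def collect_stream(lines: Iterable[str]) -> Tuple[str, Optional[dict]]:
    """Join content deltas from SSE lines and return (text, usage)."""
    parts = []
    usage = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and SSE comment lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # ASSUMPTION: OpenAI-style end sentinel
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta") or {}
            if delta.get("content"):
                parts.append(delta["content"])
        if chunk.get("usage"):
            usage = chunk["usage"]  # expected on the final chunk
    return "".join(parts), usage
```

If the returned usage is None, the stream ended before the final chunk arrived, which is the aborted case described above.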


// 07

Vision

Only phala/qwen3-vl-30b-a3b-instruct accepts images. Pass content as an array with image_url parts:

{
  "model": "phala/qwen3-vl-30b-a3b-instruct",
  "messages": [{
    "role": "user",
    "content": [
      {"type": "text", "text": "What's in this image?"},
      {"type": "image_url", "image_url": {"url": "https://example.com/cat.jpg"}}
    ]
  }]
}
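
For a local file rather than a public URL, OpenAI-style APIs commonly accept a base64 data URL in the same image_url field. Whether Phantom accepts data URLs is an assumption not confirmed by this page, so test with a small image first. The helper names below are hypothetical.

```python
import base64
import mimetypes


def image_part_from_bytes(data: bytes, filename: str) -> dict:
    """Wrap raw image bytes as an image_url content part using a data URL."""
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    encoded = base64.b64encode(data).decode("ascii")
    return {"type": "image_url", "image_url": {"url": f"data:{mime};base64,{encoded}"}}


def vision_request(prompt: str, image_part: dict) -> dict:
    """Build a chat completion body with one text part and one image part."""
    return {
        "model": "phala/qwen3-vl-30b-a3b-instruct",
        "messages": [{
            "role": "user",
            "content": [{"type": "text", "text": prompt}, image_part],
        }],
    }
```

The resulting dict has the same shape as the JSON body above and can be sent to /v1/chat/completions as is.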

// 08

Image generation

OpenAI-compatible image API. Flat-rate billing per image. No tokens. Generated URLs from upstream live for ~1 hour — download them.

▸ curl

$ curl https://api.phantom.codes/v1/images/generations \
    -H "Authorization: Bearer YOUR_PHANTOM_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "stability/stable-diffusion-3-5-large",
      "prompt": "A misty cyberpunk alley at dawn, anamorphic lens, 35mm film",
      "n": 1,
      "size": "1024x1024",
      "quality": "standard"
    }'

▸ Python (openai client)

from openai import OpenAI

client = OpenAI(api_key="YOUR_PHANTOM_KEY", base_url="https://api.phantom.codes/v1")

img = client.images.generate(
    model="stability/stable-diffusion-3-5-large",
    prompt="A serene mountain at sunset with aurora borealis",
    n=1,
    size="1024x1024",
    quality="standard",
)
print(img.data[0].url)

▸ Parameters

FIELD            TYPE     NOTES
model            string   One of the image models in /v1/models (kind: image).
prompt           string   Required. Max 4000 chars.
n                integer  Images to generate. 1–10. Charged per image.
size             string   256x256, 512x512, 1024x1024, 1536x1536, 2048x2048. Bounded by model's max_size.
quality          string   standard or hd. Affects price.
response_format  string   url (default) or b64_json.
style            string   Model-specific (e.g. vivid / natural for DALL-E).
negative_prompt  string   Model-specific. Stability + Recraft honor this.
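
Requesting response_format "b64_json" avoids depending on the short-lived upstream URLs. A minimal sketch for writing the results to disk follows; it assumes the response mirrors OpenAI's shape (a "data" list whose items carry "b64_json"). The ".png" extension is a guess, since the page does not state the image format, and the function name is hypothetical.

```python
import base64
from pathlib import Path
from typing import List


def save_b64_images(response: dict, out_dir: str, stem: str = "image") -> List[Path]:
    """Decode every b64_json item in an images response and write it to out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, item in enumerate(response.get("data", [])):
        raw = base64.b64decode(item["b64_json"])
        path = out / f"{stem}-{i}.png"  # ASSUMPTION: PNG; adjust to the real format
        path.write_bytes(raw)
        paths.append(path)
    return paths
```

With the openai Python client, the parsed response can be converted to a plain dict first (for example via `img.model_dump()`) before it is passed in.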

▸ Pricing

Flat per image at the requested quality. Marked up over the vendor's wholesale rate. Live numbers come from GET /v1/models under image_pricing_usd_user. Indicative:

MODEL                                  STANDARD  HD
segmind/sd3-turbo                      ~$0.02    ~$0.03
openai/dall-e-3                        ~$0.06    ~$0.12
stability/stable-diffusion-3-5-medium  ~$0.05    ~$0.09
stability/stable-diffusion-3-5-large   ~$0.06    ~$0.12
stability/stable-diffusion-ultra       ~$0.12    ~$0.18
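
Since billing is flat per image, a request costs roughly n times the per-image price at the chosen quality. The sketch below hard-codes the indicative figures from the table above purely for illustration. Live prices come from GET /v1/models, and the exact layout of image_pricing_usd_user is not documented here, so the dictionary shape is an assumption.

```python
from decimal import Decimal

# Indicative per-image prices in USD, copied from the table above.
# ASSUMPTION: keyed by model, then by quality; live values may differ.
INDICATIVE_USD = {
    "segmind/sd3-turbo": {"standard": "0.02", "hd": "0.03"},
    "openai/dall-e-3": {"standard": "0.06", "hd": "0.12"},
    "stability/stable-diffusion-3-5-medium": {"standard": "0.05", "hd": "0.09"},
    "stability/stable-diffusion-3-5-large": {"standard": "0.06", "hd": "0.12"},
    "stability/stable-diffusion-ultra": {"standard": "0.12", "hd": "0.18"},
}


def estimate_cost(model: str, n: int, quality: str = "standard",
                  prices: dict = INDICATIVE_USD) -> Decimal:
    """Estimate the charge for one request: n images at a flat per-image price."""
    if not 1 <= n <= 10:
        raise ValueError("n must be between 1 and 10")
    return Decimal(prices[model][quality]) * n
```

For example, four hd images from openai/dall-e-3 come to about $0.48 at the indicative rate. Decimal is used so that cent-level sums do not pick up floating-point noise.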

// PRIVACY NOTE

All current image models are tier PROXY — gateway runs in TDX but the model itself runs on the vendor's normal infrastructure (Stability, OpenAI, Recraft, Segmind). They see your prompt content. Phantom hides only your identity. Don't generate images of anything you'd be unhappy publishing.


// 09

Attestation

Phala provides two cryptographic layers, exposed through three endpoints:

  • Per-response signature: ECDSA signature bound to a specific chat completion id.
  • Per-model attestation report: Intel TDX (CPU) quote + NVIDIA Confidential Computing (GPU) attestation, tied to the signing address that signed your response.

▸ Three-step verification

Bind each individual inference to verified TEE hardware:

Step 1. Make any chat completion. Capture response.id.

$ curl https://api.phantom.codes/v1/chat/completions \
    -H "Authorization: Bearer YOUR_KEY" \
    -H "Content-Type: application/json" \
    -d '{"model":"phala/kimi-k2.6","messages":[{"role":"user","content":"hi"}]}'
# → { "id": "chatcmpl-abc123...", ... }

Step 2. Fetch the signature for that specific response id.

$ curl "https://api.phantom.codes/v1/signature/chatcmpl-abc123...?model=phala/kimi-k2.6" \
    -H "Authorization: Bearer YOUR_KEY"
# → { "text": "...", "signature": "0x...", "signing_address": "0x56d0...8b" }

Step 3. Fetch the attestation report. Bind it to the same signing_address and a fresh nonce of your choosing.

$ NONCE=$(openssl rand -hex 32)
$ curl "https://api.phantom.codes/v1/inference-attest?model=phala/kimi-k2.6&nonce=$NONCE&signing_address=0x56d0...8b" \
    -H "Authorization: Bearer YOUR_KEY"
# → { intel_quote, nvidia_payload, signing_address, nonce, event_log, ... }

▸ Offline verification

Verify the Intel TDX quote (CPU):

POST https://cloud-api.phala.com/api/v1/attestations/verify  # with intel_quote

Verify the NVIDIA Confidential Computing payload (GPU):

POST https://nras.attestation.nvidia.com/v3/attest/gpu  # with nvidia_payload

Verify the signature locally: recover the public key from signature over the canonical text field, confirm it matches signing_address. The attestation report binds signing_address to the genuine TEE that produced it. Together: response → signature → TEE quote chain.
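
Before running the cryptographic checks, it helps to confirm that the step 2 and step 3 responses actually refer to each other. The sketch below covers only those consistency checks: same signing_address, the nonce echoed back, and both hardware payloads present. It does not verify the TDX quote, the NVIDIA payload, or the ECDSA signature; those still go through the verifiers named above. The field names come from the sample responses on this page; the function name is hypothetical.

```python
from typing import List


def check_binding(signature_resp: dict, attest_resp: dict, nonce: str) -> List[str]:
    """Return a list of problems; an empty list means the responses line up."""
    problems = []

    signed_by = str(signature_resp.get("signing_address", "")).lower()
    attested = str(attest_resp.get("signing_address", "")).lower()
    if not signed_by or signed_by != attested:
        problems.append("signing_address mismatch")

    # The report must echo the fresh nonce, otherwise it could be a replay.
    if str(attest_resp.get("nonce", "")).lower() != nonce.lower():
        problems.append("nonce mismatch")

    for field in ("intel_quote", "nvidia_payload"):
        if not attest_resp.get(field):
            problems.append(f"missing {field}")

    return problems
```

Addresses are compared case-insensitively because hex addresses are often returned in mixed case.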


// 10

Paste-to-agent

Copy any block below into your AI agent's config. Replace sk-your-phantom-key with the key you get after purchase.

▸ Shell env (works with most agent SDKs)

LangChain, LlamaIndex, Vercel AI SDK, Pydantic-AI, llm, mods, shell-gpt, aichat, chatblade, tgpt — anything that respects OpenAI env.

export OPENAI_BASE_URL=https://api.phantom.codes/v1
export OPENAI_API_KEY=sk-your-phantom-key

▸ Claude Code / Cursor / Cline (OpenAI-compatible providers)

In each tool's settings UI pick "OpenAI Compatible" and paste:

base_url:  https://api.phantom.codes/v1
api_key:   sk-your-phantom-key
model:     phala/kimi-k2.6

▸ opencode (sst) — ~/.config/opencode/opencode.json

{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "phantom": {
      "npm": "@ai-sdk/openai-compatible",
      "options": {
        "baseURL": "https://api.phantom.codes/v1",
        "apiKey": "{env:PHANTOM_API_KEY}"
      },
      "models": {
        "phala/kimi-k2.6":     {"name": "Kimi K2.6"},
        "phala/glm-5.1":       {"name": "GLM 5.1"},
        "phala/gpt-oss-120b":  {"name": "GPT-OSS 120B"},
        "phala/glm-4.7-flash": {"name": "GLM 4.7 Flash"}
      }
    }
  }
}

▸ Continue.dev — ~/.continue/config.json

{
  "models": [{
    "title":    "Phantom Kimi",
    "provider": "openai",
    "model":    "phala/kimi-k2.6",
    "apiKey":   "sk-your-phantom-key",
    "apiBase":  "https://api.phantom.codes/v1"
  }]
}

▸ Aider

aider --openai-api-base https://api.phantom.codes/v1 \
      --openai-api-key  sk-your-phantom-key \
      --model           openai/phala/kimi-k2.6

▸ OpenHands

export LLM_API_KEY=sk-your-phantom-key
export LLM_BASE_URL=https://api.phantom.codes/v1
export LLM_MODEL=openai/phala/kimi-k2.6
docker run -it ghcr.io/all-hands-ai/openhands:main

▸ Drop-in for an LLM prompt

Paste this into your LLM and tell it "configure my project to use phantom":

Use phantom (phantom.codes) as my LLM provider.

It is OpenAI-API-compatible. Set:
  base_url = https://api.phantom.codes/v1
  api_key  = sk-your-phantom-key  (provided separately)

Pick a model from: https://api.phantom.codes/v1/models
Defaults that work well:
  phala/kimi-k2.6        - long-horizon coding agent
  phala/glm-5.1          - premium reasoning + coding
  phala/gpt-oss-120b     - balanced default
  phala/glm-4.7-flash    - fast agentic coding, 202K ctx

For image generation use POST /v1/images/generations with model
stability/stable-diffusion-3-5-large or openai/dall-e-3.

For embeddings use POST /v1/embeddings with model
qwen/qwen3-embedding-8b.

Wire the SDK exactly like OpenAI (same shape, same auth header).

// 11

Integrations

Phantom is OpenAI-compatible. Works with anything that takes baseUrl + apiKey.

▸ Any OpenAI-compatible CLI

Two env vars in your shell config (~/.zshrc or ~/.bashrc):

$ export OPENAI_BASE_URL=https://api.phantom.codes/v1
$ export OPENAI_API_KEY=sk-your-phantom-key

Works with: aichat, llm (simonw), mods (charm), chatblade, shell-gpt, tgpt, ollama (via --openai compat), chatgpt-cli, plus any agent SDK that reads OPENAI_BASE_URL (LangChain, LlamaIndex, Vercel AI SDK, Pydantic-AI). Set the model name in each tool's config to one from /v1/models, e.g. phala/kimi-k2.6.

Wrappers around Moonshot's Kimi or other proprietary CLIs that hardcode their vendor URL won't work. They bypass your env. Use any of the above generic CLIs and select the phala/kimi-k2.6 model instead.

▸ opencode (sst)

~/.config/opencode/opencode.json:

{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "phantom": {
      "npm": "@ai-sdk/openai-compatible",
      "options": {
        "baseURL": "https://api.phantom.codes/v1",
        "apiKey": "{env:PHANTOM_API_KEY}"
      },
      "models": {
        "phala/kimi-k2.6":    {"name": "Kimi K2.6"},
        "phala/glm-5.1":      {"name": "GLM 5.1"},
        "phala/gpt-oss-120b": {"name": "GPT-OSS 120B"},
        "phala/glm-4.7-flash":{"name": "GLM 4.7 Flash"}
      }
    }
  }
}

▸ OpenHands

$ export LLM_API_KEY=sk-your-key
$ export LLM_BASE_URL=https://api.phantom.codes/v1
$ export LLM_MODEL=openai/phala/kimi-k2.6
$ docker run -it ghcr.io/all-hands-ai/openhands:main

▸ Aider

$ aider --openai-api-base https://api.phantom.codes/v1 \
        --openai-api-key sk-your-key \
        --model openai/phala/kimi-k2.6

▸ Cline (VS Code)

Settings → API Provider: OpenAI Compatible
Base URL: https://api.phantom.codes/v1
API Key: your sk-...
Model: phala/kimi-k2.6 (or any from /v1/models).

▸ Continue.dev

~/.continue/config.json:

{
  "models": [{
    "title": "Phantom Kimi",
    "provider": "openai",
    "model": "phala/kimi-k2.6",
    "apiKey": "sk-your-key",
    "apiBase": "https://api.phantom.codes/v1"
  }]
}

▸ OpenClaw

~/.openclaw/openclaw.json:

{
  models: {
    mode: "merge",
    providers: {
      "phantom": {
        baseUrl: "https://api.phantom.codes/v1",
        apiKey: "${PHANTOM_API_KEY}",
        api: "openai-completions",
        models: [
          { id: "phala/kimi-k2.6",      contextWindow: 262144, maxTokens: 32000 },
          { id: "phala/glm-5.1",        contextWindow: 202752, maxTokens: 32000 },
          { id: "phala/gpt-oss-120b",   contextWindow: 131072, maxTokens: 32000 },
          { id: "phala/glm-4.7-flash",  contextWindow: 202752, maxTokens: 32000 }
        ]
      }
    }
  }
}

▸ OpenWebUI

Settings → Connections → OpenAI API
URL: https://api.phantom.codes/v1
Key: your sk-...

// FUNCTION-CALLING NOTE

Agentic tools (opencode, Cline, OpenHands) lean on tool/function calling. Pick a capable model:

  • Strong: phala/kimi-k2.6, phala/glm-5.1
  • Decent: phala/glm-4.7-flash, phala/gpt-oss-120b
  • Chat-only: phala/qwen-2.5-7b-instruct, phala/uncensored-24b

// 12

Rate limits

ENDPOINT                  LIMIT      KEYED BY
/v1/chat/completions      60/minute  API key hash
/v1/purchase              10/minute  IP hash (never logged)
/v1/purchase/{id}/status  60/minute  IP hash
/v1/key/rotate            5/hour     API key hash
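
A client can stay under these limits by throttling itself. The sketch below is a simple sliding-window limiter. Whether the server counts requests in a sliding or a fixed window is not stated here, so this is a conservative client-side approximation, and the class name is hypothetical.

```python
import time
from collections import deque
from typing import Callable


class SlidingWindowLimiter:
    """Allow at most `limit` requests in any `window_s`-second span."""

    def __init__(self, limit: int, window_s: float,
                 clock: Callable[[], float] = time.monotonic):
        self.limit = limit
        self.window_s = window_s
        self.clock = clock
        self._sent = deque()  # timestamps of recent requests, oldest first

    def wait_time(self) -> float:
        """Seconds to wait before the next request fits in the window."""
        now = self.clock()
        # Forget requests that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_s:
            self._sent.popleft()
        if len(self._sent) < self.limit:
            return 0.0
        return self.window_s - (now - self._sent[0])

    def record(self) -> None:
        """Call right after sending a request."""
        self._sent.append(self.clock())
```

For chat completions, one would create `SlidingWindowLimiter(limit=60, window_s=60.0)`, sleep for `wait_time()` before each call, and call `record()` after it.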

// 13

FAQ

▸ I lost my key. Recovery?

None. Keys are bearer tokens. We store only the SHA-256 hash. Lose it, credit is gone. By design.

▸ Refunds?

Crypto refunds need a return address we don't store. Buy small first.

▸ Why no /v1/usage history endpoint?

We log token counts + key hash, not content. Exposing usage history as an endpoint lets attackers enumerate when a key is active. /v1/key/balance gives you the only thing you need: remaining credit.

▸ What models do you support?

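See /v1/models. Chat models under phala/ are TEE-attested on Phala. Image models are tier PROXY and run on the vendor's own infrastructure (see the privacy note under Image generation). Proprietary chat models (Claude, GPT-4, Gemini, Grok) are not exposed. They don't run in TEEs and they log queries.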

▸ Can you see my prompts?

While a request is in flight, the proxy holds the prompt in RAM. We don't log it, and it's not persisted. The hosting provider could in principle dump memory. See trust boundaries.

▸ Tor?

The API runs as a Tor hidden service. See the .onion address in the connection callout at the top of this page. Same API, same keys, no clearnet exit, no DNS lookup. You can also route any client through Tor SOCKS at the clearnet URL.

▸ How long does payment take to confirm?

Monero blocks land roughly every 2 minutes. Required confirmations scale with bundle size:

  • small ($10): 2 confirmations (~4 min)
  • medium ($50): 4 confirmations (~8 min)
  • large ($200): 6 confirmations (~12 min)
  • whale ($500+): 10 confirmations (~20 min)

Custom amounts use the same tier rule, derived from credit size. After confirmation, the key drops on first status poll.
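
The wait is simple arithmetic: required confirmations times roughly 2 minutes per block. The sketch below encodes the four tiers listed above. How custom amounts between bundle sizes are mapped is not specified, so rounding down to the nearest lower tier (for example, $25 treated like small) is an assumption, and the function names are hypothetical.

```python
BLOCK_MINUTES = 2  # Monero blocks land roughly every 2 minutes

# (minimum credit in USD, confirmations), highest tier first.
# ASSUMPTION: amounts between bundle sizes fall into the lower tier.
TIERS = [(500, 10), (200, 6), (50, 4), (0, 2)]


def required_confirmations(credit_usd: float) -> int:
    """Confirmations needed before the key is issued, by credit size."""
    for floor, confirmations in TIERS:
        if credit_usd >= floor:
            return confirmations
    raise ValueError("credit_usd must be non-negative")


def estimated_wait_minutes(credit_usd: float) -> int:
    """Approximate wait: confirmations x block time."""
    return required_confirmations(credit_usd) * BLOCK_MINUTES
```

Block times vary, so the result is an estimate rather than a guarantee.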

▸ Why Monero only? Why not BTC / ETH / USDC?

Bitcoin and Ethereum transactions are public. Sending us BTC reveals to anyone with the txid that wallet X paid Phantom on day Y. Monero hides sender, receiver, and amount on-chain. Stablecoins (USDC/USDT) have a freeze function that gives the issuer the power to seize. Phantom's premise is operator privacy + customer privacy, so we use the only payment rail that delivers both.

▸ Is this legal?

Phantom forwards inference requests to TEE-attested models. The models are open-weight. Monero is legal in nearly all jurisdictions. A handful restrict it: China bans it, and some EU exchanges restrict trading, though possession is not criminalized. Your use of the inference output is your responsibility per the AUP. Phantom does not advise on legality in your jurisdiction.

▸ How does this compare to OpenAI / Anthropic / Phala direct?

VENDOR                       ACCOUNT         PAYMENT        LOGS PROMPTS       TEE
OpenAI / Anthropic / Google  required + KYC  credit card    yes (30d default)  no
Phala / Redpill direct       email account   card / crypto  no                 yes
Phantom                      none            XMR only       no                 yes

Phantom adds a ~30% markup vs Phala direct in exchange for the no-account, no-card, no-IP-log layer. If you don't need anonymity, use Phala direct.

▸ What happens when my credit hits $0?

The API returns HTTP 402 "insufficient credit or expired key". Either buy a new bundle to get a fresh key, or keep the same key by purchasing more credit (the key id stays the same). Behavior for requests pending mid-stream may differ; rotating is the safest option. Credit left unused at expiration is forfeit.
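
Client code can surface the 402 case explicitly instead of treating it as a generic failure. The following is a minimal standard-library sketch; the exception class and function name are hypothetical, and the injectable `opener` parameter exists only so the logic can be exercised without network access.

```python
import json
import urllib.error
import urllib.request


class OutOfCredit(Exception):
    """Raised when the API answers HTTP 402."""


def chat(api_key: str, body: dict, opener=urllib.request.urlopen) -> dict:
    """POST a chat completion; raise OutOfCredit on HTTP 402."""
    req = urllib.request.Request(
        "https://api.phantom.codes/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    try:
        with opener(req) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        if err.code == 402:
            # Credit exhausted or key expired: buy credit before retrying.
            raise OutOfCredit("insufficient credit or expired key") from err
        raise
```

Catching OutOfCredit separately lets an agent stop retrying and prompt for a new purchase, rather than looping on a key that will keep failing.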

▸ Encrypted contact / abuse reports?

Encrypted only. PGP public key at /pgp.txt (fingerprint 09654A79076956E6042D11946296DEC4E954FC76). We don't run a plaintext support channel.